A few weeks ago I watched a talk on Test-Driven Development (TDD) on YouTube that got me thinking about my approach to testing in Hadron. I had some issues with the talk’s lack of gender-inclusive language, so I won’t link it here. It’s 2022, and if you’re still using the term “guys” and he/him pronouns to describe a cohort of programmers, to my ears that sounds willfully harmful and hopelessly outdated.

More recently, I read Hillel Wayne’s newsletter about TDD and it resonated with my thinking about TDD as well as my recent development work around testing in Hadron.

My experiences with TDD are pretty mixed. I remember first hearing about it sometime in the early 2000s and experimenting with it on smaller personal projects. I liked how TDD requires you to plan for testing your code right from the start. I bought the argument that this led to better designs, and have advised others along these lines when asked. I find it easy to get lost in minutiae when contemplating large, complex software designs, and clearly articulating the requirements for the software, along with devising a plan for how to verify those requirements are met, can provide a concrete foundation to build on. In fairness, this is as much an observation about the value of designing from requirements as it is about specification testing.

Working on well-tested code bases can be rewarding and fun. The confidence that comes from testing can remove a lot of the concerns around unintended consequences of refactors and redesigns. In this ideal situation, the time and energy invested in building and maintaining the testing infrastructure is a net positive. Good tests can serve as a machine-enforceable specification for the software, and the team catches a lot of bugs early in development. In the right conditions, this creates a virtuous cycle; the benefits of good testing are obvious to everyone on the team so they sustain that investment over time.

A vicious cycle around testing is equally possible, however. Flaky tests and “change detector” tests that focus on implementation details instead of requirements add serious drag to a project, and can radically alter the value proposition around testing. Code bases sometimes change hands without communication around testing practices, leaving the new team guessing about the testing policies and procedures. A variety of external pressures may also influence an engineering team’s attitudes about sustained investment in testing.

Development on Hadron is challenging. It’s my first programming language implementation. I’ve struggled to devise an incremental approach to testing the software as I’ve developed it. The core of a language is necessarily a rat’s nest of dependencies, with a nucleus of functionality that is difficult to subdivide into individual pieces for verification.

For a few months, I had been working on getting Hadron to compile enough of the class library that I could use UnitTest. I have also maintained unit test coverage on a small number of the more complex classes in the code base. I had dropped unit tests for a lot of the compilation pipeline because I’ve changed the design many times, there are many dependencies between the stages, and maintaining tests until the APIs between the pieces stabilized just didn’t feel effective. In all honesty, I think I also dropped unit testing because I had fallen into the common trap of testing implementation details instead of requirements, which reduced the value of those tests.

The UnitTest class is designed for testing SuperCollider classes and naturally assumes a working language implementation. I wanted to build a language validation suite based on UnitTest, but found no incremental way to develop against that validation. In other words, I needed a working language to verify I had a working language with UnitTest. There are a lot of other projects in Hadron that I’d also put on hold waiting for some kind of validation testing signal. For example, automatic code styling, continuous integration, and even some ideas for performance optimization in the compiler and benchmarking were all projects that felt more appropriate to pursue in the presence of robust testing.
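To illustrate the chicken-and-egg problem with UnitTest, here’s a minimal sketch of what such a test looks like. The class and method names are hypothetical, but the test_ method prefix and the assertEquals call follow the real UnitTest conventions. Even this trivial test is itself a SuperCollider class, so just loading and running it already exercises the parser, the class library, and method dispatch:

    // A minimal, hypothetical UnitTest sketch. Running it assumes a working
    // interpreter and a compiled class library.
    TestHadronSanity : UnitTest {
        test_integerAddition {
            // assertEquals reports a pass or fail for the comparison.
            this.assertEquals(1 + 1, 2, "integers should add");
        }
    }

    // Running the test from the interpreter, e.g.:
    // TestHadronSanity.run;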

I’ve started working on LLVM at work recently and found its integration testing infrastructure, LIT, inspirational for ideas about how to test a language implementation. These are integration tests that run a binary such as clang on specific source code inputs and compare the output against a reference. This emphasis on testing expected compiler or interpreter behavior on concrete source inputs helped to re-frame my thinking around testing against requirements and specifications instead of testing implementation details.
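That format maps naturally onto Hadron. As a rough sketch, a lit-style test could be a SuperCollider source file with the expected interpreter output embedded in comments. The hadron invocation on the RUN line here is purely hypothetical, but the RUN and CHECK directives follow LLVM’s lit and FileCheck conventions, and the hexadecimal and radix integer literals are standard SuperCollider syntax:

    // RUN: hadron %s | FileCheck %s
    // (hypothetical invocation; %s is lit's substitution for this file's path)

    // Integer literal parsing: these should all print the same value.
    // CHECK: 255
    0xFF.postln;
    // CHECK-NEXT: 255
    16rFF.postln;
    // CHECK-NEXT: 255
    255.postln;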

My realization that my testing would be more productive against a language specification also led me to the conclusion that I should organize a formal language specification for SuperCollider. I documented some of the specifications in the Hadron design document. The SuperCollider documentation also has some good details about certain language features, but they’re distributed across guides, tutorials, references, and topical articles on individual classes. The SuperCollider Book also has a lot of language specification information, but much of it is mixed with implementation details about the official interpreter.

A week or so ago I checked in the testing framework and a few failing tests around integer parsing, and then this morning I checked in the code that passes those tests. I think writing the specifications and testing against them brings a new focus and vitality to my Hadron work. I do plan on writing the tests first, and on supporting refactors and optimization work with continuous testing. I’ve felt a little more comfortable deprecating some of the cruft that I know I’ll need later but that needs a redesign and isn’t supporting the current integration testing. This process probably wouldn’t pass muster with a TDD purist, but in short:

  • I found the reminder to focus on testing against specification instead of implementation helpful for refocusing my effort on meaningful development on Hadron
  • Integration testing seems like a better approach for testing much of Hadron; unit testing had me overly focused on writing “change detector” style tests, or avoiding testing entirely
  • Organizing the language specification feels like it could have other uses supporting language development and tooling down the road