Suite smells: testing legacy code
A friend had just started a consulting gig with a new team, so I asked him, “How are they doing?” He said, “They’re writing legacy code, man.”
—Michael Feathers, “Working Effectively with Legacy Code”
The Power of Go: Tests is a compilation of all the lessons I’ve learned over a lifetime of software engineering: it’s about how to write robust, reliable, and above all correct programs.
Writing great tests doesn’t guarantee success by itself, of course,
but it’s a very good start. The book has lots of tips about useful
testing techniques, lesser-known features of Go’s testing
package, testing error handling, designing good test cases, fuzz and
mutation testing, test-driving CLI tools, and how to avoid falling into
the trap of writing mocks.
This is the first in a three-part series of excerpts from the book, focusing on test suites in Go, what kind of problems you might find with them, and some tips on how to fix them.
- Testing legacy code
- Undertesting and overtesting
- Slow, flaky, and failing (coming soon)
When you’re new to a project, or even when you’re not, it’s often worth stepping back and taking a look at the test suite as a whole. Consider it as a program in itself, if you like. What kind of shape is it in? Is it well-structured, well-maintained, and easy to work on? Or is it one of the other kind?
Are there no tests at all? Insufficient tests? Are you trying to maintain legacy code that’s really hard to test? If so, I can relate: I’ve worked on lots of projects like this, both as an engineer and as a consultant. And believe it or not, there is hope. Problems can be fixed! Let’s see how.
No tests
One problem that you’re likely to see that affects a good many projects, even today, is a complete lack of tests. If you’re relatively new to the industry, this might surprise you: “Doesn’t everyone write tests? Isn’t it considered best practice?”
Sure. But best practice is what a company says it does. What actually goes on is another matter. In the startup phase, speed is more important than maintainability. Assuming the product makes it to market, the business focus will naturally be on rapid growth, rather than building solid foundations for the long term.
Eventually, though, the cost of adding anything to an untested codebase becomes unmanageable. Indeed, the cost of touching the code at all becomes so high that everyone starts finding excuses not to do it. What can we do?
The answer is pretty simple: write some tests! You don’t need to get permission. Just go ahead and do it. If you’re tasked with touching some small piece of the code that isn’t tested, add a test, then complete your task.
Sometimes people feel nervous about doing this, because they haven’t obtained everybody’s buy-in, perhaps from a company-wide conversation about whether or not testing is a good idea. Don’t worry about this. It is a good idea. Just go do it.
You won’t necessarily be praised for doing this, but it will make your job easier, and your colleagues will appreciate it, too. The idea might even catch on.
The development teams at many companies are focused on high-level business goals, lack any direct incentive to improve code quality, and perceive an investment in code quality to be at odds with shipping on-time… No one is paying, rewarding, or pressuring them to maintain a high level of code quality.
—Mike Bland, “Goto Fail, Heartbleed, and Unit Testing Culture”
Most developers, once they’ve experienced the joy and the lightness of heart that comes from working on well-tested code, become enthusiastic advocates for it themselves.
Once you’ve worked on a system with extensive automated tests, I don’t think you’ll want to go back to working without them. You get this incredible sense of freedom to change the code, refactor, and keep making frequent releases to end users.
—Emily Bache, “The Coding Dojo Handbook”
Legacy code
Adding tests to a codebase that doesn’t have them can be a daunting prospect, but let’s take it one step at a time. You only need to test what you touch.
Suppose you have some monster function, and you need to make a change to line 1337 of it. The prospect of writing a test for the entire function fills you with horror and sadness (I’ve been there). Instead, use the divide-and-conquer technique.
See if a small block of lines around where you’re working can be extracted into its own little function. If so, do this, add a little test for the little function, and make your change, leaving the rest of the code alone. Repeat this process every time you touch some part of the system.
Over time, tested areas of the code base surface like islands rising out of the ocean. Work in these islands becomes much easier. Over time, the islands become large landmasses. Eventually, you’ll be able to work in continents of test-covered code.
—Michael Feathers, “Working Effectively with Legacy Code”
The nice thing about this strategy is that any effort you put into test writing will automatically end up right where it’s needed: those parts of the system where you’re regularly making changes. Of course tests for everything would be great, but we have to start somewhere, and right now, that’s here.
If you have a big system, then the parts that you touch all the time should be absolutely rock solid, so you can make daily changes confidently. As you drift out to the periphery of the system, to parts that don’t change often, the tests can be spottier and the design uglier without interfering with your confidence.
—Kent Beck, “Test-Driven Development by Example”
But what if you can’t add a test for something because the function is untestable? You could refactor it to be testable, but that’s not safe, because you don’t have a test. It’s Catch-22.
This dilemma is characteristic of legacy systems, otherwise known as “here be dragons” code:
Legacy code is code without tests, but more importantly, it’s code that isn’t designed with testability in mind. Such code tends to be riddled with temporal coupling and indirect input. It houses a multitude of monster methods, and its components are intertwined in a bizarre web of dependencies that have spun out of control. Adding any kind of test to such code may be no small feat.
—Alexander Tarlinder, “Developer Testing: Building Quality Into Software”
Making changes to a system like this is risky, but making them without the support of tests is even more dangerous:
Changes in a system can be made in two primary ways. I like to call them “Edit and Pray” and “Cover and Modify”. Unfortunately, “Edit and Pray” is pretty much the industry standard.
—Michael Feathers, “Working Effectively with Legacy Code”
Can we do better? Here’s one emergency rescue procedure for untestable functions that I’ve used successfully. Ignore the existing function and write a test for the function you’d like to have. Then make that test pass by writing a new function from scratch. Finally, delete the old function.
Legacy systems aren’t much fun to deal with, but on the plus side, they usually work pretty well. That’s how they got to be legacy systems. Whatever bugs remain are at least well understood, and people know how to work around them.
We usually don’t need to make many changes to legacy systems, and when we do, a combination of testing and refactoring can make it safe. The important thing to remember about legacy systems is to avoid writing more of them.
The system turns to spaghetti really, really fast. Complexity isn’t one mistake you make, it’s hundreds or thousands of mistakes made by many people over a period of time. You don’t notice it happening, and once it’s happened, it’s almost impossible to fix. It’s so overwhelming, you never get to it.
—John Ousterhout, “A Philosophy of Software Design”
Insufficient tests
Some systems have tests, but not enough of them, or they’re not very good. What can we do in this case? Apart from steadily growing the test suite by applying the “test what you touch” strategy outlined in the previous section, can we do anything to fix the underlying causes of inadequate testing?
For example, maybe a developer is just in a big hurry, and skips writing a test because they feel it will slow them down. Sure, but sometimes steering is more important than speed. If the program doesn’t work, does it really matter how fast you can write it?
You don’t save time if you save on quality. Because it comes back. It’s like when someone asks me “I need to lose a couple of pounds, do you have a tip for me?” And I say, “Yeah. Donate blood.”
—Klaus Leopold, “Blockers: Treasures of Improvement”
Of course, if someone is new to writing tests, it will be a slow process at first, because they don’t know what to do. Once you have a little experience, testing becomes much easier and faster. Half the battle is just having the confidence to know that you can test anything, if you put your mind to it.
A good way to build this confidence is to agree with your team that for a short trial period, you’ll write tests for everything you do. If anyone gets stuck on how to test something, the whole team will pitch in to help.
When I work with teams, I often start by asking them to take part in an experiment. For an iteration, we try to make no change to the code without having tests that cover the change. If anyone thinks that they can’t write a test, they have to call a quick meeting in which they ask the group whether it is possible to write the test.
The beginnings of those iterations are terrible. People feel that they aren’t getting all the work done that they need to. But slowly, they start to discover that they are revisiting better code. Their changes are getting easier, and they know in their gut that this is what it takes to move forward in a better way.
—Michael Feathers, “Working Effectively with Legacy Code”
Sometimes tests end up being insufficient or ineffective because they’re written last. As you’ll know if you’ve read The Power of Go: Tests, there are many important advantages to writing tests first. The most persuasive of those advantages for me is that it just makes my work easier.
However, some people don’t feel this way, and that’s okay. Instead, they find it easier to implement the behaviour first, and then figure out how to test it. If the result is a correct program with robust tests, then who cares what order those things are written in?
For some people, [writing tests first] works fine. More power to ’em. It almost never works for me in a pure form, because my coding style tends to be chaotic in the early stages, I keep refactoring and refactoring the functions all the time.
There’s room for argument here, none for dogma. If your engineers are producing code with effective tests, don’t be giving them any static about how it got that way.
—Tim Bray, “Testing in the Twenties”
It’s always tempting when you find a methodology that works well for you to assume that it’s the right choice for everyone. But people are just too diverse for that to be true. There’s no One Right Way to develop software (or anything else).
I write tests first myself, as I mentioned. That suits me, and many others, and it’s the method I teach and recommend to my students. But it’s not a moral issue. There’s no point trying to shame people because they don’t write their tests first, or getting into arguments about how “my TDD is purer than yours”.
Instead, as Tim Bray suggests, judge by results. If someone is producing great, well-tested code by writing tests last, more power to ’em! Some very smart people work this way, so there must be something to it.
On the other hand, if the code isn’t good, and the tests tend to be feeble, optimistic, or missing altogether, one of the contributory factors may be that the developers are trying to write tests last. Switching to a test-first style, though unfamiliar at first, may improve their results.
In the next post, we’ll talk about what can go wrong with the tests themselves. They can be too optimistic (not testing enough), or they can be persnickety (testing too much, or testing the irrelevant). See you there!