Programming with confidence

It seemed that for any piece of software I wrote, after a couple of years I started hating it, because it became increasingly brittle and terrifying. Looking back in the rear-view, I’m thinking I was reacting to the experience, common with untested code, of small changes unexpectedly causing large breakages for reasons that are hard to understand.
—Tim Bray, “Testing in the Twenties”

How do we know that the software we write is correct? And, even if it starts out that way, how do we know that the minor changes we make to it aren’t introducing bugs?

One thing that can help give us confidence about the correctness of software is to write tests for it. While tests are useful whenever we write them, it turns out that they’re especially useful when we write them first. Why?

The most important reason to write tests first is that, to do that, we need to have a clear idea of how the program should behave, from the user’s point of view. There’s some thinking involved in that, and the best time to do it is before we’ve written any code.

Why? Because trying to write code before we have a clear idea of what it should do is simply a waste of time. It’s almost bound to be wrong in important ways. We’re also likely to end up with a design which might be convenient from the point of view of the implementer, but that doesn’t necessarily suit the needs of users at all.

Working test-first encourages us to develop the system in small increments, which helps prevent us from heading too far down the wrong path. Focusing on small, simple chunks of user-visible behaviour also means that everything we do to the program is about making it more valuable to users.

Tests can also guide us toward a good design, partly because they give us some experience of using our own APIs, and partly because breaking a big program up into small, independent, well-specified modules makes it much easier to understand and work on.

Self-testing code

What we aim to end up with is self-testing code:

You have self-testing code when you can run a series of automated tests against the code base and be confident that, should the tests pass, your code is free of any substantial defects.
One way I think of it is that as well as building your software system, you simultaneously build a bug detector that’s able to detect any faults inside the system. Should anyone in the team accidentally introduce a bug, the detector goes off.
—Martin Fowler, “Self-Testing Code”

This isn’t just because well-tested code is more reliable, though that’s important too. The real power of tests is that they make developers happier, less stressed, and more productive as a result.

Tests are the Programmer’s Stone, transmuting fear into boredom. “No, I didn’t break anything. The tests are all still green.” The more stress I feel, the more I run the tests. Running the tests immediately gives me a good feeling and reduces the number of errors I make, which further reduces the stress I feel.
—Kent Beck, “Test-Driven Development by Example”

The adventure begins

Let’s see what writing a function test-first looks like in Go. Suppose we’re writing an old-fashioned text adventure game, like Zork, and we want the player to see something like this:

Attic
The attics, full of low beams and awkward angles, begin here in a relatively tidy area which extends north, south and east. You can see here a battery, a key, and a tourist map.

Adventure games usually contain lots of different locations and items, but one thing that’s common to every location is that we’d like to be able to list its contents in the form of a sentence:

You can see here a battery, a key, and a tourist map.

Suppose we’re storing these items as strings, something like this:

a battery
a key
a tourist map

How can we take a bunch of strings like this and list them in a sentence, separated by commas, and with a concluding “and”? It sounds like a job for a function; let’s call it ListItems.

What kind of test could we write for such a function? You might like to pause and think about this a little.

One way would be to call the function with some specific inputs (like the strings in our example), and see what it returns. We can predict what it should return when it’s working properly, so we can compare that prediction against the actual result.

Here’s one way to write that in Go, using the standard library’s testing package:

func TestListItems_GivesCorrectResultForInput(t *testing.T) {
    t.Parallel()
    input := []string{
        "a battery",
        "a key",
        "a tourist map",
    }
    want := "You can see here a battery, a key, and a tourist map."
    got := game.ListItems(input)
    if want != got {
        t.Errorf("want %q, got %q", want, got)
    }
}

(Listing game/1)

Don’t worry too much about the details for now; we’ll deal with them later. The gist of this test is as follows:

  1. We call the function game.ListItems with our test inputs.
  2. We check the result against the expected string.
  3. If they’re not the same, we call t.Errorf, which causes the test to fail.

Note that we’ve written this code as though the game.ListItems function already exists. It doesn’t. This test is, at the moment, an exercise in imagination. It’s saying: if this function existed, here’s what we think it should return, given this input.

But it’s also interesting that we’ve nevertheless made a number of important design decisions as an unavoidable part of writing this test. First, we have to call the function, so we’ve decided its name (ListItems), and what package it’s part of (game).

We’ve also decided that its parameter is a slice of strings, and (implicitly) that it returns a single result that is a string. Finally, we’ve encoded the exact behaviour of the function into the test (at least, for the given inputs), by specifying exactly what the function should produce as a result.

The original description of test-driven development was in an ancient book about programming. It said you take the input tape, manually type in the output tape you expect, then program until the actual output tape matches the expected output.
When describing this to older programmers, I often hear, “Of course. How else could you program?”
Kent Beck

My book The Power of Go: Tests is all about applying this idea to modern software development. It’s full of tips, tricks, and techniques for crafting useful tests, and then parlaying those tests into reliable, well-designed components that we can have confidence in. After all, how else could you program?

Naming something and deciding its inputs, outputs, and behaviour are usually the hardest decisions to make about any software component, so even though we haven’t yet written a single line of code for ListItems, we’ve actually done some pretty important thinking about it.

And the mere fact of writing the test has also had a significant influence on the design of ListItems, even if it’s not very visible. For example, if we’d just gone ahead and written ListItems first, we might well have made it print the result to the terminal. That’s fine for the real game, but it would be difficult to test.

Testing a function like ListItems requires decoupling it from some specific output device, and making it instead a pure function: that is, a function whose result is deterministic, depends on nothing but its inputs, and has no side-effects.

Functions that behave like this tend to make a system easier to understand and reason about, and it turns out that there’s a deep synergy between testability and good design, which we’ll explore in more detail throughout The Power of Go: Tests.
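
To make the contrast concrete, here’s a minimal sketch rather than the real game code. The names PrintItems and FormatItems are hypothetical, introduced only for illustration, and the plain comma join is a placeholder: it doesn’t produce the concluding “and” that the test for ListItems expects.

```go
package main

import (
	"fmt"
	"strings"
)

// PrintItems (hypothetical) writes straight to the terminal.
// It returns nothing, so a test has no result to compare
// against an expectation.
func PrintItems(items []string) {
	fmt.Println("You can see here " + strings.Join(items, ", ") + ".")
}

// FormatItems (hypothetical) is a pure function: its result
// depends only on its input, and it has no side effects.
// The comma join is a placeholder, not the final behaviour.
func FormatItems(items []string) string {
	return "You can see here " + strings.Join(items, ", ") + "."
}

func main() {
	items := []string{"a battery", "a key"}
	// The caller decides where the text goes; here, the terminal.
	fmt.Println(FormatItems(items))
}
```

A test can call FormatItems and compare its result with an expected string directly, whereas checking PrintItems would mean capturing whatever it writes to the terminal.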

Verifying the test

So what’s the next step? Should we go ahead and implement ListItems now and make sure the test passes? We’ll do that in a moment, but there’s a step we need to take first. We need some feedback on whether the test itself is correct. How could we get that?

It’s helpful to think about ways the test could be wrong, and see if we can work out how to catch them. Well, one major way the test could be wrong is that it might not fail when it’s supposed to.

Tests in Go pass by default, unless you explicitly make them fail, so a test function with no code at all would always pass, no matter what:

func TestAlwaysPasses(t *testing.T) {}

That test is so obviously useless that we don’t need to say any more. But there are more subtle ways to accidentally write a useless test. For example, suppose we mistakenly wrote something like this:

if want != want {
    t.Errorf("want %q, got %q", want, got)
}

A value always equals itself, so the condition in this if statement will never be true, and the test will never fail, whatever ListItems returns. We might spot this just by looking at the code, but then again we might not.
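
As a rough illustration, the two comparisons can be pulled out into small helpers and run against a deliberately wrong result. The helper names brokenCheck and correctCheck are hypothetical, and this is only a sketch of the idea, not part of the game package.

```go
package main

import "fmt"

// brokenCheck (hypothetical) mimics the mistaken comparison:
// it compares want with itself and ignores got, so it can
// never report a mismatch.
func brokenCheck(want, got string) bool {
	return want != want
}

// correctCheck (hypothetical) compares the expectation with
// the actual result, and reports true when they differ.
func correctCheck(want, got string) bool {
	return want != got
}

func main() {
	want := "You can see here a battery, a key, and a tourist map."
	got := "" // a deliberately wrong result
	fmt.Println("broken check reports failure:", brokenCheck(want, got))   // false
	fmt.Println("correct check reports failure:", correctCheck(want, got)) // true
}
```

Only the second check notices that the result is wrong, and the only way to discover that about the first one is to feed it a wrong result and watch what happens.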

I’ve noticed that when I teach Go to my students here at the Bitfield Institute of Technology, this is a concept that often gives them trouble. They can readily imagine that the function itself might be wrong. But it’s not so easy for them to encompass the idea that the test could be wrong. Sadly, this is something that happens all too often, even in the best-written programs.

Until you’ve seen the test fail as expected, you don’t really have a test.

So we can’t be sure that the test doesn’t contain logic bugs unless we’ve seen it fail when it’s supposed to. When should the test fail, then? When ListItems returns the wrong result. Could we arrange that? Certainly we could.

That’s the next step, then: write just enough code for ListItems to return the wrong result, and verify that the test fails in that case. If it doesn’t, we’ll know we have a problem with the test that needs fixing.

Writing an incorrect function doesn’t sound too difficult, and something like this would be fine:

func ListItems(items []string) string {
    return ""
}

Almost everything here is dictated by the decisions we already made in the test: the function name, its parameter type, its result type. And all of these need to be there in order for us to call this function, even if we’re only going to implement enough of it to return the wrong answer.

The only real choice we need to make here, then, is what actual result to return, remembering that we want it to be incorrect.

What’s the simplest incorrect string that we could return given the test inputs? Just the empty string, perhaps. Any other string would also be fine, provided it’s not the one the test expects, but an empty string is the easiest to type.

Running tests with go test

Let’s run the test and check that it does fail as we expect it to:

go test

--- FAIL: TestListItems_GivesCorrectResultForInput (0.00s)
    game_test.go:18: want "You can see here a battery, a key, and
    a tourist map.", got ""
FAIL
exit status 1
FAIL    game    0.345s

Reassuring. We know the function doesn’t produce the correct result yet, so we expected the test to detect this, and it did.

If, on the other hand, the test had passed at this stage, or perhaps failed with some different error, we would know there was a problem. But it seems to be fine, so now we can go ahead and implement ListItems for real.

Why don’t you try this yourself, and see what you can do? If your code passes the test, then you can be confident that it’s correct, at least for these inputs. And isn’t that a nice feeling?

In Shameless green, we’ll look at one possible solution, and then we’ll see how to use tests to build new behaviours for the function.
