Target acquired

Target acquired

Got your bytes and runes twisted? See a doctor immediately.

By its nature, testing is never complete. As the influential computer scientist Edsger Dijkstra put it, “Testing shows the presence, not the absence of bugs.” No quantity of tests can ever prove a package free of bugs. At best, they increase our confidence that the package works well in a wide range of important scenarios.

—Alan Donovan & Brian Kernighan, “The Go Programming Language”

If you feel a bit fuzzy about your Go tests, you’re not alone. Even if you diligently write tests for all your code, there’s always that sneaking doubt: what if there’s some test case you didn’t think of?

In previous posts, we’ve talked about ways to generate test cases randomly, and how to use Go’s fuzz testing facility to automate this process. Now let’s see how to apply fuzz testing to a realistic function, and use it to find a very common Go bug, by writing a fuzz target.

Suppose we want to write some function FirstRune that takes a string and returns its first rune. Given the string "Hello", let’s say, it should return the rune 'H'. That sounds doable, doesn’t it?

In a programming language where there’s a one-to-one correspondence between runes and bytes, this would be straightforward. If the string is s, for example, we can simply index the first byte of the string with s[0] and return it. Job done!

But that won’t work in Go, because that byte may be merely the first part of a multi-byte UTF-8 sequence. Indeed, it can take up to four bytes to represent some Unicode characters. We just won’t know what we’ve got until we look at the actual bytes.

Let’s see if we can use fuzz testing to catch this kind of error, then. We’ll use the fuzzer to generate random inputs, and see if any of them produce the wrong answer from FirstRune.

Here’s an example of such a test:

func FuzzFirstRune(f *testing.F) {
    f.Add("Hello")
    f.Add("world")
    f.Fuzz(func(t *testing.T, s string) {
        got := runes.FirstRune(s)
        want, _ := utf8.DecodeRuneInString(s)
        if want == utf8.RuneError {
            t.Skip() // don't bother testing invalid runes
        }
        if want != got {
            t.Errorf("given %q (0x%[1]x): want '%c' (0x%[2]x)",
                s, want)
            t.Errorf("got '%c' (0x%[1]x)", got)
        }
    })
}

(Listing runes/1)

It might look a little alarming at first, but we’ll break it down, line by line.

The general idea is to call FirstRune with some string (randomly generated by the fuzzer), and compare the result against what the “oracle” (in this case, the standard utf8.DecodeRuneInString function) says it should be.

If our FirstRune function works correctly, it should give the same rune result as DecodeRuneInString for all inputs. But if there’s a bug in the function, its result won’t match the oracle’s, and the fuzz target will mark this as a failing test case.

My book The Power of Go: Tests is all about testing in Go, and it covers fuzz testing, mutation testing, property-based testing, and all sorts of advanced techniques that most Go programmers don’t even know about. If you’re enjoying this excerpt, do check out the full book.

Adding training data with f.Add

While, as we’ve seen, the fuzzer is quite capable of generating random values from scratch, it can be helpful to give it some starting inputs as hints. Remember the mysterious warning it gave us when we ran the fuzz test for the Guess function?

warning: starting with empty corpus

The fuzzer’s corpus is the set of training data we give it, if you like. It can still fuzz even with an empty corpus, as we saw, but it helps if we can provide a few example values for it to start with. These will be supplied to our fuzz target and tested. The fuzzer will then generate more inputs from them by mutating them randomly.

Here’s how we add training data to the corpus:

f.Add("Hello")
f.Add("world")

(Listing runes/1)

We can call f.Add to add more example values as many times as we like. The more training data we can give the fuzzer to suggest possible lines of attack, the better.

The corpus also includes any previously-generated failing test cases stored in the testdata/fuzz folder, like the “21” case in our previous example. So over time we’ll build up a bigger and bigger corpus, which gives the fuzzer more to work with.

A more sophisticated fuzz target

Next comes the call to f.Fuzz with our fuzz target:

f.Fuzz(func(t *testing.T, s string) {
    got := runes.FirstRune(s)
    want, _ := utf8.DecodeRuneInString(s)
    if want == utf8.RuneError {
        t.Skip() // don't bother testing invalid runes
    }
    if want != got {
        ... // fail
    }
})

(Listing runes/1)

First, we’ll call the function and save its result to a variable got, so we can compare it with the expected result. What is the expected result, though, since we don’t know in advance what s is?

Well, we can’t predict it, but we can ask the oracle. DecodeRuneInString will give us the right answer, and we’ll store it in the want variable.

Now, we know that the fuzzer is going to generate lots of random strings for us to use as input, but are all possible strings actually valid input to FirstRune?

No, because not all strings encode valid UTF-8 runes. For example, the empty string doesn’t. And there are many sequences of arbitrary bytes that don’t happen to be valid UTF-8 data. So we need to do a little pre-filtering on our input, before checking whether FirstRune actually handled it correctly.

Suppose the fuzzer calls our fuzz target with some string that doesn’t contain valid runes. What will our oracle function DecodeRuneInString return in this case?

The answer is the constant utf8.RuneError. If we get that result, then this particular input value is invalid, and we should just skip it. That’s what the call to t.Skip does:

if want == utf8.RuneError {
    t.Skip() // don't bother testing invalid runes
}

(Listing runes/1)

A skipped test neither fails nor succeeds; the fuzzer just ignores it and moves on to the next randomly-generated input.

Why don’t we skip the test before calling FirstRune with the input? Because even though we don’t require FirstRune to give the right answer for invalid strings, we do require that it doesn’t panic. So if any fuzz input makes FirstRune panic, we want to find that out before we skip.

Finally, just like in any test, we compare want and got, and fail the test if they’re not equal. The failure message is a little more complicated than any we’ve written so far:

t.Errorf("given %q (0x%[1]x): want '%c' (0x%[2]x)", s, want)
t.Errorf("got '%c' (0x%[1]x)", got)

(Listing runes/1)

Let’s break it down. In the event of a failure, what information would we like to print? Certainly the input string that caused the failure, so that’s the first piece of data:

given %q

Even though this string is guaranteed to be valid UTF-8, we can’t guarantee that all its characters will be printable, or have a corresponding glyph in the display font we’re using.

So just in case the string contains some non-printing runes, we’d also like to print out its raw bytes, as numbers. How can we do that?

As you probably know, the fmt verb %x will print a piece of data as a hexadecimal value. And we can give the reader a clue that what follows is a hex value by prefixing it with “0x”, which denotes a hex literal in Go.

So a format string like this would do the job:

t.Errorf("given %q (0x%x) ... ", s, s)

For example, if s were the string “hello”, and the test failed, this would print:

given "hello" (0x68656c6c6f) ...

But notice that we had to supply the argument s twice, which seems wasteful. As you know, every % verb in the format string has to correspond to a piece of data, supplied as a subsequent argument to t.Errorf.

So if there are two verbs (%q and %x), then we need two data arguments, even though they’re actually the same value. Is there a way to avoid repeating s in this case, then?

Yes. We can use what’s called an explicit argument index, which is a number in square brackets that goes between the % and the x in %x:

%[1]x

This is interpreted by the formatting machinery as meaning “print the Nth data value”. In this case, that’ll be number 1, the first data value, which is s.

We can use the explicit-index trick to repeat a given argument any number of times in our format string:

fmt.Printf("%s %[1]s %[1]s %[1]s", "hello")
// Output:
// hello hello hello hello

In the fuzzing example, we need to use s twice, once to print it out as a quoted string (%q), and again as a hex value (%x). So here’s how we do that:

t.Errorf("given %q (0x%[1]x) ...", s, ...)

So far, so good. What about the next part of the format string? Since both want and got in this case are characters, we’ll need to use the verb %c to display them.

And, again, we’ll use the explicit argument index (in this case, index 2) to re-use the same data, but this time printing it as a hex value (%x).

t.Errorf("... want '%c' (0x%[2]x)", ..., want)

Here’s what that would look like if the test failed:

... want 'w' (0x77)

The final part of the failure message prints got in the same way:

t.Errorf("got '%c' (0x%[1]x)", got)

And the result:

got 'v' (0x76)

Using the fuzzer to detect a panic

So we’re ready to write a null implementation of FirstRune and try out our test. We know it won’t pass, and that’s the point. As you’ll recall, if we’re trying to build a bug detector, we need to see it detect a bug.

Let’s write one:

func FirstRune(s string) rune {
    return 0
}

(Listing runes/1)

This is definitely wrong, so let’s see if the test agrees:

go test -fuzz .

fuzz: elapsed: 0s, gathering baseline coverage: 0/2 completed
failure while testing seed corpus entry: FuzzFirstRune/seed#1
fuzz: elapsed: 0s, gathering baseline coverage: 0/2 completed
--- FAIL: FuzzFirstRune (0.02s)
    --- FAIL: FuzzFirstRune (0.00s)
        runes_test.go:22: given "world" (0x776f726c64): want 'w'
        (0x77)
        runes_test.go:24: got '' (0x0)

That’s encouraging, so let’s see what our carefully-constructed failure message actually reports:

given "world" (0x776f726c64): want 'w' (0x77)
got '' (0x0)

So the failing input is “world”, which was one of the cases we manually added to the corpus using f.Add. What caused the failure, then?

The oracle thinks that the first rune of "world" is w, and I agree. But what FirstRune actually returned in this case was the rune value 0x0, which corresponds to a non-printing character.

This is exactly what we expected to happen, since we deliberately wrote FirstRune so that it always returns zero, no matter what.

Great! We now know that the test can detect at least this bug, where the function does nothing at all. That’s a good start. In the next post, we’ll fix that bug by writing a real implementation of FirstRune, but we’ll also introduce a new, more subtle bug, and see if we can fix this fuzzed-up function. Catch you later!

Fixing a fuzzed-up function

Fixing a fuzzed-up function

What the fuzz?

What the fuzz?