Fixing a fuzzed-up function
There are bugs in your program; I guarantee it.
We can’t prove anything in science. But the best that scientists can do is fail to disprove things while pointing to how hard they tried.
—Richard Dawkins, “The Greatest Show on Earth”
The trouble with testing your code for bugs, of course, is that you sometimes find them. Then what do you do?
Well, fixing them would be a start. In Target acquired, we wrote a fuzz test in
Go for some function FirstRune
that’s supposed to return
the first rune from a string. Here’s the test:
func FuzzFirstRune(f *testing.F) {
.Add("Hello")
f.Add("world")
f.Fuzz(func(t *testing.T, s string) {
f:= runes.FirstRune(s)
got , _ := utf8.DecodeRuneInString(s)
wantif want == utf8.RuneError {
.Skip() // don't bother testing invalid runes
t}
if want != got {
.Errorf("given %q (0x%[1]x): want '%c' (0x%[2]x)",
t, want)
s.Errorf("got '%c' (0x%[1]x)", got)
t}
})
}
And we implemented FirstRune
… poorly:
func FirstRune(s string) rune {
return 0
}
If the test didn’t fail with this function, we might start to worry a bit about whether it would detect any bugs. But it does indeed report, reassuringly:
go test -fuzz .
fuzz: elapsed: 0s, gathering baseline coverage: 0/2 completed
failure while testing seed corpus entry: FuzzFirstRune/seed#1
fuzz: elapsed: 0s, gathering baseline coverage: 0/2 completed
--- FAIL: FuzzFirstRune (0.02s)
--- FAIL: FuzzFirstRune (0.00s)
runes_test.go:22: given "world" (0x776f726c64): want 'w'
(0x77)
runes_test.go:24: got '' (0x0)
In other words, asking for the first rune in the string
"world"
gives us the rune value 0, which is incorrect: it
should be 'w'
.
So we’ve found the bug: this function does nothing useful whatsoever! What can we do to fix this?
Well, here’s a slightly more plausible version. Is this okay?
func FirstRune(s string) rune {
return rune(s[0])
}
We use the square bracket slice-index notation (s[0]
) to
extract the first byte of s
, and we return it having
converted it explicitly to the rune
type.
Let’s see what the fuzzer says about this:
go test -fuzz .
fuzz: elapsed: 0s, gathering baseline coverage: 0/2 completed
fuzz: elapsed: 0s, gathering baseline coverage: 2/2 completed,
now fuzzing with 8 workers
fuzz: minimizing 27-byte failing input file
fuzz: elapsed: 0s, minimizing
--- FAIL: FuzzFirstRune (0.16s)
--- FAIL: FuzzFirstRune (0.00s)
testing.go:1356: panic: runtime error: index out of range
[0] with length 0
[stack trace omitted]
What’s this “minimizing” about? When the fuzzer finds some input that fails the test, it tries smaller and smaller versions of it, looking for the minimal failing input:
fuzz: minimizing 27-byte failing input file
fuzz: elapsed: 0s, minimizing
In this example, we can see that it found a 27-byte input string that caused the fuzz target to fail. It then tries smaller and smaller versions of the input to see if they still fail. This process continues until it has found the minimal input that causes the fuzz target to fail.
And here it is:
--- FAIL: FuzzFirstRune (0.16s)
--- FAIL: FuzzFirstRune (0.00s)
testing.go:1356: panic: runtime error: index out of range
[0] with length 0
It seems that, in this case, the fuzzer managed to shrink the failing
input to a completely empty string (length 0
). This is
rather puzzling. What’s gone on here?
In fact, what’s probably happened is that the fuzzer generated a random 27-byte string whose first rune was multi-byte, causing the function to return the wrong answer. As the fuzzer tried shorter and shorter versions of the string, the same “wrong result” failure kept happening.
Finally, when it reduced the string to length zero, it got a
different failure: an “index out of range” panic. It so happens that
FirstRune
doesn’t cope well with empty input strings:
func FirstRune(s string) rune {
return rune(s[0]) // oops, slice overrun
}
As I’m sure you spotted right away, the slice index expression
s[0]
will panic when s
is empty, and indeed
that’s what we’re seeing. Let’s fix that:
func FirstRune(s string) rune {
if s == "" {
return utf8.RuneError
}
return rune(s[0])
}
It seems reasonable to follow the example set by the standard
library, which is to return utf8.RuneError
in this case.
Now the function won’t panic when given an empty input.
Detecting more subtle bugs
The fuzzer has already proved its worth, but how do we know there
aren’t still more bugs lurking in FirstRune
? Well, let’s
fuzz around and find out.
go test -fuzz .
fuzz: elapsed: 0s, gathering baseline coverage: 0/3 completed
fuzz: elapsed: 0s, gathering baseline coverage: 3/3 completed, now
fuzzing with 8 workers
fuzz: minimizing 32-byte failing input file
fuzz: elapsed: 0s, minimizing
--- FAIL: FuzzFirstRune (0.13s)
--- FAIL: FuzzFirstRune (0.00s)
fuzz_test.go:22: given "ʭ" (0xcaad): want 'ʭ' (0x2ad),
got 'Ê' (0xca)
Again, interpreting this output, we can see that the fuzzer found a 32-byte failing input, which it then managed to shrink to just 2 bytes:
given "ʭ" (0xcaad)
That is, this input string consists of the rune ʭ
, which
in UTF-8 is encoded as two bytes, 0xca
and
0xad
. If FirstRune
had correctly
decoded this string, it would have returned the rune ʭ
, the
same result as the oracle.
Instead, it mistakenly interpreted the first byte of the string
(0xca
) as a single-byte rune, in this case the letter
Ê
.
A victory for fuzzing! Even if we didn’t know about the distinction between runes and bytes in Go, and even if we might not have thought of using a string like “ʭ” as an example input, we were still able to find the bug.
Fixing the implementation
How should we have implemented FirstRune
correctly, then? Well, that’s not really important for the discussion of
fuzz testing, but I don’t like to leave anyone in suspense. Let’s see if
we can fix this bug.
There are a variety of ways we could write FirstRune
correctly, including the rather dull one of simply calling
utf8.DecodeRuneInString
, as the test does.
But here’s a slightly more interesting version, just for fun:
func FirstRune(s string) rune {
for _, r := range s {
return r
}
return utf8.RuneError
}
Why does this work? As you probably know, the range
operator iterates over a string by runes, not by bytes. So, on
each successive execution of this loop, r
will be the next
rune in the string, starting with the first.
But on this very first iteration, we hit the return
statement inside the loop body, and we return the first rune straight
away. After all, that’s the right answer!
In the case where s
is empty, the for
loop
will execute zero times; that is to say, not at all. We’ll go straight
to the end of the function and return the value 0.
I think this is correct, but let’s see if the fuzzer agrees. Or, rather, we’ll give the fuzzer a chance to find some counterexamples, and see if it succeeds:
go test -fuzz .
fuzz: elapsed: 0s, gathering baseline coverage: 0/4 completed
fuzz: elapsed: 0s, gathering baseline coverage: 4/4 completed, now
fuzzing with 8 workers
fuzz: elapsed: 3s, execs: 130517 (43494/sec), new interesting: 5
(total: 13)
fuzz: elapsed: 6s, execs: 268452 (45962/sec), new interesting: 5
(total: 13)
fuzz: elapsed: 9s, execs: 396881 (42797/sec), new interesting: 5
(total: 13)
...
And, in fact, I left this to run for a minute or two, but it didn’t find any more failing inputs. That doesn’t guarantee that there aren’t any, of course. Even the fuzzer can’t guarantee that.
What it can do is increase our confidence that we’ve found most of the low-hanging bugs. If we fuzz a function for a decent length of time and it doesn’t fail, that’s encouraging. If it does fail, we’ve generated a new test case that will help us catch regressions in the future. Either way, it’s fuzzing useful.
My book The Power of Go: Tests is a compendium of hard-won tips, tricks, and techniques for testing Go programs, including the largest and scariest codebases. Even more importantly, it’s all about how to use tests to drive great design, and build reliable programs that you can be proud of. I hope you’ll take a look.