Demystifying 'defer'

defer is one of those interesting features of Go that doesn’t really have a direct equivalent in other languages. Its behaviour can be a little confusing at first, but once you understand how to use defer, it’s really neat—and we use it all the time in Go programs.

Let’s talk about defer, then. First, what is the problem that it even solves?

It often happens that we do something in a Go function which requires some kind of cleanup before the function exits. For example, whenever we open a file, we must also close it before exit, or that resource will leak (that is, it will hang around taking up memory, in principle forever). The same applies to many other kinds of resources, such as HTTP response bodies or network connections.

So how can we clean up such a resource? We can explicitly call Close on it, like this:

f, err := os.Open("testdata/somefile.txt")
... // do something with f
f.Close()

Fine! The resource is cleaned up. But a function can exit at any point by using return, not just at the end. This creates a potential problem, doesn’t it? What if we exit the function before f.Close() is called? That would mean f is never closed, and it will hang around in memory forever.

For simple, one-shot programs like CLI tools, that’s maybe not so serious, since the program will exit soon anyway. But for long-running programs like servers, leakage means that the program’s resource usage keeps growing, until eventually it crashes.

If we’re diligent and disciplined programmers, which no doubt we are, we can just try to remember to call f.Close() at every place the function can return. But nobody’s perfect, and if someone else were to make changes to this function, they might not even be aware that they were supposed to close f before exiting. If there are several resources to clean up, this could also lead to a lot of duplicated code.

What we really want is some way to say “When this function is about to exit, no matter where or how, make sure f is closed”.

It turns out that Go’s defer keyword is exactly what we’re looking for. A defer statement takes some function call (for example f.Close()) and defers it.

That is, it doesn’t call the function now, but it “remembers” it for later. And when the function is about to exit for whatever reason, that’s when the deferred function call actually happens.

Here’s what that might look like with our file example:

f, err := os.Open("testdata/somefile.txt")
if err != nil {
    return err
}
defer f.Close()
... // do stuff with f

What’s happening here? First, we try to obtain f by calling os.Open. This may fail, in which case we don’t need to worry about closing the file, because we couldn’t open it! So we return an error in that case.

But now we know that we successfully opened f, so we immediately defer f.Close(). This doesn’t close f, but it “remembers” that we need f.Close() to be called on exit.

And now we can do anything we want in the rest of the function, safe in the knowledge that whatever happens, f will be automatically closed, without us having to worry about it. That’s what defer does. It’s pretty cool.
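To make that concrete, here’s a minimal sketch of a complete function with several exit points after the file is opened. The helper name firstLine and its behaviour (returning the first line of a file) are assumptions made up for illustration; the point is only that a single defer covers every return path.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"os"
)

// firstLine is a hypothetical helper that returns the first line of the
// file at path. After the file is opened there are three ways out of the
// function, and the single deferred f.Close() covers all of them.
func firstLine(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err // nothing to close: the open failed
	}
	defer f.Close() // runs when firstLine returns, by whichever path

	scanner := bufio.NewScanner(f)
	if !scanner.Scan() {
		if err := scanner.Err(); err != nil {
			return "", err // read error: f is still closed on exit
		}
		return "", errors.New("file is empty") // f is closed here too
	}
	return scanner.Text(), nil // and here
}

func main() {
	line, err := firstLine("testdata/somefile.txt")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(line)
}
```

If someone later adds another early return to firstLine, they don’t need to know anything about f: the deferred call still runs.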

Multiple defers

It’s not common, but it sometimes happens that we need to defer more than one function call. For example, perhaps we open two files, and we want to defer closing each one separately.

This is fine, and we can use the defer keyword as many times as we like in a function, and all the deferred calls will be executed when the function exits. It’s worth knowing that the calls are executed in reverse order: that is to say, last deferred, first run. This is sometimes referred to as stacking defers.

defer cleanup1() // executed last on exit
defer cleanup2() // executed first on exit
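Here’s a small runnable sketch of that ordering. The trace variable and the note and work functions are invented for this example; they just record the order in which things happen.

```go
package main

import "fmt"

// trace records the order in which calls actually run.
var trace []string

// note appends a label to trace.
func note(label string) {
	trace = append(trace, label)
}

// work defers two cleanup calls and then does its "real" work.
func work() {
	defer note("cleanup1") // deferred first, so it runs last
	defer note("cleanup2") // deferred last, so it runs first
	note("work body")
}

func main() {
	work()
	fmt.Println(trace) // [work body cleanup2 cleanup1]
}
```

The reverse order is usually what we want: a resource acquired later often depends on one acquired earlier, so it makes sense to release it first.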

Named result parameters

As you may know, when we’re declaring a function’s result list, we can choose to give those results names, or not. (Usually we don’t need to.) When and why would we want to give them names, then?

One very important reason is documentation. After all, source code is written for humans, not computers. If it’s not apparent from a function signature what its results represent, giving them names can help. Take our example from earlier:

func location() (float64, float64, error) {

This function presumably gets your location, if the name is anything to go by, and returns two float64 results (and maybe an error). But what are those two results? We can make that clearer by giving them names:

func location() (latitude float64, longitude float64, err error) {

Now the reader can see that the first value is the latitude co-ordinate (in some agreed co-ordinate system) and the second the longitude. We might have guessed that this was the case, but it’s nice to see it written down explicitly.

We don’t need to do anything special in the function, just because its results are named; we continue to return them as we would any other results:

return 50.5897, -4.6036, nil

But there’s one handy side-effect; the result parameters are now available inside the function for us to refer to, just as though they were input parameters or local variables. So we could assign to them, for example:

latitude = 50.5897
longitude = -4.6036
return latitude, longitude, nil
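Putting those pieces together, a complete version of the function might look something like this sketch. The hard-coded coordinates stand in for whatever a real implementation would look up. Note that Go requires either all of the results to be named or none of them, so the error result gets a name too (err here).

```go
package main

import "fmt"

// location returns a fixed position for illustration. All three results
// are named, because Go doesn't allow mixing named and unnamed results.
func location() (latitude float64, longitude float64, err error) {
	// Named results behave like local variables, so we can assign to them.
	latitude = 50.5897
	longitude = -4.6036
	// We still return the values explicitly rather than using a naked return.
	return latitude, longitude, nil
}

func main() {
	lat, long, err := location()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("latitude %.4f, longitude %.4f\n", lat, long)
}
```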

Naked returns considered harmful

The Go specification actually allows us to omit the names from the return statement in this case, and this would implicitly return whatever the values of latitude and longitude happen to be at this point. But even though that’s legal syntax, it’s not good practice.

It’s always clearer to specify the exact values or variables you’re returning, and there’s no benefit to omitting them. So you should avoid writing these so-called naked returns, even though you’ll sometimes see them in other people’s code.

In particular, you should be aware that just because a function has named result parameters, that doesn’t mean you must write a naked return. You can, and should, make your return values explicit.

Modifying result parameters after exit

So defer is useful for cleaning up resources before the function exits. Great. But what if this cleanup operation needs to change the function’s results?

For example, suppose we have some function which writes data to a file, and returns an error result to indicate whether it succeeded.

Naturally, we want to defer closing the file, to avoid leaking it. But that close operation itself could fail, and it’s quite likely that this would mean we’ve lost the user’s data. Suppose we ran out of disk space just before we wrote the last byte, for example, and on trying to close the file we get some error.

What to do? Our function must return the error, in order to let the user know that we lost their data. But the error doesn’t happen until the deferred call to f.Close(), and at that point the function’s result value is already set. Consider this (partial) code:

... // we successfully opened f
defer f.Close()
... // write data to f
... // everything good, so return nil error
return nil

If the call to f.Close() happens to fail, returning some error, there’s nothing we can do about it, because the function is going to return nil no matter what happens. It’s baked into the return statement. Or is it?
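To see the problem in action, here’s a minimal sketch that swaps the real file for a hypothetical failingFile type whose Close method always fails. The type and the save function are made up for this demonstration.

```go
package main

import (
	"errors"
	"fmt"
)

// failingFile is a hypothetical stand-in for a file whose Close always fails.
type failingFile struct{}

func (failingFile) Close() error {
	return errors.New("close failed")
}

// save defers Close directly, so whatever Close returns is thrown away.
func save(f failingFile) error {
	defer f.Close()
	// ... write data to f
	return nil
}

func main() {
	fmt.Println(save(failingFile{})) // <nil>
}
```

The caller sees a nil error, even though the close failed.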

Deferring a function literal

Because we can defer any function call, we don’t have to defer just f.Close(). Instead, we could write some function literal that calls f.Close(), and defer that:

defer func() {
    closeErr := f.Close()
    if closeErr != nil {
        fmt.Println("oh no")
    }
}()

(Note the empty parentheses at the end, after the function literal’s closing curly brace. We don’t defer a function, remember, but a function call, so having defined the anonymous function we want to defer, we then add the parentheses to call it.)

This is a significant improvement: we can catch the error returned by f.Close(), and if we can’t change the result returned by the function, we can at least bewail the situation by printing "oh no". That’s something, at least.

Deferring a closure

But we can do even better! If we named this function’s error result parameter (let’s call it err, to be conventional), then it’s available for setting inside the function. We can assign to err anywhere in the function, and that will be the value that the function ultimately returns.

How does that help in our deferred function literal? Well, a Go function literal is a closure: it can access all the variables in the enclosing scope, including err. Because it’s a closure on err, it can modify err before the function actually returns.

So we can opt to overwrite the value of err with whatever closeErr happens to be:

defer func() {
    closeErr := f.Close()
    if closeErr != nil {
        err = closeErr
    }
}()

We achieved the seemingly impossible: modifying the function’s result after its return statement ran (but before control actually passed back to the caller). All it took was a combination of named result parameters, defer, and a closure.
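Here’s a runnable sketch of the whole pattern. To make the close failure easy to trigger, it uses a hypothetical stubFile type in place of a real file, and accepts anything that satisfies io.Closer; those names, along with save and errCloseFailed, are assumptions for the sake of the example.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// errCloseFailed is a made-up error used to simulate a failing Close.
var errCloseFailed = errors.New("close failed")

// stubFile is a hypothetical stand-in for a file. Its Close method
// returns whatever error it was configured with.
type stubFile struct {
	closeErr error
}

func (s stubFile) Close() error {
	return s.closeErr
}

// save names its error result so that the deferred closure can set it.
func save(f io.Closer) (err error) {
	defer func() {
		closeErr := f.Close()
		if closeErr != nil {
			err = closeErr // replaces the nil set by the return statement
		}
	}()
	// ... write data to f
	return nil
}

func main() {
	fmt.Println(save(stubFile{}))                         // <nil>
	fmt.Println(save(stubFile{closeErr: errCloseFailed})) // close failed
}
```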

Note that it wouldn’t have been correct to simply set err to whatever f.Close() returned:

defer func() {
    err = f.Close() // this is wrong
}()

Why not? Because err already has some value, specified by the function’s return statement. We don’t want to erase that and overwrite it with, for example, nil, if the close operation succeeds. The only time we want to overwrite err is if f.Close() fails.
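A short sketch shows what goes wrong. Here the write step fails but the close succeeds; okFile and saveWrong are hypothetical names invented for the demonstration.

```go
package main

import (
	"errors"
	"fmt"
)

// okFile is a hypothetical stand-in for a file whose Close always succeeds.
type okFile struct{}

func (okFile) Close() error {
	return nil
}

// saveWrong assigns the result of Close to err unconditionally.
func saveWrong(f okFile) (err error) {
	defer func() {
		err = f.Close() // overwrites whatever the return statement set
	}()
	// Pretend the write failed.
	return errors.New("write failed")
}

func main() {
	fmt.Println(saveWrong(okFile{})) // <nil>
}
```

The write error has vanished, replaced by the nil from a successful close, and the caller is told that everything went fine.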

You won’t need to do anything quite as tricky as this in most functions, but it’s occasionally extremely useful.

In practice, a better pattern to use when dealing with writable files is to call f.Sync when you’re finished writing. This tells the operating system to flush all unsaved data to disk. Then it’s safe to simply defer f.Close(), ignoring any error:

f, err := os.Create("testdata/somefile.txt")
... // handle error
defer f.Close()
... // write data to f
return f.Sync()

If f.Sync doesn’t succeed, then at least we returned that error back to our caller, and we also avoided leaking f through the deferred call to f.Close. This is shorter, clearer, and less magical than deferring a closure over named result parameters, so I prefer it.
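For completeness, here’s one way the whole function might look, as a sketch under some assumptions: the writeFile name, its parameters, and the temporary path used in main are all invented for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFile is a hypothetical helper that writes data to the file at path,
// creating or truncating it as necessary.
func writeFile(path string, data []byte) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	// The error from Close is deliberately ignored here: by the time it
	// runs, Sync has already reported any failure to flush the data.
	defer f.Close()

	if _, err := f.Write(data); err != nil {
		return err
	}
	// Flush to disk and return any error to the caller.
	return f.Sync()
}

func main() {
	path := filepath.Join(os.TempDir(), "defer-sync-example.txt")
	defer os.Remove(path) // tidy up the example file on exit

	if err := writeFile(path, []byte("hello\n")); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("saved", path)
}
```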

Takeaways

  • The defer keyword in Go lets us defer execution of a function call until the current function exits.

  • This is helpful when, for example, closing opened files or otherwise cleaning up resources that could leak. Stacked defers run in reverse order.

  • You can name a function’s result parameters, and the main value of this is to document what those results mean.

  • Naked returns (bare return statement with no arguments) are legal syntactically, and you’ll even see them used in some old blog posts and tutorials. But there’s no good reason to use them, and it’s always clearer to return some explicit value, such as nil.

  • When a function has named results, you can modify them in a deferred closure: for example, to return some error encountered when trying to clean up resources.
