Sep 29

Sep 29 Starving, sleeping, and yielding: understanding Go's scheduler

…let not thy left hand know what thy right hand doeth…
—Matthew 6:3 (KJV)

The thing about concurrency is that, when there’s more than one thing going on at once, it’s not always easy to know what’s going on. Writing concurrent programs in Go is easy, but understanding the behaviour (or, more frequently, misbehaviour) of those programs is another matter.

In Go go goroutines we set ourselves a puzzle to solve about this program:

package main

import (
    "fmt"
)

func goroutineB() {
    for i := range 10 {
        fmt.Println("Hello from goroutine B!", i)
    }
}

func main() {
    go goroutineB()
    for i := range 10 {
        fmt.Println("Hello from goroutine A!", i)
    }
}

Since we know that the go statement causes these two functions to execute concurrently, we feel that when we run the program, we should see hello messages from both goroutine A and goroutine B. But, instead, we see only this:

Hello from goroutine A! 0
Hello from goroutine A! 1
Hello from goroutine A! 2
Hello from goroutine A! 3
...
Hello from goroutine A! 9

Goroutine A (the main function) ran to completion, but goroutine B (the goroutineB function) apparently never ran at all. What’s up with that?

Starvation

Recall that we said goroutines can be in one of three states: running, blocked, or ready. So, at the beginning of main, there is only goroutine A, and not surprisingly it’s in the running state.

Then we come to the go statement:

go goroutineB()

This asks the scheduler to create a new goroutine whose task is to run the function goroutineB. But, since goroutine A is currently running, and doesn’t yet need to block for any reason, the go statement just causes the new goroutine to be added to the ready queue.

Remember, the scheduler doesn’t even look at the ready queue unless the currently running goroutine either finishes or blocks for some reason. And goroutine A still has useful work to do, printing its ten messages.

And when goroutine A reaches the end of its for loop, it then exits the main function, and that’s the end of the program! Poor goroutine B is still sitting in the ready queue, waiting for its chance to run, which will now never come (it has starved, in the jargon).

New goroutines

We’ve learned something useful and interesting about Go concurrency already. You might have thought that a program would wait until all its goroutines have run to completion, but that turns out not to be the case.

Instead, the program exits as soon as some goroutine gets to the end of the main function. If there are goroutines waiting in the ready and blocked queues, that’s too bad: they get terminated along with the rest of the program.

If a running goroutine has useful work to do and it never blocks or deliberately yields the CPU, then no other goroutines will ever run. Go multitasking is solely co-operative, in other words (there are some minor footnotes to this, but we needn’t bother with them for now).

You might perfectly well imagine that the go statement would pause the current goroutine and start the new one running straight away, but that turns out not to be the case either. Instead, the go statement merely adds the new task to the ready queue, to be run at some point in the future (if it ever gets the chance).

You can think of the go keyword as meaning, not “run this now”, but “run this later… maybe!” In the meantime, the current goroutine continues executing as long as it can.

Being the scheduler

By the way, there’s a useful technique for figuring out why a concurrent program isn’t behaving the way you expected. I call it “being the scheduler”. Go through the code, as we’ve just done, asking yourself at every point what goroutines exist and what state each of them is in. Look for go statements creating new goroutines, and blocking operations that cause state transitions.

If you do this carefully and methodically, you’ll be able to understand the behaviour of any concurrent program, and most importantly, its possible misbehaviour. For example, we diagnosed the problem with our “hello goroutine” program as one of starvation: even though we created goroutine B, it never had a chance to run before the program ended.

My book The Deeper Love of Go will help you gain this kind of understanding and confidence with Go programming, not just about concurrency, but other tricky topics such as pointers, maps, slices, interfaces, and error handling. If you’re enjoying this excerpt, do consider buying the book!

Yielding with `time.Sleep`

If we want to see evidence of both goroutines running, we need to make sure that each of them regularly yields the CPU, allowing the other to run. The simplest way to do that is to pause them using the time.Sleep function, from the time package:

time.Sleep(10 * time.Millisecond)

It doesn’t really matter how long we pause for: pausing for any time at all will block the goroutine and trigger the scheduler to run the next task in its ready queue.

You might easily assume, when you see a statement like time.Sleep, that it would cause the current goroutine to stay on the CPU and wait, spinning its wheels, so to speak, until the desired time has elapsed. But that’s not actually what happens, because in a multi-tasking system, it would be a waste of CPU time.

As long as there are other goroutines in the ready queue, meaning that they have useful work to do on the CPU, the scheduler can and will put them into action.

Meanwhile, the goroutine calling time.Sleep will be parked: it joins the blocked queue and stays there until the timer expires. At that point it will go back into the ready queue.

Interleaved goroutines

Here’s what the program looks like with a time.Sleep added to each loop:

package main

import (
    "fmt"
    "time"
)

func goroutineB() {
    for i := range 10 {
        fmt.Println("Hello from goroutine B!", i)
        time.Sleep(10 * time.Millisecond)
    }
}

func main() {
    go goroutineB()
    for i := range 10 {
        fmt.Println("Hello from goroutine A!", i)
        time.Sleep(10 * time.Millisecond)
    }
}

(Listing hello/4)

Let’s run it again:

Hello from goroutine A! 0
Hello from goroutine B! 0
Hello from goroutine B! 1
Hello from goroutine A! 1
Hello from goroutine B! 2
Hello from goroutine A! 2
Hello from goroutine A! 3
Hello from goroutine B! 3
...
Hello from goroutine A! 9
Hello from goroutine B! 9

That’s more like it. The messages from each goroutine are interleaved, showing that their execution overlaps. We can see from this output that goroutine A runs a loop iteration or two, followed by goroutine B, then back to A for a bit, and so on.

How long each goroutine runs in its turn on the CPU depends what other goroutines (such as the garbage collector) are doing, how busy your CPU is with other programs on your computer, and various other factors. We can’t know in advance exactly what the output of a program like this will be, only that the two goroutines will run concurrently.

It’s not too hard to see that this kind of uncertainty can cause problems in real concurrent computer systems. In the next post we’ll look at some potential consequences for an imaginary Go-powered bookstore named Happy Fun Books. Till then, happy fun reading!