Don't lock around I/O

2018-04-10

Locks provide synchronization but may introduce significant performance problems if used poorly. HTTP handlers are a common place to find both locks and performance problems. In particular, it is easy to inadvertently lock around network I/O. To understand what this means, it helps to look at an example. For this post, we will be using Go.

We are going to build a small HTTP server that reports the number of requests it has received. All the code for this post may be found here.

A server that reports the number of requests might look like this:

package main

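import (
	"fmt"
	"log"
	"net/http"
	"strings"
	"sync"
)

const (
	payloadBytes = 1024 * 1024
)

var (
	mu    sync.Mutex // guards count
	count int
)

func main() {
	// Port 8080 matches the curl commands used later in the post.
	http.HandleFunc("/", root)
	log.Fatal(http.ListenAndServe(":8080", nil))
}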

// BAD: Don't do this.
func root(w http.ResponseWriter, r *http.Request) {
	mu.Lock()
	defer mu.Unlock()

	count++

	msg := []byte(strings.Repeat(fmt.Sprintf("%d", count), payloadBytes))
	w.Write(msg)
}

The root handler uses the common pattern of locking at the top of the function and unlocking with a defer statement. While still holding the lock, the handler increments count, builds a payload by repeating the decimal value of count payloadBytes times, and finally writes the payload to the http.ResponseWriter.

To the untrained eye, this handler may look perfectly correct. In fact, there is a significant performance problem. The handler holds the lock around network I/O, so requests are served one at a time and every request waits on the slowest client ahead of it.

To see this problem firsthand, we need to simulate a slow client reader. In fact, it is partly because some clients are so slow that configuring timeouts is an absolute necessity for any Go HTTP server exposed directly on the open internet. The simulation can be tricky, though, on account of the ways the kernel will buffer writes to and reads from TCP sockets. Let's say we create a client which initiates a GET request, but never reads any data from the socket (see here). Will this be enough to cause the server to block on w.Write?

Because the kernel buffers reads and writes, we won't see any slowdown, at least until the buffer is full. So to observe the slowdown, we need to make sure every write fills the buffer. There are two ways to do this: 1) tune the kernel, or 2) write a large number of bytes each time.

Tuning the kernel is itself a fascinating subject. There is the proc directory, there is documentation on all the network-related parameters, there are multiple tutorials on host tuning. But for our purposes, we will take the easy route and simply write a megabyte of data into the socket, which overwhelms the TCP buffers on a vanilla Darwin (v17.4) kernel. Note, to run this demo yourself, you may have to adjust the number of bytes to ensure the buffers are filled.

Now if we start the server, we can use the slow client to observe how fast clients are forced to wait for the slow client. Again, the slow client is here.

First, confirm a request is handled quickly with:

curl localhost:8080/

# Output:
# numerous 1's without any meaningful delay

Now, this time run the slow client first:

# Assuming $GOPATH/src/github.com/gobuildit/gobuildit/lock directory
go run client/main.go

# Output:
dialing
sending GET request
blocking and never reading

With the slow client connected to the server, now try to run a "fast" client:

curl localhost:8080/

# Hangs

We see firsthand how our locking strategy inadvertently blocks faster clients. If we return to our handler and think about our use of the lock, this behavior will make sense.

func root(w http.ResponseWriter, r *http.Request) {
	mu.Lock()
	defer mu.Unlock()

	// ...
}

By locking at the top of the function and adding a deferred call to unlock, we are holding the lock for the duration of the handler. This includes manipulation of shared state, a read of that shared state, and a write over the network. And herein lies the problem. Network I/O is inherently unpredictable. Granted, we may configure timeouts to protect our server from excessively long calls, but we cannot say that all network I/O will complete within a fixed time period.

The key takeaway is to not lock around I/O. In the case here, locking around I/O provides no value whatsoever. By locking around I/O we leave our program susceptible to an unreliable network and to any slow clients. In effect, we are ceding partial control of our program's synchronization.

Let's rewrite the handler to lock around just the critical section.

// GOOD: Keep the critical section as small as possible and don't lock around
// I/O.
func root(w http.ResponseWriter, r *http.Request) {
	mu.Lock()
	count++
	current := count
	mu.Unlock()

	msg := []byte(strings.Repeat(fmt.Sprintf("%d", current), payloadBytes))
	w.Write(msg)
}

To see the difference, try testing with a slow client and a regular client.

Again, start the slow client:

# Assuming $GOPATH/src/github.com/gobuildit/gobuildit/lock directory
go run client/main.go

Now, use curl to send a request:

curl localhost:8080/

Observe how the curl client immediately returns with the expected request count.

Granted, this example is contrived and much simpler than typical production code. And for synchronized counters, one would be wise to consider the various functions in the sync/atomic package. Nonetheless, I hope it illustrates the importance of thinking carefully about the scope of one's locks. Although there are always exceptions to the rule, in most cases a lock's scope should not include I/O.


Thanks to Jean de Klerk for reading an early draft of this post.