A practical optimization for the Go compiler

May 25, 2023
go golang compiler

Prologue

While reading random Go code, I see people writing something like this:

if respbody, err := io.ReadAll(bodyReader); err == nil {
	resp.ContentLength = int64(len(string(respbody)))
	resp.Data = respbody
}

respbody is a []byte, so why not simply write int64(len(respbody))? Why is there a string conversion at all?
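For comparison, here is a minimal sketch of the same logic without the conversion. The response type and the fill and fromString helpers are hypothetical names I made up for illustration; the original code's resp type is not shown, so treat this as an assumption about its shape rather than the real thing.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// response is a hypothetical stand-in for the type of resp in the snippet above.
type response struct {
	ContentLength int64
	Data          []byte
}

// fill reads the whole body and records its length in bytes.
func fill(resp *response, body io.Reader) error {
	respbody, err := io.ReadAll(body)
	if err != nil {
		return err
	}
	// len on a []byte already counts bytes, so no string conversion is needed.
	resp.ContentLength = int64(len(respbody))
	resp.Data = respbody
	return nil
}

// fromString is a hypothetical helper that runs fill over an in-memory body.
func fromString(s string) (response, error) {
	var resp response
	err := fill(&resp, strings.NewReader(s))
	return resp, err
}

func main() {
	resp, err := fromString("hello, gopher")
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(resp.ContentLength) // 13
}
```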

The problem

In terms of correctness, len(respbody) and len(string(respbody)) give the same result, since the len builtin returns the number of bytes in a string, not the number of characters.
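A small sketch to make that concrete. The lengths helper is a name I chose for this example; it compares the byte count taken directly from the slice, the byte count taken after the string conversion, and the rune count from unicode/utf8, which is the only one that differs for non-ASCII input.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// lengths returns the length of b measured three ways: directly on the
// slice, after converting to string, and as a count of UTF-8 runes.
func lengths(b []byte) (viaSlice, viaString, runes int) {
	return len(b), len(string(b)), utf8.RuneCount(b)
}

func main() {
	b := []byte("héllo") // 'é' takes two bytes in UTF-8
	viaSlice, viaString, runes := lengths(b)
	fmt.Println(viaSlice, viaString, runes) // 6 6 5
}
```

The first two numbers always agree, so the conversion buys nothing when all you want is the size in bytes.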

However, the compiler has not been taught to recognize that case (probably because no one should ever write that code?), so the string conversion still happens, making your code slower:

$ cat x_test.go
package x

import (
	"crypto/rand"
	"testing"
)

func f(s []byte) int {
	return len(string(s))
}

func g(s []byte) int {
	return len(s)
}

func BenchmarkX(b *testing.B) {
	length := 10
	s := randBytes(length)
	b.Run("len(string([]byte))", func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			if f(s) != length {
				b.Fatal("unexpected result")
			}
		}
	})
	b.Run("len([]byte)", func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			if g(s) != length {
				b.Fatal("unexpected result")
			}
		}
	})
}

func randBytes(n int) []byte {
	b := make([]byte, n)
	rand.Read(b)
	return b
}
$ go1.20.4 test -bench=. -benchmem x_test.go
goos: linux
goarch: amd64
cpu: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
BenchmarkX/len(string([]byte))-8         	370225268	         3.345 ns/op	       0 B/op	       0 allocs/op
BenchmarkX/len([]byte)-8                 	1000000000	         0.2214 ns/op	       0 B/op	       0 allocs/op
PASS
ok  	command-line-arguments	1.815s

NOTE

Looking at the assembly output of function f, you can see the slow part: the string conversion is still performed at run time.

Solution

Normally, we should not make the compiler more complicated to handle an unusual case. However, a quick search on GitHub indicates that people do write this in their code. Moreover, the fix is quite simple, so this practical optimization is worth doing to make user code faster.

CL 497276 was sent and submitted to address the issue, and will be part of the go1.21 release.

The benchmark now shows nearly identical running times for both cases:

$ go version
go version devel go1.21-4042b90001 Wed May 24 15:04:44 2023 +0000 linux/amd64
$ go test -bench=. -benchmem x_test.go
goos: linux
goarch: amd64
cpu: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
BenchmarkX/len(string([]byte))-8         	1000000000	         0.2207 ns/op	       0 B/op	       0 allocs/op
BenchmarkX/len([]byte)-8                 	1000000000	         0.2192 ns/op	       0 B/op	       0 allocs/op
PASS
ok  	command-line-arguments	0.490s

Epilogue

Quote from Keith Randall:

Man, people are strange. Probably a holdover from Java where the length in bytes and runes is different. I feel like this should be in a code-cleanliness sanitizer somewhere.

Happy optimizing.

Till next time!

