Fixing regression bug in Go compiler

August 17, 2021
go golang compiler

Prologue

On August 15, 2021, @johejo filed an issue about an internal compiler error in the Go compiler. This is a regression: it happens on the current tip, but not in go1.17 or in older Go versions.

To reproduce the bug:

git clone https://gitlab.com/cznic/libc.git
cd libc/
go build ./...

and the error:

# modernc.org/libc
walk
.   RECOVER tc(1) INTER-interface {} # libc.go:90:21 INTER-interface {}
./libc.go:90:21: internal compiler error: walkExpr: switch 1 unknown op RECOVER

<truncated error>

git bisect pointed me to this commit.

Since I had reviewed CL 330192 before, I had some ideas in mind, but I couldn’t know for sure without a concrete, simpler reproducer.

Let’s try it!

Reproducing the bug

First, look at the line (./libc.go:90) where the panic happens:

if dmesgs {
	wd, err := os.Getwd()
	dmesg("%v: %v, wd %v, %v", origin(1), os.Args, wd, err)

	defer func() {
		if err := recover(); err != nil {
			dmesg("%v: CRASH: %v\n%s", origin(1), err, debug.Stack())
		}
	}()
}

Hmm… nothing special. Let’s try writing a minimal reproducer:

package p

func f() {
	defer func() {
		_ = recover()
	}()
}

Compile it:

$ go tool compile p.go
$

Success!

So there must be something special. Looking at the original code again, I noticed the condition if dmesgs, so let’s see what dmesgs is. With help from gopls, I could see that it’s a constant false:

const dmesgs = false

Let’s adjust the reproducer:

package p

func f() {
	if false {
		defer func() {
			_ = recover()
		}()
	}
}

Now:

$ go tool compile p.go
walk
.   RECOVER tc(1) INTER-interface {} # p.go:6:15 INTER-interface {}
p.go:6:15: internal compiler error: walkExpr: switch 1 unknown op RECOVER
<truncated output>

The bug is now reproducible. Let’s examine why it happens.

Investigating

Since the panic happens during the walk pass, let’s use go tool compile -W to examine the generated AST:

$ go tool compile -W p.go
before walk f <nil>
after walk f <nil>

before walk f.func1
.   AS tc(1) # p.go:6:6
.   .   NAME-p._ tc(1) Offset:0 blank
.   .   RECOVER tc(1) INTER-interface {} # p.go:6:15 INTER-interface {}
walk
.   RECOVER tc(1) INTER-interface {} # p.go:6:15 INTER-interface {}
p.go:6:15: internal compiler error: walkExpr: switch 1 unknown op RECOVER

Oh! Two things caught my eye:

The body of f is empty because the compiler runs an early deadcode pass after typechecking, so it evaluates the if false condition and discards the if body.

f.func1 is the node that represents the function literal in the defer call:

defer func() {
	_ = recover()
}()

At this point, I have enough information to know how the bug happens:

But in this case, f.func1 isn’t part of f anymore, due to the deadcode pass above. Thus, the desugaring of ORECOVER during escape analysis never happens for f.func1, causing the compiler to go boom!


NOTE

If you turn if false into if true, and re-run go tool compile -W, you will get a clearer picture.

I will leave it as an exercise for the reader.


Fixing

I sent CL 342350 to fix the bug.

The idea is simple:

Epilogue

Working on the Go compiler is quite fun, and it helps me learn a lot. I encourage you to give it a try, and I hope you will feel the same!

If you have any questions, feel free to shoot me an email.

Thanks for reading so far.

Till next time!

