Reproducible Builds
GDG Berlin Golang
20 April 2015
Dave Cheney
I feel a certain degree of trepidation on stage today. Not just because of the size of the audience I am addressing, but because of the subject I will be discussing.
Dependency management in Go is the single most common question I have been asked consistently for several years now.
Dependency management is the equivalent of Python's GIL: a problem that everyone has, but one that has not been solved.
What I see is many people who are actively choosing to stay "standard library only", and I worry that more are silently sitting on the fence until a solution is found.
I have a requirement that at any time I can fetch the entire graph of source that went into a program, feed that to a compiler and produce a program that is identical to one created in the past.
This is the requirement I have, and this is the motivation for this talk. If you don't have this requirement, that's fine.
The plethora of tools that exist in this space shows that Go programmers have multiple, sometimes overlapping requirements. Again, that is fine. This is my solution for my requirements; it is my hope that I can convince you of its utility to you, but if you don't share my requirements, I may not be successful in my arguments.
Out of scope
OK, so now I've told you what I want; I need to explain to you why I don't feel that I have it today.
import "github.com/pkg/sftp" // yes, but which revision!
The most obvious reason is that the import statement inside a Go package does not give go get enough information to select, from the set of revisions available in a remote code repository, the specific revision to fetch.
That information simply isn't there.
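To make the point concrete, here is a minimal sketch that parses a source file with the standard go/parser package and prints everything an import declaration carries. The source text in src is a hypothetical file made up for illustration, and importPaths is a helper name introduced here. The only identifying data that comes back is the path string; there is no field for a revision, tag, or hash.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// src is a hypothetical source file used only for illustration.
const src = `package example

import (
	"fmt"
	"github.com/pkg/sftp"
)
`

// importPaths returns the import paths declared in a Go source file.
// It parses only the package clause and import declarations.
func importPaths(source string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "example.go", source, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var paths []string
	for _, spec := range f.Imports {
		// spec.Path.Value is the quoted string literal from the source.
		p, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return nil, err
		}
		paths = append(paths, p)
	}
	return paths, nil
}

func main() {
	paths, err := importPaths(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, p := range paths {
		// The path is all a fetch tool has to work with.
		fmt.Println(p)
	}
}
```

Whatever revision a tool ends up fetching for github.com/pkg/sftp has to be decided somewhere outside the source file.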
There are two rules for successful dependency management in Go.
Rule 1: Things that are different must have different import paths.
Who has written a log or logger package? They might all be called "log", but they are not the same package.
This is why we have namespaces: github.com/you/log, github.com/me/log.
Rule 2: Things that are the same must have the same import path.
Are these two packages the same, or are they different?
github.com/lib/pq
github.com/davecheney/foo/internal/github.com/lib/pq
They are the same; this is the same code. That is obvious to a human, but not to a computer.
To a compiler these are different packages.
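A rough sketch of one consequence, assuming a driver package that registers itself under a fixed name at init time, as database/sql drivers conventionally do. The fakeDriver type and the register helper are hypothetical stand-ins introduced for this example; each call to register plays the role of the init() function in one copy of the same package compiled in under a different import path.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
)

// fakeDriver is a hypothetical stand-in for a real database driver.
type fakeDriver struct{}

// Open satisfies driver.Driver; it never returns a usable connection.
func (fakeDriver) Open(name string) (driver.Conn, error) {
	return nil, errors.New("fakeDriver: not implemented")
}

// register mimics what a driver package's init() does and reports
// whether the call to sql.Register panicked.
func register(name string) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	sql.Register(name, fakeDriver{})
	return false
}

func main() {
	// Two copies of the "same" package each run their own init().
	fmt.Println("first copy panicked: ", register("fake"))
	fmt.Println("second copy panicked:", register("fake"))
}
```

sql.Register panics when the same driver name is registered twice, so the second copy fails even though a human would say the driver was only imported once.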
init() functions will run multiple times, e.g. database/sql.Register.
We cannot add anything to the import syntax for two reasons
import "github.com/pkg/term" "{hash,tag,version}"
We cannot embed anything in the import syntax
import "github.com/project/v7/library"
Leads to nightmarish scenarios where equality and type assertions are broken.
import "github.com/project/v9/lib" // registers itself as a dialer
import "github.com/project/dialer"
err := dialer.Dial("someurl")
fmt.Println(err == lib.ErrTimeout) => false
fmt.Printf("%T", err) => "lib.ErrTimeout"
fmt.Println(v7/lib.ErrTimeout == v9/lib.ErrTimeout) => false
So the first, and longest standing, solution to this problem is to always have a stable API.
If it worked, we wouldn't be having this conversation today.
If my time in system administration taught me anything, it's the unexpected failures that get you. You can plan for the big disasters, but it turns out that the little disasters can be just as disruptive.
These are all little disasters. You can usually find the code again; maybe it's just a quick sed rewrite and you're back in business.
But the little disasters look just like the big ones: code which built one day won't build the next.
The moral of the story is, if you are responsible for delivering a product written in Go, you need to be responsible for all the source that goes into that product.
Tools which fixup $GOPATH after go get
Problems
$GOPATH manually when moving between projects
Copying, vendoring, rewriting the source, is the new position from the Go team.
Problems
Virtual env all the things!
Problems
Every one of the existing solutions is hamstrung by the fact it is working around the limitations of the go tool.
Stop using go get. Don't use the go tool at all.
So, we're talking about writing a new build tool for Go, not a wrapper around an existing tool.
A new build tool should be project based.
This one I find hard to accept, but Go developers do not want to have any sort of configuration file to build their code.
I find this hard to rationalize because most repos that I look at have dozens of turds in them: Gruntfiles, Dockerfiles, Wercker configs, etc.
src in the path is a project.
$GOPATH.
package pdf // import "rsc.io/pdf"
If we're going to go to the extreme of divorcing ourselves from the go tool, then maybe we can fix a few other annoyances along the way.
-tags something now just works
go get ideas of correct DVCS usage, ie, can use ssh://github.com/...
% /usr/bin/gb
The problem is go get, not the import statement.
The go tool doesn't define the language, we can build a replacement.
go get github.com/constabulary/gb/...
$GOPATH.
go build