Go += Package Versioning
(Go & Versioning, Part 1)
Posted on Tuesday, February 20, 2018.
We need to add package versioning to Go.
More precisely, we need to add the concept of package versions to the working vocabulary
of both Go developers and our tools, so that they can all be precise when talking
to each other about exactly which program should be built, run, or analyzed.
The go command needs to be able to tell developers exactly which versions
of which packages are in a particular build, and vice versa.
Versioning will let us enable reproducible builds, so that if I tell you to try the latest version of my program, I know you’re going to get not just the latest version of my code but the exact same versions of all the packages my code depends on, so that you and I will build completely equivalent binaries.
Versioning will also let us ensure that a program builds
exactly the same way tomorrow as it does today.
Even when there are newer versions of my dependencies,
the go command shouldn’t start using them until asked.
Although we must add versioning,
we also must not remove the best parts of the current go command:
its simplicity, speed, and understandability.
Today, many programmers mostly don’t pay attention to versioning,
and everything mostly works fine.
If we get the model and the defaults right,
we should be able to add versioning
in such a way that programmers still mostly don’t pay attention
to versioning, and everything just works better and is easier to understand.
Existing workflows should change as little as possible.
Releasing new versions should be very easy.
In general, version management work must fade to the background,
not be a day-to-day concern.
In short, we need to add package versioning, but we need to do it without breaking go get.
This post sketches a proposal for doing exactly that,
along with a prototype demonstration that you can try today
and that hopefully will be the basis for eventual go command integration.
I intend this post to be the start of a productive discussion about
what works and what doesn’t. Based on that discussion, I will make adjustments to both the
proposal and the prototype, and then I will submit an
official Go proposal,
for integration into Go 1.11 as an opt-in feature.
This proposal keeps the best parts of go get,
adds reproducible builds,
adopts semantic versioning,
eliminates vendoring,
deprecates GOPATH in favor of a project-based workflow,
and provides for a smooth migration from dep and its predecessors.
That said, this proposal is still in its early stages.
If details are not right yet, we will take the time to
fix them before the work lands in the main Go distribution.
Background
Before we look at the proposal, let’s look at how we got where we are today. This is maybe a little long, but the history has important lessons for the present and helps to understand why the proposal changes what it does. If you are impatient, feel free to skip ahead to the proposal, or read the accompanying example blog post.
Makefiles, goinstall, and go get
In November 2009, the initial release of Go was a compiler, linker, and some libraries.
You had to run 6g and 6l
to compile and link your programs,
and we included sample makefiles.
There was a minimal wrapper gobuild that could build a single package
and write an appropriate makefile, in most cases.
There was no established way to share code with other people.
We knew more was needed, but we released what we had,
planning to do the rest with the community.
In February 2010, we proposed goinstall,
a new, zero-configuration command for downloading packages from source control repositories like
Bitbucket and GitHub.
Goinstall introduced the import path conventions Go developers take for granted today.
Because no code at the time followed those conventions,
goinstall at first only worked with packages
that imported nothing beyond the standard library.
But developers quickly migrated from their own varied naming schemes
to the uniform convention we know today, and the set of
published Go packages grew into a coherent ecosystem.
Goinstall also eliminated makefiles and, with them, the complexity
of user-defined build variations.
While it is occasionally inconvenient to package authors not to
be able to generate code during each build, that
simplification has been incredibly important to package users:
a user never has to worry about first installing the
same set of tools as the package author used before building a package.
The simplification has also been crucial to tooling.
A makefile is an imperative, step-by-step recipe for compiling
a package; reverse-engineering how to apply a different tool,
like go vet or code completion, to the same package, can be
quite difficult.
Even getting build dependencies right, so that packages are rebuilt
when necessary and only when necessary, is much
harder with arbitrary makefiles.
Although some people objected at the time that flexibility was being
taken away, it is clear in retrospect that the benefits far outweighed
the inconvenience.
In December 2011, as part of preparation for Go 1,
we introduced the go command,
which replaced goinstall with go get.
On the whole, go get has been transformative, enabling Go developers
to share source code and build on each other’s work,
and enabling tooling by
isolating details of the build system inside the go command.
But go get is missing any concept of versioning.
It was clear in the very first discussions of goinstall that
we needed to do something about versioning.
Unfortunately, it was not clear, at least to us on the Go team, exactly what to do.
When go get needs a package, it always fetches the latest copy,
delegating the download and update operations to version control
systems like Git or Mercurial.
This ignorance of package versioning has led to at least two significant shortcomings.
Versioning and API Stability
The first significant shortcoming of go get is that, without a concept of versioning,
it cannot convey to users any expectations about what kinds of changes to expect in a given update.
In November 2013, Go 1.2 added a FAQ entry about package versioning that gave this basic advice (unchanged as of Go 1.10):
Packages intended for public use should try to maintain backwards compatibility as they evolve. The Go 1 compatibility guidelines are a good reference here: don’t remove exported names, encourage tagged composite literals, and so on. If different functionality is required, add a new name instead of changing an old one. If a complete break is required, create a new package with a new import path.
In March 2014, Gustavo Niemeyer created gopkg.in,
advertising “stable APIs for the Go language.”
The domain is a version-aware GitHub redirector,
allowing import paths like gopkg.in/yaml.v1 and gopkg.in/yaml.v2 to refer to
different commits (perhaps on different branches) of a single Git repository.
Following semantic versioning, authors are expected to introduce
a new major version when making a breaking change,
so that later versions of a v1 import path can be expected
to be drop-in replacements for earlier ones, while a v2 import
path may be a completely different API.
In August 2015, Dave Cheney filed a proposal to adopt semantic versioning. That prompted an interesting discussion over the next few months, in which everyone seemed to agree that tagging code with semantic versions seemed like a fine idea, but no one knew the next step: what should tools do with these versions?
Any discussion of semantic versioning inevitably includes counterarguments citing Hyrum’s law, which states:
With a sufficient number of users of an API, it does not matter what you promise in the contract. All observable behaviors of your system will be depended on by somebody.
While Hyrum’s law is empirically true, semantic versioning is still a useful way to frame expectations about the relationships between releases. Updating from 1.2.3 to 1.2.4 should not break your code, while updating from 1.2.3 to 2.0.0 may. If your code stops working after an update to 1.2.4, the author is likely to welcome a bug report and issue a fix in 1.2.5. If your code stops working (or even compiling) after an update to 2.0.0, that change has a much greater chance of being intentional and a correspondingly lesser chance of being fixed to your code’s liking in 2.0.1.
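The expectation described above can be written down in a few lines of Go. This is only a sketch: majorOf and sameMajor are hypothetical helpers, not part of any tool, and they handle only plain three-part versions. They also treat major version 0 like any other, although semantic versioning makes no stability promise for 0.y.z releases.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorOf extracts the major number from a plain "MAJOR.MINOR.PATCH" string,
// with or without a leading "v". It is a hypothetical helper meant only for
// plain three-part versions.
func majorOf(version string) (int, error) {
	parts := strings.Split(strings.TrimPrefix(version, "v"), ".")
	if len(parts) != 3 {
		return 0, fmt.Errorf("malformed version %q", version)
	}
	return strconv.Atoi(parts[0])
}

// sameMajor reports whether an update from one version to another stays
// within a single major version, which is the case where semantic
// versioning sets the expectation that existing code keeps working.
func sameMajor(from, to string) (bool, error) {
	a, err := majorOf(from)
	if err != nil {
		return false, err
	}
	b, err := majorOf(to)
	if err != nil {
		return false, err
	}
	return a == b, nil
}

func main() {
	for _, to := range []string{"1.2.4", "2.0.0"} {
		ok, err := sameMajor("1.2.3", to)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("1.2.3 -> %s: expected to be compatible: %v\n", to, ok)
	}
}
```

A true result is an expectation, not a guarantee, which is exactly the point Hyrum’s law makes.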
Instead of concluding from Hyrum’s law that semantic versioning is impossible, I conclude that builds should be careful to use exactly the same versions of each dependency that the author did, unless forced to do otherwise. That is, builds should default to being as reproducible as possible.
Vendoring and Reproducible Builds
The second significant shortcoming of go get is that, without a concept of versioning,
it cannot ensure or even express the idea of a reproducible build.
There is no way to be sure that your users are compiling the
same versions of your code’s dependencies that you did.
In November 2013, the Go 1.2 FAQ also added this basic advice:
If you’re using an externally supplied package and worry that it might change in unexpected ways, the simplest solution is to copy it to your local repository. (This is the approach Google takes internally.) Store the copy under a new import path that identifies it as a local copy. For example, you might copy "original.com/pkg" to "you.com/external/original.com/pkg". Keith Rarick’s goven is one tool to help automate this process.
Goven, which Keith Rarick had started in March 2012,
copied a dependency into your repository and also updated all the import paths
within it to reflect the new location.
Modifying the source code of the dependency in this way was necessary to
make it build but was also unfortunate.
The modifications made it harder to
compare against and incorporate newer copies and
required updates to other copied code using that
dependency.
In September 2013, Keith announced godep,
“a new tool for freezing package dependencies.”
The main advance in godep was to add what we now understand as Go vendoring (that is, to copy
dependencies into the project without modifying the source files) without direct toolchain support,
by setting up GOPATH in a certain way.
In October 2014, Keith proposed adding support for the concept of “external packages”
to the Go toolchain, so that tools could better understand
projects using that convention.
By then, there were multiple efforts similar to godep.
Matt Farina wrote a blog post, “Glide in the Sea of Go Package Managers,”
comparing godep with the newer arrivals, most notably glide.
In April 2015, Dave Cheney introduced gb,
a “project-based build tool ... that permits repeatable builds via source vendoring,”
again without import rewriting.
(Another motivation for gb was to avoid the requirement that code be
stored in specific directories in GOPATH, which is not a good match
for many developer workflows.)
That spring, Jason Buberel surveyed the Go package management
landscape to understand what could be done to unify these multiple efforts
and avoid duplication and wasted work.
His survey made it clear to us on the Go team that the go command
needed direct support for vendoring without import rewriting.
At the same time, Daniel Theophanes started a specification for
a file format to describe the exact provenance and version of code in a vendor directory.
In June 2015, we accepted Keith’s proposal as the Go 1.5 vendor experiment,
optional in Go 1.5 and enabled by default in Go 1.6.
We encouraged all vendoring tool authors to work with Daniel to adopt
a single metadata file format.
Incorporating the concept of vendoring into the Go toolchain allowed
program analysis tools like go vet to better understand projects using vendoring,
and today there are a dozen or so Go package managers or vendoring tools
that manage vendor directories.
On the other hand, because these tools all use different metadata file formats,
they do not interoperate and cannot easily share information about dependency requirements.
More fundamentally, vendoring is an incomplete solution to the
package versioning problem. It only provides reproducible builds.
It does nothing to help understand package versions and
decide which version of a package to use.
Package managers like glide and dep add the concept
of versioning onto Go builds implicitly, without direct toolchain support,
by setting up the vendor directory a certain way.
As a result, the many tools in the Go ecosystem cannot be made
properly aware of versions.
It’s clear that Go needs direct toolchain support for package versions.
An Official Package Management Experiment
At GopherCon 2016, a group of interested gophers got together
on Hack Day (now Community Day)
for a wide-ranging discussion of Go package management.
One outcome was the formation of a committee and an advisory group
for package management work,
with a goal of creating a new tool for Go package management.
The vision was for that tool to unify and replace the existing ones,
but it would still be implemented outside the direct toolchain, using vendor directories.
The committee—Andrew Gerrand, Ed Muller, Jessie Frazelle, and Sam Boyer, organized by Peter Bourgon—drafted a spec
and then, led by Sam, implemented it as dep.
For background,
see Sam’s February 2016 post “So you want to write a package manager,”
his December 2016 post “The Saga of Go Dependency Management,”
and his July 2017 GopherCon talk, “The New Era of Go Package Management.”
Dep serves many purposes:
it is an important improvement over current practice that’s usable today,
it is an important step toward a solution,
and it is also an experiment—we call it an “official experiment”—that
helps us learn more about what does and does not work well for Go developers.
But dep is not a direct prototype of the
eventual go command integration of package versioning.
It is a powerful, almost arbitrarily flexible way to explore the design space,
serving a role like makefiles did when we were grappling with how to build Go programs.
But once we understand the design space better and can narrow it down
to the few key features that must be supported,
it will help the Go ecosystem to remove the other features,
to reduce expressiveness,
to adopt enforced conventions
that make Go code bases more uniform and easier to understand
and make tooling easier to build.
This post is the beginning of the next step after dep:
the first draft of a prototype of the final go command integration,
the package management equivalent of goinstall.
The prototype is a standalone command we call vgo.
It is a drop-in replacement for the go command,
but it adds support for package versioning.
This is a new experiment, and we will see what we can learn from it.
Like when we introduced goinstall,
some code and projects already work with vgo today,
and other projects will need changes to be made compatible.
We will be taking away some control and expressiveness,
just as we took away makefiles,
in service of simplifying the system and eliminating complexity
for users.
Generally, we are looking for early adopters to help us
experiment with vgo, so that we can learn as much as possible from users.
Starting to experiment with vgo does not mean ending support for dep.
We will keep dep available until the path to full go command integration
is decided, implemented, and generally available.
We will also work to make the eventual transition from dep to the
go command integration, in whatever form it takes,
as smooth as possible.
Projects that have not yet converted to dep can still reap real benefits from doing so.
(Note that both godep and glide have ended
active development and encourage migrating to dep.)
Other projects may wish to move directly to vgo, if it serves their needs already.
Proposal
The proposal for adding versioning to the go command has four steps.
First, adopt the import compatibility rule hinted at by the Go FAQ and gopkg.in;
that is, establish the expectation that newer versions of a package with a given import path
should be backwards-compatible with older versions.
Second, use a simple, new algorithm, known as minimal version selection,
to choose which package versions are used in a given build.
Third, introduce the concept of a Go module, a group of packages that is versioned as a single unit
and that declares the minimum requirements that must be satisfied by its dependencies.
Fourth, define how to retrofit all this into the existing go command,
so that basic workflows do not change significantly from today.
The rest of this section introduces each of these steps.
Other blog posts this week will go into more detail.
The Import Compatibility Rule
Nearly all pain in package management systems is caused by trying to tame incompatibility. For example, most systems allow package B to declare that it requires package D 6 or later, and then allow package C to declare that it requires D 2, 3, or 4, but not 5 or later. If you are writing package A, and you want to use both B and C, then you are out of luck: there is no one single version of D that can be chosen to build both B and C into A. There is nothing you can do about it: these systems say that what B and C did was acceptable—they effectively encourage it—so you are just stuck.
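The dead end in this example is easy to reproduce. The sketch below is a toy model, not how any particular package manager is implemented: versionRange and pick are hypothetical names, and versions of D are reduced to single integers.

```go
package main

import "fmt"

// versionRange is a hypothetical constraint on the version of a dependency:
// Min <= version, and version <= Max when Max > 0 (Max == 0 means no upper bound).
type versionRange struct{ Min, Max int }

// pick returns the newest available version that satisfies every range,
// or false if no version satisfies all of them.
func pick(available []int, ranges ...versionRange) (int, bool) {
	best, found := 0, false
	for _, v := range available {
		ok := true
		for _, r := range ranges {
			if v < r.Min || (r.Max > 0 && v > r.Max) {
				ok = false
				break
			}
		}
		if ok && (!found || v > best) {
			best, found = v, true
		}
	}
	return best, found
}

func main() {
	available := []int{1, 2, 3, 4, 5, 6, 7} // published versions of D
	b := versionRange{Min: 6}               // B: D 6 or later
	c := versionRange{Min: 2, Max: 4}       // C: D 2, 3, or 4, but not 5 or later

	if v, ok := pick(available, b); ok {
		fmt.Println("B alone builds with D", v)
	}
	if v, ok := pick(available, c); ok {
		fmt.Println("C alone builds with D", v)
	}
	if _, ok := pick(available, b, c); !ok {
		fmt.Println("A, which needs both B and C, has no usable version of D")
	}
}
```

Each constraint is satisfiable on its own; only the combination fails, and the author of A is the one left holding the problem.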
Instead of designing a system that inevitably leads to large programs not building, this proposal requires that package authors follow the import compatibility rule:
If an old package and a new package have the same import path,
the new package must be backwards-compatible with the old package.
The rule is a restatement of the suggestion from the Go FAQ, quoted earlier. The quoted FAQ text ended by saying, “If a complete break is required, create a new package with a new import path.” Developers today expect to use semantic versioning to express such a break, so we integrate semantic versioning into our proposal. Specifically, major version 2 and later can be used by including the version in the path, as in:
import "github.com/go-yaml/yaml/v2"
Creating v2.0.0, which in semantic versioning denotes a major break, therefore creates a new package with a new import path, as required by import compatibility. Because each major version has a different import path, a given Go executable might contain one of each major version. This is expected and desirable. It keeps programs building and allows parts of a very large program to update from v1 to v2 independently.
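As a small illustration of how the major version ends up in the path, here is a sketch in Go. The importPath helper is hypothetical and exists only for this example. It assumes that major versions 0 and 1 use the bare module path, which follows from the rule that only version 2 and later add a suffix.

```go
package main

import (
	"fmt"
	"strconv"
)

// importPath returns the import path prefix for a module at a given major
// version. Hypothetical helper: versions 0 and 1 use the bare module path,
// and version 2 and later append "/vN".
func importPath(modulePath string, major int) string {
	if major < 2 {
		return modulePath
	}
	return modulePath + "/v" + strconv.Itoa(major)
}

func main() {
	const yaml = "github.com/go-yaml/yaml"
	fmt.Println(importPath(yaml, 1)) // github.com/go-yaml/yaml
	fmt.Println(importPath(yaml, 2)) // github.com/go-yaml/yaml/v2
}
```

Because the two results are different strings, the build treats them as different packages, which is what lets one executable contain both.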
Expecting authors to follow the import compatibility rule lets us avoid trying to tame incompatibility, making the overall system exponentially simpler and the package ecosystem less fragmented. In practice, of course, despite the best efforts of authors, updates within the same major version do occasionally break users. Therefore, it’s important to use an upgrade mechanism that doesn’t upgrade too quickly. That brings us to the next step.
Minimal Version Selection
Nearly all package managers today, including dep and cargo,
use the newest allowed version of packages involved in the build.
I believe this is the wrong default,
for two important reasons.
First, the meaning of “newest allowed version” can change due to external events,
namely new versions being published.
Maybe tonight someone will introduce a new version of some dependency,
and then tomorrow the same sequence of commands you ran today
would produce a different result.
Second, to override this default, developers spend their time
telling the package manager “no, don’t use X,”
and then the package manager spends its time
searching for a way not to use X.
This proposal takes a different approach, which I call minimal version selection. It defaults to using the oldest allowed version of every package involved in the build. This decision does not change from today to tomorrow, because no older version will be published. Even better, to override this default, developers spend their time telling the package manager, “no, use at least Y,” and then the package manager can trivially decide which version to use. I call this minimal version selection because the versions chosen are minimal and also because the system as a whole is perhaps also minimal, avoiding nearly all the complexity of existing systems.
Minimal version selection allows modules to specify only minimum requirements for their dependency modules. It gives well-defined, unique answers for both upgrade and downgrade operations, and those operations are efficient to implement. It also allows the author of the overall module being built to specify dependency versions to exclude, or to specify that a specific dependency version be replaced by a forked copy, either in the local file system or published as its own module. These exclusions and replacements do not apply when the module is being built as a dependency of some other module. This gives users full control over how their own programs build, but not over how other people’s programs build.
Minimal version selection delivers reproducible builds by default, without a lock file.
Import compatibility is key to minimal version selection’s simplicity. Instead of users saying “no, that’s too new,” they can only say “no, that’s too old.” In that case, the solution is clear: use a (minimally) newer version. And newer versions are agreed to be acceptable replacements for older ones.
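This post does not spell out the algorithm, so the following is only a minimal sketch of the idea under simplifying assumptions: versions are plain integers rather than semantic version strings, exclusions and replacements are left out, and the names modVer and buildList are invented for the example. It walks every requirement reachable from the module being built and keeps, for each module path, the highest of the stated minimums.

```go
package main

import (
	"fmt"
	"sort"
)

// modVer names one version of one module. Versions are plain integers here
// to keep the sketch short.
type modVer struct {
	Path    string
	Version int
}

// buildList visits every module version reachable from the root module's
// requirements and records, for each module path, the highest version that
// any visited module asked for. No version newer than what was explicitly
// requested is ever selected.
func buildList(rootReqs []modVer, reqs map[modVer][]modVer) []modVer {
	selected := map[string]int{} // module path -> highest minimum seen so far
	seen := map[modVer]bool{}

	var visit func(m modVer)
	visit = func(m modVer) {
		if seen[m] {
			return
		}
		seen[m] = true
		if v, ok := selected[m.Path]; !ok || m.Version > v {
			selected[m.Path] = m.Version
		}
		for _, dep := range reqs[m] {
			visit(dep)
		}
	}
	for _, m := range rootReqs {
		visit(m)
	}

	// Sort by path so the result is deterministic.
	list := make([]modVer, 0, len(selected))
	for path, v := range selected {
		list = append(list, modVer{path, v})
	}
	sort.Slice(list, func(i, j int) bool { return list[i].Path < list[j].Path })
	return list
}

func main() {
	// The module being built requires B 1 and C 2.
	rootReqs := []modVer{{"B", 1}, {"C", 2}}
	reqs := map[modVer][]modVer{
		{"B", 1}: {{"D", 1}}, // B 1 needs at least D 1
		{"C", 2}: {{"D", 3}}, // C 2 needs at least D 3
	}
	fmt.Println(buildList(rootReqs, reqs)) // [{B 1} {C 2} {D 3}]
}
```

The result uses D 3 no matter how many newer versions of D have been published, because nothing in the build asks for them. Running the same inputs tomorrow gives the same answer.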
Defining Go Modules
A Go module is a collection of packages sharing a common import path prefix, known as the module path. The module is the unit of versioning, and module versions are written as semantic version strings. When developing using Git, developers will define a new semantic version of a module by adding a tag to the module’s Git repository. Although semantic versions are strongly preferred, referring to specific commits will be supported as well.
A module defines, in a new file called go.mod, the minimum version requirements of other modules it depends on. For example, here is a simple go.mod file:
// My hello, world.

module "rsc.io/hello"

require (
	"golang.org/x/text" v0.0.0-20180208041248-4e4a3210bb54
	"rsc.io/quote" v1.5.2
)
This file defines a module, identified by path rsc.io/hello, which itself depends on two other modules: golang.org/x/text and rsc.io/quote. A build of a module by itself will always use the specific versions of required dependencies listed in the go.mod file. As part of a larger build, it will only use a newer version if something else in the build requires it.
Authors will be expected to tag releases with semantic versions,
and vgo encourages using tagged versions, not arbitrary commits.
The rsc.io/quote module, served from github.com/rsc/quote, has tagged versions, including v1.5.2.
The golang.org/x/text module, however, does not yet provide tagged versions.
To name untagged commits, the pseudo-version v0.0.0-yyyymmddhhmmss-commit
identifies a specific commit made on the given date.
In semantic versioning, this string corresponds to a v0.0.0 prerelease, with prerelease identifier yyyymmddhhmmss-commit.
Semantic versioning precedence rules order such prereleases before v0.0.0 or any later version,
and they order prereleases by string comparison.
Placing the date first in the pseudo-version syntax ensures that string comparison matches date comparison.
In addition to requirements, go.mod files can specify the exclusions and replacements mentioned in the previous section,
but again those are only applied when building the module directly, not when building the module as part of a larger program.
The examples illustrate all of these.
Goinstall and old go get invoke version control tools like git and hg directly to download code,
leading to many problems, among them fragmentation:
users without bzr cannot download code stored in Bazaar repositories, for example.
In contrast, modules are always zip archives served over HTTP.
Before, go get had special cases to choose the version control commands for popular code hosting sites.
Now, vgo has special cases to use those hosting sites’ APIs to fetch archives.
The uniform representation of modules as zip archives makes possible
a trivial protocol for and implementation of a module-downloading proxy.
Companies or individuals can run proxies for any number of reasons,
including security and wanting to be able to work from cached copies
in case the originals are removed.
With proxies available to ensure availability
and go.mod to define which code to use,
vendor directories are no longer needed.
The go command
The go command must be updated to work with modules.
One significant change is that ordinary build commands,
like go build, go install, go run, and go test,
will resolve new dependencies on demand.
All it takes to use golang.org/x/text in a brand new module
is to add an import to the Go source code and build the code.
The most significant change, though, is the end of GOPATH as
a required place to work on Go code.
Because the go.mod file includes the full module path
and also defines the version of every dependency in use,
a directory with a go.mod file marks the root of a
directory tree that serves as a self-contained work space,
separate from any other such directories.
Now you just git clone, cd, and start writing.
Anywhere.
No GOPATH required.
What’s Next?
I’ve also posted “A Tour of Versioned Go,”
showing what it’s like to use vgo.
See that post for how to download and experiment with vgo today.
I’ll post more throughout the week to add details that I skipped in this post.
I encourage feedback in the comments on this post and the others,
and I’ll try to watch the Go subreddit and the golang-nuts mailing list too.
On Friday I will post a FAQ as the final blog post in the series
(at least for now).
Next week I will submit a formal Go proposal.
Please try vgo.
Start tagging versions in your repositories.
Create and check in go.mod files.
Note that if run in a repository that has an empty go.mod but that does have an existing
dep, glide, glock, godep, godeps, govend, govendor, or gvt configuration file,
vgo will use that to fill in the go.mod file.
I’m excited for Go to take the long-overdue step of adding versions to its working vocabulary.
Some of the most common problems that developers run into when using Go are
the lack of reproducible builds,
go get ignoring release tags entirely,
the inability of GOPATH to comprehend multiple versions of a package,
and wanting or needing to work in source directories outside GOPATH.
The design proposed here eliminates all these problems, and more.
Even so, I’m sure there are details that are wrong.
I hope our users will help us get this design right
by trying the new vgo prototype and engaging in productive discussions.
I would like Go 1.11 to ship with preliminary support for Go modules,
as a kind of technology preview,
and then I’d like Go 1.12 to ship with official support.
In some later release, we’ll remove support for the old, unversioned go get.
That’s an aggressive schedule, though,
and if getting the functionality right
means waiting for later releases, we will.
I care very much about the transition
from old go get and the myriad vendoring tools to the new module system.
That process is just as important
to me as getting the functionality right.
If a successful transition means waiting for later releases, we will.
Thanks to Peter Bourgon, Jess Frazelle, Andrew Gerrand, Ed Mueller, and Sam Boyer
for their work on the package management committee
and for many helpful discussions over the past year.
Thanks also to Dave Cheney, Gustavo Niemeyer, Keith Rarick, and Daniel Theophanes
for key contributions to the story of Go and package versioning.
Thanks again to Sam Boyer for creating dep, and to him and the dep contributors
for all their work on it.
Thanks to everyone who has created or worked on the many earlier vendoring tools as well.
Finally, thanks to everyone who will help us move this proposal forward,
find and fix what’s wrong, and add package versioning to Go as smoothly
as possible.