research!rsc
tag:research.swtch.com,2012:research.swtch.com
2019-12-03T14:01:00-05:00
Russ Cox
https://swtch.com/~rsc
rsc@swtch.com
The Principles of Versioning in Go
tag:research.swtch.com,2012:research.swtch.com/vgo-principles
2019-12-03T14:00:00-05:00
2019-12-03T14:02:00-05:00
The rationale behind the Go modules design. (Go & Versioning, Part 11)
<p>
This blog post is about how we added package versioning to Go, in the form of Go modules,
and the reasons we made the choices we did.
It is adapted and updated
from a <a href="https://www.youtube.com/watch?v=F8nrpe0XWRg">talk I gave at GopherCon Singapore in 2018</a>.
<a class=anchor href="#why"><h2 id="why">Why Versions?</h2></a>
<p>
To start, let’s make sure we’re all on the same page, by taking a look at the ways
the GOPATH-based <code>go</code> <code>get</code> breaks.
<p>
Suppose we have a fresh Go installation and we want to write a program that imports D.
We run <code>go</code> <code>get</code> <code>D</code>.
Remember that we are using the original GOPATH-based <code>go</code> <code>get</code>,
not Go modules.
<pre>$ go get D
</pre>
<p>
<img name="vgo-why-1" class="center pad" width=200 height=39 src="vgo-why-1.png" srcset="vgo-why-1.png 1x, vgo-why-1@1.5x.png 1.5x, vgo-why-1@2x.png 2x, vgo-why-1@3x.png 3x, vgo-why-1@4x.png 4x">
<p>
That looks up and downloads the latest version of D, which right now is D 1.0.
It builds.
We’re happy.
<p>
Now suppose a few months later we need C.
We run <code>go</code> <code>get</code> <code>C</code>.
That looks up and downloads the latest version of C, which is C 1.8.
<pre>$ go get C
</pre>
<p>
<img name="vgo-why-2" class="center pad" width=201 height=96 src="vgo-why-2.png" srcset="vgo-why-2.png 1x, vgo-why-2@1.5x.png 1.5x, vgo-why-2@2x.png 2x, vgo-why-2@3x.png 3x, vgo-why-2@4x.png 4x">
<p>
C imports D, but <code>go</code> <code>get</code> finds that it has already downloaded a copy of D,
so it reuses that copy.
Unfortunately, that copy is still D 1.0.
The latest copy of C was written using D 1.4,
which contains a feature or maybe a bug fix that C needs
and which was missing from D 1.0.
So C is broken, because the dependency D is too old.
<p>
Since the build failed, we try again, with <code>go</code> <code>get</code> <code>-u</code> <code>C</code>.
<pre>$ go get -u C
</pre>
<p>
<img name="vgo-why-3" class="center pad" width=201 height=104 src="vgo-why-3.png" srcset="vgo-why-3.png 1x, vgo-why-3@1.5x.png 1.5x, vgo-why-3@2x.png 2x, vgo-why-3@3x.png 3x, vgo-why-3@4x.png 4x">
<p>
Unfortunately, an hour ago D’s author published D 1.6.
Because <code>go</code> <code>get</code> <code>-u</code> uses the latest version of every dependency,
including D,
it turns out that C is still broken.
C’s author used D 1.4, which worked fine,
but D 1.6 has introduced a bug that keeps C from working properly.
Before, C was broken because D was too old.
Now, C is broken because D is too new.
<p>
Those are the two ways that <code>go</code> <code>get</code> fails when using GOPATH.
Sometimes it uses dependencies that are too old.
Other times it uses dependencies that are too new.
What we really want in this case is the version of D
that C’s author used and tested against.
But GOPATH-based <code>go</code> <code>get</code> can’t do that,
because it has no awareness of package versions at all.
<p>
Go programmers started asking for better handling
of package versions as soon as we published <code>goinstall</code>,
the original name for <code>go</code> <code>get</code>.
Various tools were written over many years,
separate from the Go distribution,
to help make installing specific versions easier.
But because those tools did not agree on a single approach,
they didn’t work as a base for creating other version-aware tools,
such as a version-aware godoc
or a version-aware vulnerability checker.
<p>
We needed to add the concept of package versions to Go
for many reasons.
The most pressing reason was to
help <code>go</code> <code>get</code> stop using code that’s too old or too new,
but having an agreed-upon meaning of versions in the
vocabulary of Go developers and tools
enables the entire Go ecosystem to become version-aware.
The <a href="https://blog.golang.org/module-mirror-launch">Go module mirror and checksum database</a>,
which safely speed up Go package downloads,
and the new <a href="https://blog.golang.org/go.dev#Explore">version-aware Go package discovery site</a>
are both made possible by an ecosystem-wide understanding
of what a version is.
<a class=anchor href="#eng"><h2 id="eng">Versions for Software Engineering</h2></a>
<p>
Over the past two years, we have added support for package versions to Go itself,
in the form of Go modules, built into the <code>go</code> command.
Go modules introduce a new import path syntax called
semantic import versioning,
along with a new algorithm for selecting which versions to use,
called minimal version selection.
<p>
You might wonder: Why not do what other languages do?
Java has Maven, Node has NPM, Ruby has Bundler, Rust has Cargo.
How is this not a solved problem?
<p>
You might also wonder: We introduced a new, experimental Go tool called Dep in early 2018
that implemented the general approach pioneered by Bundler and Cargo.
Why did Go modules not reuse Dep’s design?
<p>
The answer is that we learned from Dep that the general Bundler/Cargo/Dep
approach includes some decisions that make software engineering
more complex and more challenging.
Thanks to what we learned about the problems in Dep’s design,
the Go modules design made different decisions,
to make software engineering simpler and easier instead.
<p>
But what is software engineering?
How is software engineering different from programming?
I like <a href="https://research.swtch.com/vgo-eng">the following definition</a>:<blockquote>
<p>
<i>Software engineering is what happens to programming
<br>when you add time and other programmers.</i></blockquote>
<p>
Programming means getting a program working.
You have a problem to solve, you write some Go code,
you run it, you get your answer, you’re done.
That’s programming, and that’s difficult enough by itself.
<p>
But what if that code has to keep working, day after day?
What if five other programmers need to work on the code too?
What if the code must adapt gracefully as requirements change?
Then you start to think about version control systems,
to track how the code changes over time
and to coordinate with the other programmers.
You add unit tests,
to make sure bugs you fix are not reintroduced over time,
not by you six months from now,
and not by that new team member who’s unfamiliar with the code.
You think about modularity and design patterns,
to divide the program into parts that team members
can work on mostly independently.
You use tools to help you find bugs earlier.
You look for ways to make programs as clear as possible,
so that bugs are less likely.
You make sure that small changes can be tested quickly,
even in large programs.
You’re doing all of this because your programming
has turned into software engineering.
<p>
(This definition and explanation of software engineering
is my riff on an original theme by my Google colleague Titus Winters,
whose preferred phrasing is “software engineering is programming integrated over time.”
It’s worth seven minutes of your time to see
<a href="https://www.youtube.com/watch?v=tISy7EJQPzI&t=8m17s">his presentation of this idea at CppCon 2017</a>,
from 8:17 to 15:00 in the video.)
<p>
Nearly all of Go’s distinctive design decisions
were motivated by concerns about software engineering.
For example, most people think that we format Go code with <code>gofmt</code>
to make code look nicer or to end debates among
team members about program layout.
And to some degree we do.
But the more important reason for <code>gofmt</code>
is that if an algorithm defines how Go source code is formatted,
then programs, like <code>goimports</code> or <code>gorename</code> or <code>go</code> <code>fix</code>,
can edit the source code more easily.
This helps you maintain code over time.
<p>
As another example, Go import paths are URLs.
If code imported <code>"uuid"</code>,
you’d have to ask which <code>uuid</code> package.
Searching for <code>uuid</code> on <a href="https://pkg.go.dev/"><i>pkg.go.dev</i></a>
turns up dozens of packages with that name.
If instead the code imports <code>"github.com/google/uuid"</code>,
now it’s clear which package we mean.
Using URLs avoids ambiguity
and also reuses an existing mechanism for giving out names,
making it simpler and easier to coordinate with other programmers.
Continuing the example,
Go import paths are written in Go source files,
not in a separate build configuration file.
This makes Go source files self-contained,
which makes it easier to understand, modify, and copy them.
These decisions were all made toward the goal of
simplifying software engineering.
<a class=anchor href="#principles"><h2 id="principles">Principles</h2></a>
<p>
There are three broad principles behind the changes from Dep’s design to Go modules,
all motivated by wanting to simplify software engineering.
These are the principles of compatibility, repeatability, and cooperation.
The rest of this post explains each principle,
shows how it led us to make a different decision for Go modules than in Dep,
and then responds, as fairly as I can, to objections against making that change.
<a class=anchor href="#compatibility"><h2 id="compatibility">Principle #1: Compatibility</h2></a>
<blockquote>
<p>
<i>The meaning of a name in a program should not change over time.</i></blockquote>
<p>
The first principle is compatibility.
Compatibility—or, if you prefer, stability—is the idea that, in a program,
the meaning of a name should not change over time.
If a name meant one thing last year,
it should mean the same thing this year and next year.
<p>
For example, programmers are sometimes confused
by a detail of <code>strings.Split</code>.
We all expect that splitting “<code>hello</code> <code>world</code>”
produces two strings “<code>hello</code>” and “<code>world</code>.”
But if the input has leading, trailing, or repeated spaces,
the result contains empty strings too.
<pre>Example: strings.Split(x, " ")
"hello world" => {"hello", "world"}
"hello  world" => {"hello", "", "world"}
" hello world" => {"", "hello", "world"}
"hello world " => {"hello", "world", ""}
</pre>
<p>
Suppose we decide that it would be better overall
to change the behavior of <code>strings.Split</code> to omit
those empty strings.
Can we do that?
<p>
No.
<p>
We’ve given <code>strings.Split</code> a specific meaning.
The documentation and the implementation agree on that meaning.
Programs depend on that meaning.
Changing the meaning would break those programs.
It would break the principle of compatibility.
<p>
We <i>can</i> implement the new meaning; we just need to give a new name too.
In fact, years ago, to solve this exact problem,
we introduced <code>strings.Fields</code>, which is tailored to space-separated fields
and never returns empty strings.
<pre>Example: strings.Fields(x)
"hello world" => {"hello", "world"}
"hello  world" => {"hello", "world"}
" hello world" => {"hello", "world"}
"hello world " => {"hello", "world"}
</pre>
<p>
We didn’t redefine <code>strings.Split</code>, because we were concerned about compatibility.
<p>
Following the principle of compatibility simplifies software engineering,
because it lets you ignore time when trying to understand programs.
People don’t have to think, “well this package was written in 2015,
back when <code>strings.Split</code> returned empty strings, but this other package
was written last week, so it expects <code>strings.Split</code> to leave them out.”
And not just people. Tools don’t have to worry about time either.
For example, a refactoring tool can always move a <code>strings.Split</code> call
from one package to another
without worrying that it will change its meaning.
<p>
In fact, the most important feature of Go 1
was not a language change or a new library feature.
It was the declaration of compatibility:<blockquote>
<p>
It is intended that programs written to the Go 1 specification
will continue to compile and run correctly, unchanged,
over the lifetime of that specification.
Go programs that work today should continue to work
even as future “point” releases of Go 1 arise (Go 1.1, Go 1.2, etc.).
<p>
— <a href="https://golang.org/doc/go1compat"><i>golang.org/doc/go1compat</i></a></blockquote>
<p>
We committed that we would stop changing the meaning of names
in the standard library,
so that programs working with Go 1.1 could be
expected to continue working in Go 1.2, and so on.
That ongoing commitment makes it easy for users to write code and keep it working
even as they upgrade to newer Go versions to get
faster implementations and new features.
<p>
What does compatibility have to do with versioning?
It’s important to think about compatibility
because the most popular approach to versioning today—<a href="https://semver.org">semantic versioning</a>—instead encourages <i>incompatibility</i>.
That is, semantic versioning has the unfortunate effect of making incompatible changes seem easy.
<p>
Every semantic version takes the form vMAJOR.MINOR.PATCH.
If two versions have the same major number,
the later (if you like, greater) version is expected to be backwards compatible
with the earlier (lesser) one.
But if two versions have different major numbers,
they have no expected compatibility relationship.
<p>
Semantic versioning seems to suggest, “It’s okay to make incompatible
changes to your packages.
Tell your users about them by incrementing the major version number.
Everything will be fine.”
But this is an empty promise.
Incrementing the major version number isn’t enough.
Everything is not fine.
If <code>strings.Split</code> has one meaning today and a different meaning tomorrow,
simply reading your code is now software engineering,
not programming, because you need to think about time.
<p>
It gets worse.
<p>
Suppose B is written to expect <code>strings.Split</code> v1,
while C is written to expect <code>strings.Split</code> v2.
That’s fine if you build each by itself.
<p>
<img name="vgo-why-4" class="center pad" width=312 height=60 src="vgo-why-4.png" srcset="vgo-why-4.png 1x, vgo-why-4@1.5x.png 1.5x, vgo-why-4@2x.png 2x, vgo-why-4@3x.png 3x, vgo-why-4@4x.png 4x">
<p>
But what happens when your package A imports both B and C?
If <code>strings.Split</code> has to have just one meaning,
there’s no way to build a working program.
<p>
<img name="vgo-why-5" class="center pad" width=282 height=94 src="vgo-why-5.png" srcset="vgo-why-5.png 1x, vgo-why-5@1.5x.png 1.5x, vgo-why-5@2x.png 2x, vgo-why-5@3x.png 3x, vgo-why-5@4x.png 4x">
<p>
For the Go modules design, we realized that the principle of
compatibility is absolutely essential to simplifying software engineering
and must be supported, encouraged, and followed.
The Go FAQ has encouraged compatibility since Go 1.2 in November 2013:<blockquote>
<p>
Packages intended for public use should try to maintain backwards compatibility as they evolve.
The <a href="https://golang.org/doc/go1compat.html">Go 1 compatibility guidelines</a> are a good reference here:
don’t remove exported names,
encourage tagged composite literals, and so on.
If different functionality is required, add a new name instead of changing an old one.
If a complete break is required, create a new package with a new import path.</blockquote>
<p>
For Go modules, we gave this old advice a new name, the <i>import compatibility rule</i>:<blockquote>
<p>
<i>If an old package and a new package have the same import path,<br>
the new package must be backwards compatible with the old package.</i></blockquote>
<p>
But then what do we do about semantic versioning?
If we still want to use semantic versioning,
as many users expect,
then the import compatibility rule requires
that different semantic major versions,
which by definition have no compatibility relationship,
must use different import paths.
The way to do that in Go modules is to put the major version in the import path.
We call this <i>semantic import versioning</i>.
<p>
<img name="impver" class="center pad" width=458 height=223 src="impver.png" srcset="impver.png 1x, impver@1.5x.png 1.5x, impver@2x.png 2x, impver@3x.png 3x, impver@4x.png 4x">
<p>
In this example, <code>my/thing/v2</code> identifies semantic version 2 of a particular module.
Version 1 was just <code>my/thing</code>, with no explicit version in the module path.
But when you introduce major version 2 or larger, you have to add the version after the module name,
to distinguish from version 1 and other major versions,
so version 2 is <code>my/thing/v2</code>, version 3 is <code>my/thing/v3</code>, and so on.
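<p>
In source form, the version suffix lives in the module’s <code>go.mod</code> file and in client import paths. A minimal sketch, reusing the hypothetical <code>my/thing</code> module from the figure:

```
// go.mod for major version 2 of the module:
module my/thing/v2

// A client of version 2 writes:
//     import "my/thing/v2"
// while a client of version 1 keeps writing:
//     import "my/thing"
```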
<p>
If the <code>strings</code> package were its own module,
and if for some reason we really needed to redefine <code>Split</code> instead of adding a new function <code>Fields</code>,
then we could create <code>strings</code> (major version 1) and <code>strings/v2</code> (major version 2),
with different <code>Split</code> functions.
Then the unbuildable program from before can be built:
B says <code>import</code> <code>"strings"</code>
while C says <code>import</code> <code>"strings/v2"</code>.
Those are different packages,
so it’s okay to build both into the program.
And now B and C can each have the <code>Split</code> function they expect.
<p>
<img name="vgo-why-6" class="center pad" width=299 height=94 src="vgo-why-6.png" srcset="vgo-why-6.png 1x, vgo-why-6@1.5x.png 1.5x, vgo-why-6@2x.png 2x, vgo-why-6@3x.png 3x, vgo-why-6@4x.png 4x">
<p>
Because <code>strings</code> and <code>strings/v2</code> have different
import paths, people and tools automatically
understand that they name different packages,
just as people already understand that
<code>crypto/rand</code> and <code>math/rand</code> name different packages.
No one needs to learn a new disambiguation rule.
<p>
Let’s return to the unbuildable program, not using semantic import versioning.
If we replace <code>strings</code> in this example with an arbitrary package D,
then we have a classic “diamond dependency problem.”
Both B and C build fine by themselves,
but with different, conflicting requirements for D.
If we try to use both in a build of A,
then there’s no single choice of D that works.
<p>
<img name="vgo-why-7" class="center pad" width=282 height=94 src="vgo-why-7.png" srcset="vgo-why-7.png 1x, vgo-why-7@1.5x.png 1.5x, vgo-why-7@2x.png 2x, vgo-why-7@3x.png 3x, vgo-why-7@4x.png 4x">
<p>
Semantic import versioning cuts through diamond dependencies.
There’s no such thing as conflicting requirements for D.
D version 1.3 must be backwards compatible with D version 1.2,
and D version 2.0 has a different import path, D/v2.
<p>
<img name="vgo-why-8" class="center pad" width=289 height=94 src="vgo-why-8.png" srcset="vgo-why-8.png 1x, vgo-why-8@1.5x.png 1.5x, vgo-why-8@2x.png 2x, vgo-why-8@3x.png 3x, vgo-why-8@4x.png 4x">
<p>
A program using both major versions keeps them as separate as any
two packages with different import paths and builds fine.
<a class=anchor href="#aesthetics"><h2 id="aesthetics">Objection: Aesthetics</h2></a>
<p>
The most common objection to semantic import versioning
is that people don’t like seeing the major versions in the import paths.
In short, they’re ugly.
Of course, what this really means is only that people are not used
to seeing the major version in import paths.
<p>
I can think of two examples of major aesthetic shifts in Go code
that seemed ugly at the time but were adopted because they simplified software engineering
and now look completely natural.
<p>
The first example is export syntax.
Back in early 2009, Go used an <code>export</code> keyword to mark a function as exported.
We knew we needed something more lightweight to mark individual struct fields,
and we were casting about for ideas,
considering things like “leading underscore means unexported” or “leading plus in declaration means export.”
Eventually we hit on the “upper-case for export” idea.
Using an upper-case letter as the export signal looked strange to us,
but that was the only drawback we could find.
Otherwise, the idea was sound, it satisfied our goals,
and it was more appealing than the other choices we’d been considering.
So we adopted it.
I remember thinking that changing <code>fmt.printf</code> to <code>fmt.Printf</code> in my code was ugly, or at least jarring:
to me, <code>fmt.Printf</code> didn’t look like Go, at least not the Go I had been writing.
But I had no good argument against it, so I went along with (and implemented) the change.
After a few weeks, I got used to it,
and now it is <code>fmt.printf</code> that doesn’t look like Go to me.
What’s more, I came to appreciate the precision about
what is and isn’t exported when reading code.
When I go back to C++ or Java code now and I see a call like
<code>x.dangerous()</code> I miss being able to tell at a glance whether the
<code>dangerous</code> method is a public method that anyone can call.
<p>
The second example is import paths, which I mentioned briefly earlier.
In the early days of Go, before <code>goinstall</code> and <code>go</code> <code>get</code>,
import paths were not full URLs.
A developer had to manually download and install a package named <code>uuid</code>
and then would write <code>import</code> <code>"uuid"</code>.
Changing to URLs for import paths
(<code>import</code> <code>"github.com/google/uuid"</code>)
eliminated this ambiguity, and the added precision made <code>go</code> <code>get</code> possible.
People did complain at first,
but now the longer paths are second nature to us.
We rely on and appreciate their precision,
because it makes our software engineering work simpler.
<p>
Both these changes—upper-case for export
and full URLs for import paths—were motivated by
good software engineering arguments to which the only real
objection was visual aesthetics.
Over time we came to appreciate the benefits,
and our aesthetic judgements adapted.
I expect the same to happen with major versions in import paths.
We’ll get used to them, and we’ll come to value the
precision and simplicity they bring.
<a class=anchor href="#update"><h2 id="update">Objection: Updating Import Paths</h2></a>
<p>
Another common objection
is that upgrading from (say)
v2 of a module
to v3 of the same module
requires changing all the import paths referring to that module,
even if the client code doesn’t need any other changes.
<p>
It’s true that the upgrade requires rewriting import paths,
but it’s also easy to write a tool to do
a global search and replace.
We intend to make it possible to handle such upgrades with <code>go</code> <code>fix</code>,
although we haven’t implemented that yet.
<p>
Both the previous objection and this one
implicitly suggest keeping the
major version information only in a separate version metadata file.
If we do that, then
an import path won’t be precise enough to identify semantics,
like back when <code>import</code> <code>"uuid"</code> might have meant any one of dozens of different packages.
All programmers and tools will have to
look in the metadata file to answer the question: which major version is this?
Which <code>strings.Split</code> am I calling?
What happens when I copy a file from one module to another
and forget to check the metadata file?
If instead we keep import paths semantically precise,
then programmers and tools don’t need to be taught
a new way to keep different major versions of a package separate.
<p>
Another benefit of having the major version in the import path
is that when you do update from v2 to v3 of a package,
you can <a href="https://talks.golang.org/2016/refactor.article">update your program gradually</a>,
in stages, maybe one package at a time,
and it’s always clear which code has been converted and which has not.
<a class=anchor href="#multiple"><h2 id="multiple">Objection: Multiple Major Versions in a Build</h2></a>
<p>
Another common objection
is that having D v1 and D v2 in the same build
should be disallowed entirely.
That way, D’s author won’t have
to think about the complexities that arise from that situation.
For example, maybe package D defines a command line flag
or registers an HTTP handler,
so that building both D v1 and D v2
into a single program would fail without explicit coordination
between those versions.
<p>
Dep enforces exactly this restriction,
and some people say it is simpler.
But this is simplicity only for D’s author.
It’s not simplicity for D’s users,
and normally users outnumber authors.
If D v1 and D v2 cannot coexist in a single build,
then diamond dependencies are back.
You can’t convert a large program from D v1 to D v2 gradually,
the way I just explained.
In internet-scale projects,
this will fragment the Go package ecosystem into incompatible
groups of packages: those that use D v1 and those that use D v2.
For a detailed example, see my 2018 blog post, “<a href="https://research.swtch.com/vgo-import">Semantic Import Versioning</a>.”
<p>
Dep was forced to disallow multiple major versions in a build
because the Go build system requires each import path to
name a unique package (and Dep did not consider semantic import versioning).
In contrast, Cargo and other systems do allow multiple major versions in a build.
As I understand it, the reason these systems allow multiple versions
is the same reason that Go modules does:
not allowing them makes it too hard to work on large programs.
<a class=anchor href="#exp"><h2 id="exp">Objection: Too Hard to Experiment</h2></a>
<p>
A final objection is that versions in import paths are
unnecessary overhead when you’re
just starting to design a package,
you have no users,
and you’re making frequent backwards-incompatible changes.
That’s absolutely true.
<p>
Semantic versioning makes an exception for exactly that situation.
In major version 0, there are no compatibility expectations at all,
so that you can iterate quickly when you’re first starting out
and not worry about compatibility.
For example, v0.3.4 doesn’t need to be backwards compatible with
anything else: not v0.3.3, not v0.0.1, not v1.0.0.
<p>
Semantic import versioning makes a similar exception:
major version 0 is not mentioned in import paths.
<p>
In both cases, the rationale is that time has not entered the picture.
You’re not doing software engineering yet.
You’re just programming.
Of course, this means that if you use v0 versions of other people’s packages,
then you are accepting that new versions of those packages might include
breaking API changes without a corresponding import path change,
and you take on the responsibility
to update your code when that happens.
<a class=anchor href="#repeatability"><h2 id="repeatability">Principle #2: Repeatability</h2></a>
<blockquote>
<p>
<i>The result of a build of a given version of a package should not change over time.</i></blockquote>
<p>
The second principle is repeatability for program builds.
By repeatability I mean
that when you are building a specific version of a package,
the build should decide which dependency versions
to use in a way that’s repeatable,
that doesn’t change over time.
My build today
should match your build of my code tomorrow
and any other programmer’s build next year.
Most package management systems don’t make that guarantee.
<p>
We saw earlier how GOPATH-based <code>go</code> <code>get</code> doesn’t provide repeatability.
First <code>go</code> <code>get</code> used a version of D that was too old:
<p>
<img name="vgo-why-2" class="center pad" width=201 height=96 src="vgo-why-2.png" srcset="vgo-why-2.png 1x, vgo-why-2@1.5x.png 1.5x, vgo-why-2@2x.png 2x, vgo-why-2@3x.png 3x, vgo-why-2@4x.png 4x">
<p>
Then <code>go</code> <code>get</code> <code>-u</code> used a version of D that was too new:
<p>
<img name="vgo-why-3" class="center pad" width=201 height=104 src="vgo-why-3.png" srcset="vgo-why-3.png 1x, vgo-why-3@1.5x.png 1.5x, vgo-why-3@2x.png 2x, vgo-why-3@3x.png 3x, vgo-why-3@4x.png 4x">
<p>
You might think, “of course <code>go</code> <code>get</code> makes this mistake:
it doesn’t know anything about versions at all.”
But most other systems make the same mistake.
I’m going to use Dep as my example here,
but at least Bundler and Cargo work the same way.
<p>
Dep asks every package to include a metadata file called a manifest,
which lists requirements for dependency versions.
When Dep downloads C,
it reads C’s manifest and learns that C needs D 1.4 or later.
Then Dep downloads the newest version of D satisfying that constraint.
Yesterday, that meant D 1.5:
<p>
<img name="vgo-why-9" class="center pad" width=201 height=60 src="vgo-why-9.png" srcset="vgo-why-9.png 1x, vgo-why-9@1.5x.png 1.5x, vgo-why-9@2x.png 2x, vgo-why-9@3x.png 3x, vgo-why-9@4x.png 4x">
<p>
Today, that means D 1.6:
<p>
<img name="vgo-why-10" class="center pad" width=201 height=60 src="vgo-why-10.png" srcset="vgo-why-10.png 1x, vgo-why-10@1.5x.png 1.5x, vgo-why-10@2x.png 2x, vgo-why-10@3x.png 3x, vgo-why-10@4x.png 4x">
<p>
The decision is time-dependent. It changes from day to day.
The build is not repeatable.
<p>
The developers of Dep (and Bundler and Cargo and ...)
understood the importance of repeatability,
so they introduced a second metadata file called a lock file.
If C is a whole program, what Go calls <code>package</code> <code>main</code>,
then the lock file lists the exact version to use
for every dependency of C,
and Dep lets the lock file
override the decisions it would normally make.
Locking in those decisions
ensures that they stop changing over time
and makes the build repeatable.
<p>
But lock files only apply to whole programs, to <code>package</code> <code>main</code>.
What if C is a library, being built as part of a larger program?
Then a lock file meant for building only C
might not satisfy the additional constraints in the larger program.
So Dep and the others must ignore lock files
associated with libraries
and fall back to the default time-based decisions.
When you add C 1.8 to a larger build,
the exact packages you get depends on
what day it is.
<p>
In summary, Dep starts with a time-based decision
about which version of D to use.
Then it adds a lock file,
to override that time-based decision, for repeatability,
but that lock file can only be applied to whole programs.
<p>
In Go modules, the <code>go</code> command instead makes its decision
about which version of D to use in a way that does not change over time.
Then builds are repeatable all the time, without the added complexity of a lock file override,
and this repeatability applies to libraries, not just whole programs.
<p>
The algorithm used for Go modules is very simple,
despite the imposing name “minimal version selection.”
It works like this.
Each package specifies a minimum version of each dependency.
For example, suppose B 1.3 requests D 1.3 or later,
and C 1.8 requests D 1.4 or later.
In Go modules, the <code>go</code> command prefers to use those exact versions, not the latest versions.
If we’re building B by itself, we’ll use D 1.3.
If we’re building C by itself, we’ll use D 1.4.
The builds of these libraries are repeatable.
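<p>
These minimum requests are what <code>require</code> directives in <code>go.mod</code> files record. A sketch using the single-letter module paths from the example (real module paths are longer, and each <code>go.mod</code> lives in its own module; the two are shown together only for brevity):

```
// go.mod for B v1.3:
module B

require D v1.3.0

// go.mod for C v1.8:
module C

require D v1.4.0
```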
<p>
<img name="vgo-why-12" class="center pad" width=270 height=178 src="vgo-why-12.png" srcset="vgo-why-12.png 1x, vgo-why-12@1.5x.png 1.5x, vgo-why-12@2x.png 2x, vgo-why-12@3x.png 3x, vgo-why-12@4x.png 4x">
<p>
Also shown in the figure,
if different parts of a build request different
minimum versions, the <code>go</code> command uses the latest requested version.
The build of A sees requests for D 1.3 and D 1.4,
and 1.4 is later than 1.3,
so the build chooses D 1.4.
That decision does not depend on whether D 1.5 and D 1.6 exist,
so it does not change over time.
<p>
I call this minimal version selection for two reasons.
First, for each package it selects the minimum version satisfying the requests (equivalently, the maximum of the requests).
And second, it seems to be just about the simplest approach that could possibly work.
<p>
Minimal version selection provides repeatability,
for whole programs and for libraries,
always, without any lock files.
It removes time from consideration.
Every chosen version is always one of the versions
mentioned explicitly by some package already chosen
for the build.
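<p>
The rule can be sketched in a few lines of Go. This is a simplified model, not the <code>go</code> command’s implementation: modules are written as hypothetical <code>name@version</code> strings, the requirement graph is a plain map, and versions are compared numerically.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// later reports whether dot-separated numeric version a
// (for example "1.10") is later than version b.
func later(a, b string) bool {
	as, bs := strings.Split(a, "."), strings.Split(b, ".")
	for i := 0; i < len(as) || i < len(bs); i++ {
		an, bn := 0, 0
		if i < len(as) {
			an, _ = strconv.Atoi(as[i])
		}
		if i < len(bs) {
			bn, _ = strconv.Atoi(bs[i])
		}
		if an != bn {
			return an > bn
		}
	}
	return false
}

// selectVersions sketches minimal version selection: walk the
// requirement graph from the target module and, for each dependency,
// keep the maximum of the requested minimum versions.
// require maps "module@version" to that version's requirement list.
func selectVersions(target string, require map[string][]string) map[string]string {
	chosen := map[string]string{}
	var visit func(mv string)
	visit = func(mv string) {
		i := strings.Index(mv, "@")
		mod, ver := mv[:i], mv[i+1:]
		if cur, ok := chosen[mod]; ok && !later(ver, cur) {
			return // an equal or later version is already selected
		}
		chosen[mod] = ver
		for _, dep := range require[mv] {
			visit(dep)
		}
	}
	visit(target)
	return chosen
}

func main() {
	// The example from the figure: B 1.3 requests D 1.3 or later,
	// C 1.8 requests D 1.4 or later, and A requires both.
	require := map[string][]string{
		"A@1":   {"B@1.3", "C@1.8"},
		"B@1.3": {"D@1.3"},
		"C@1.8": {"D@1.4"},
	}
	fmt.Println(selectVersions("B@1.3", require)["D"]) // 1.3
	fmt.Println(selectVersions("C@1.8", require)["D"]) // 1.4
	fmt.Println(selectVersions("A@1", require)["D"])   // 1.4: max of 1.3 and 1.4
}
```

<p>
The answer depends only on the listed requirements, never on what the latest available versions happen to be, which is where the repeatability comes from.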
<a class=anchor href="#latest-feature"><h2 id="latest-feature">Objection: Using the Latest Version is a Feature</h2></a>
<p>
The usual first objection to prioritizing repeatability
is to claim that preferring the latest version
of a dependency is a feature, not a bug.
The claim is that programmers either don’t want to
or are too lazy to update their dependencies regularly,
so tools like Dep should use the latest dependencies
automatically.
The argument is that the benefits of having the
latest versions outweigh the loss of repeatability.
<p>
But this argument doesn’t hold up to scrutiny.
Tools like Dep provide lock files,
which then require programmers to update dependencies themselves,
exactly because repeatable builds are more important than using the latest version.
When you deploy a 1-line bug fix,
you want to be sure that your bug fix is the only change,
that you’re not also picking up
different, newer versions of your dependencies.
<p>
You want to delay upgrades until you ask for them,
so that you can be ready to run all your unit tests,
all your integration tests, and maybe even
production canaries, before you start using
those upgraded dependencies in production.
Everyone agrees about this.
Lock files exist because everyone agrees about this:
repeatability is more important than automatic upgrades.
<a class=anchor href="#latest-library"><h2 id="latest-library">Objection: Using the Latest Version is a Feature When Building a Library</h2></a>
<p>
The more nuanced argument you could
make against minimal version selection
would be to admit that repeatability matters for whole program builds,
but then argue that, for libraries, the balance is different, and
having the latest dependencies is more important than
a repeatable build.
<p>
I disagree.
As programming increasingly means connecting
large libraries together,
and those large libraries are increasingly organized
as collections of smaller libraries,
all the reasons to prefer repeatability of whole-program builds
become just as important for library builds.
<p>
The extreme limit of this trend is the recent move
in cloud computing to “serverless” hosting,
like Amazon Lambda, Google Cloud Functions,
or Microsoft Azure Functions.
The code we upload to those systems is
a library, not a whole program.
We certainly want the production builds on
those servers to use the same versions of
dependencies as on our development machines.
<p>
Of course, no matter what, it’s important to make it easy for
programmers to update their dependencies regularly.
We also need tools to report which versions
of a package are in a given build or a given binary,
including reporting when updates are available
and when there are known security problems in the
versions being used.
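<p>
As of Go 1.13, the <code>go</code> command provides several of these reports directly. A few of the relevant commands, run inside a module (<code>./mybin</code> is a placeholder for any module-built binary):

```
$ go list -m all        # list every module version in the current build
$ go list -m -u all     # also show available upgrades for each module
$ go get -u ./...       # explicitly upgrade dependencies to their latest versions
$ go version -m ./mybin # report the module versions built into a binary
```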
<a class=anchor href="#cooperation"><h2 id="cooperation">Principle #3: Cooperation</h2></a>
<blockquote>
<p>
<i>To maintain the Go package ecosystem, we must all work together.</i> <br>
<i>Tools cannot work around a lack of cooperation.</i></blockquote>
<p>
The third principle is cooperation.
We often talk about “the Go community”
and “the Go open source ecosystem.”
The words community and ecosystem emphasize that
all our work is interconnected,
that we’re building on—depending on—each other’s contributions.
The goal is one unified system that works as a coherent whole.
The opposite, what we want to avoid,
is an ecosystem that is fragmented,
split into groups of packages that can’t work with each other.
<p>
The principle of cooperation recognizes that the only way
to keep the ecosystem healthy and thriving
is for us all to work together.
If we don’t,
then no matter how technically sophisticated our tools are,
the Go open source ecosystem is guaranteed to fragment and eventually fail.
By implication, then, it’s okay if fixing incompatibilities requires cooperation.
We can’t avoid cooperation anyway.
<p>
For example, once again we have C 1.8,
which requires D 1.4 or later.
Thanks to repeatability,
a build of C 1.8 by itself will use D 1.4.
If we build C as part of a larger build that needs D 1.5, that’s okay too.
<p>
Then D 1.6 is released, and some larger build,
maybe continuous integration testing,
discovers that C 1.8 does not work with D 1.6.
<p>
<img name="vgo-why-13" class="center pad" width=306 height=125 src="vgo-why-13.png" srcset="vgo-why-13.png 1x, vgo-why-13@1.5x.png 1.5x, vgo-why-13@2x.png 2x, vgo-why-13@3x.png 3x, vgo-why-13@4x.png 4x">
<p>
No matter what, the solution is for C’s author and D’s author to cooperate
and release a fix.
The exact fix depends on what exactly went wrong.
<p>
Maybe C depends on buggy behavior fixed in D 1.6,
or maybe C depends on unspecified behavior changed in D 1.6.
Then the solution is for C’s author to release a new C version 1.9,
cooperating with the evolution of D.
<p>
<img name="vgo-why-15" class="center pad" width=297 height=130 src="vgo-why-15.png" srcset="vgo-why-15.png 1x, vgo-why-15@1.5x.png 1.5x, vgo-why-15@2x.png 2x, vgo-why-15@3x.png 3x, vgo-why-15@4x.png 4x">
<p>
Or maybe D 1.6 simply has a bug.
Then the solution is for D’s author to release a fixed D 1.7,
cooperating by respecting the principle of compatibility,
at which point C’s author can release C version 1.9 that
specifies that it requires D 1.7.
<p>
<img name="vgo-why-14" class="center pad" width=297 height=130 src="vgo-why-14.png" srcset="vgo-why-14.png 1x, vgo-why-14@1.5x.png 1.5x, vgo-why-14@2x.png 2x, vgo-why-14@3x.png 3x, vgo-why-14@4x.png 4x">
<p>
Take a minute to look at what just happened.
The latest C and the latest D didn’t work together.
That introduced a small fracture in the Go package ecosystem.
C’s author or D’s author worked to fix the bug,
cooperating with each other
and the rest of the ecosystem
to repair the fracture.
This cooperation is essential to keeping the ecosystem healthy.
There is no adequate technical substitute.
<p>
The repeatable builds in Go modules mean that a buggy D 1.6
won’t be picked up until users explicitly ask to upgrade.
That creates time for C’s author and D’s author to cooperate
on a real solution.
The Go modules system makes no other attempt to work around
these temporary incompatibilities.
<a class=anchor href="#sat"><h2 id="sat">Objection: Use Declared Incompatibilities and SAT Solvers</h2></a>
<p>
The most common objection to this approach of depending on
cooperation is that it is unreasonable to expect developers to cooperate.
Developers need some way to fix problems alone,
the argument goes: they can only truly depend on themselves,
not others.
The solution offered by package managers
like Bundler, Cargo, and Dep
is to allow developers to declare
incompatibilities between their packages and others
and then employ a <a href="https://research.swtch.com/version-sat">SAT solver</a>
to find a package combination not ruled out by the constraints.
<p>
This argument breaks down for a few reasons.
<p>
First, the <a href="https://research.swtch.com/vgo-mvs">algorithm used by Go modules</a>
to select versions
already gives the developer of a particular module complete control
over which versions are selected for that module,
more control in fact than SAT constraints.
The developer can force the use of any specific version of any dependency,
saying “use this exact version no matter what anyone else says.”
But that power is limited to the build of that specific module,
to avoid giving other developers the same control over your builds.
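<p>
In Go modules, that control is expressed in the main module’s <code>go.mod</code> file. A sketch, with hypothetical module paths:

```
module example.com/A

go 1.13

require (
	example.com/B v1.3.0
	example.com/C v1.8.0
)

// Pretend D v1.6.0 does not exist when selecting versions.
exclude example.com/D v1.6.0

// Or force this build to use an exact version of D,
// no matter what any dependency requests.
replace example.com/D => example.com/D v1.5.0
```

<p>
Both <code>exclude</code> and <code>replace</code> apply only when this module is the main module being built, which is exactly the limit described above.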
<p>
Second, repeatability of library builds in Go modules
means that the release of a new, incompatible version of a dependency
has no immediate effect on builds, as we saw in the previous section.
The breakage only surfaces when someone takes some step to add
that version to their own build, at which point they can step back again.
<p>
Third, if version selection is phrased as a problem for a SAT solver,
there are often many possible satisfying selections:
the SAT solver must choose between them,
and there is no clear criterion for doing so.
As we saw earlier, SAT-based package managers
choose between multiple valid possible selections
by preferring newer versions.
In the case where using the newest version of everything
satisfies the constraints, that’s the clear “most preferred” answer.
But what if the two possible selections are
“latest of B, older C” and “older B, latest of C”?
Which should be preferred?
How can the developer predict the outcome?
The resulting system is difficult to understand.
<p>
Fourth, the output of a SAT solver is only as good as its inputs:
if any incompatibilities have been omitted, the SAT solver
may well arrive at a combination that is still broken,
just not declared as such.
Incompatibility information is likely to be particularly
incomplete for combinations involving dependencies
with a significant age difference that may well never have
been put together before.
Indeed, an analysis of Rust’s Cargo ecosystem in 2018
found that Cargo’s preference for the latest version was
<a href="https://illicitonion.blogspot.com/2018/06/rust-minimum-versions-semver-is-lie.html">masking many missing constraints</a>
in Cargo manifests.
If the latest version does not work,
exploring old versions seems as likely to produce
a combination that is “not yet known to be broken”
as it is to produce a working one.
<p>
Overall, once you step off the happy path of selecting the
newest version of every dependency,
SAT solver-based package managers are not more
likely to choose a working configuration than
Go modules is.
If anything, SAT solvers may well be less likely
to find a working configuration.
<a class=anchor href="#sat-example"><h2 id="sat-example">Example: Go Modules versus SAT Solving</h2></a>
<p>
The counter-arguments given in the previous section are a bit abstract.
Let’s make them concrete by continuing the example we’ve been
working with and looking at what happens when using a SAT solver,
like in Dep.
I’m using Dep for concreteness, because
it is the immediate predecessor of Go modules,
but the behaviors here are not specific to Dep
and I don’t mean to single it out.
For the purposes of this example,
Dep works the same way as many other
package managers, and they all share the problems detailed here.
<p>
To set the stage, remember that C 1.8 works fine
with D 1.4 and D 1.5, but the
combination of C 1.8 and D 1.6 is broken.
<p>
<img name="vgo-why-13" class="center pad" width=306 height=125 src="vgo-why-13.png" srcset="vgo-why-13.png 1x, vgo-why-13@1.5x.png 1.5x, vgo-why-13@2x.png 2x, vgo-why-13@3x.png 3x, vgo-why-13@4x.png 4x">
<p>
That gets noticed, perhaps by continuous integration testing,
and the question is what happens next.
<p>
When C’s author finds out
that C 1.8 doesn’t work with D 1.6,
Dep allows and encourages issuing a new version, C 1.9.
C 1.9 documents that it needs D 1.4 or later but before 1.6.
The idea is that documenting the incompatibility
helps Dep avoid it in future builds.
<p>
<img name="vgo-why-16" class="center pad" width=323 height=130 src="vgo-why-16.png" srcset="vgo-why-16.png 1x, vgo-why-16@1.5x.png 1.5x, vgo-why-16@2x.png 2x, vgo-why-16@3x.png 3x, vgo-why-16@4x.png 4x">
<p>
In Dep, avoiding the incompatibility is important—even urgent!—because
the lack of repeatability in library builds means
that as soon as D 1.6 is released,
all future fresh builds of C will use D 1.6 and break.
This is a build emergency: all of C’s new users are broken.
If D’s author is unavailable,
or C’s author doesn’t have time to fix the actual bug,
the argument is that C’s author
must be able to take some step to protect users from the breakage.
That step is to release C 1.9,
documenting the incompatibility with D 1.6.
That fixes new builds of C by preventing the use of D 1.6.
<p>
This emergency doesn’t happen when using Go modules,
because of minimal version selection and repeatable builds.
Using Go modules, the release of D 1.6 does not affect C’s users,
because nothing is explicitly requesting D 1.6 yet.
Users keep using the older versions of D they already use.
There’s no need to document the incompatibility,
because nothing is breaking.
There’s time to cooperate on a real fix.
<p>
Looking at Dep’s approach of documenting incompatibility again,
releasing C 1.9 is not a great solution.
For one thing,
the premise was that D’s author
created a build emergency by releasing D 1.6
and then was unavailable to release a fix,
so it was important to give C’s author a way to fix things,
by releasing C 1.9.
But if D’s author might be unavailable,
what happens if C’s author is unavailable too?
Then the emergency caused by automatic upgrades
continues and all of C’s new users stay broken.
Repeatable builds in Go modules avoid the emergency entirely.
<p>
Also, suppose that the bug is in D,
and D’s author issues a fixed D 1.7.
The workaround C 1.9 requires D before 1.6,
so it won’t use the fixed D 1.7.
C’s author has to issue C 1.10
to allow use of D 1.7.
<p>
<img name="vgo-why-17" class="center pad" width=317 height=130 src="vgo-why-17.png" srcset="vgo-why-17.png 1x, vgo-why-17@1.5x.png 1.5x, vgo-why-17@2x.png 2x, vgo-why-17@3x.png 3x, vgo-why-17@4x.png 4x">
<p>
In contrast, if we’re using Go modules,
C’s author doesn’t have to issue C 1.9
and then also doesn’t have to undo it by issuing C 1.10.
<p>
In this simple example,
Go modules end up working more smoothly for users than Dep.
They avoid the build breakage automatically,
creating time for cooperation on the real fix.
Ideally, C or D gets fixed before any of C’s users even notice.
<p>
But what about more complex examples?
Maybe Dep’s approach of documenting incompatibilities
is better in more complicated situations,
or maybe it keeps things working
even when the real fix takes a long time to arrive.
<p>
Let’s take a look.
To do that, let’s rewind the clock a little,
to before the buggy D 1.6 is released,
and compare the decisions made by Dep and Go modules.
This figure shows the documented requirements for all
the relevant package versions,
along with the way both Dep and Go modules
would build the latest C and the latest A.
<p>
<img name="vgo-why-19" class="center pad" width=383 height=270 src="vgo-why-19.png" srcset="vgo-why-19.png 1x, vgo-why-19@1.5x.png 1.5x, vgo-why-19@2x.png 2x, vgo-why-19@3x.png 3x, vgo-why-19@4x.png 4x">
<p>
Dep is using D 1.5
while the Go module system is using D 1.4,
but both tools have found working builds.
Everyone is happy.
<p>
But now suppose the buggy D 1.6 is released.
<p>
<img name="vgo-why-20" class="center pad" width=383 height=270 src="vgo-why-20.png" srcset="vgo-why-20.png 1x, vgo-why-20@1.5x.png 1.5x, vgo-why-20@2x.png 2x, vgo-why-20@3x.png 3x, vgo-why-20@4x.png 4x">
<p>
Dep builds pick up D 1.6 automatically and break.
Go modules builds keep using D 1.4 and keep working.
This is the simple situation we were just looking at.
<p>
Before we move on, though, let’s fix the Dep builds.
We release C 1.9,
which documents the incompatibility with D 1.6:
<p>
<img name="vgo-why-21" class="center pad" width=383 height=270 src="vgo-why-21.png" srcset="vgo-why-21.png 1x, vgo-why-21@1.5x.png 1.5x, vgo-why-21@2x.png 2x, vgo-why-21@3x.png 3x, vgo-why-21@4x.png 4x">
<p>
Now Dep builds pick up C 1.9 automatically,
and builds start working again.
Go modules can’t document incompatibility in this way,
but Go modules builds also aren’t broken,
so no fix is needed.
<p>
Now let’s create a build complex enough to break Go modules.
We can do this in two steps.
First, we will release a new B that requires D 1.6.
Second, we will release a new A that requires the new B,
at which point A’s build will use C with D 1.6 and break.
<p>
We start by releasing the new B 1.4 that requires D 1.6.
<p>
<img name="vgo-why-22" class="center pad" width=473 height=270 src="vgo-why-22.png" srcset="vgo-why-22.png 1x, vgo-why-22@1.5x.png 1.5x, vgo-why-22@2x.png 2x, vgo-why-22@3x.png 3x, vgo-why-22@4x.png 4x">
<p>
Go modules builds are unaffected so far, thanks to repeatability.
But look! Dep builds of A pick up B 1.4 automatically and now
they are broken again.
What happened?
<p>
Dep prefers to build A using the latest B and the latest C,
but that’s not possible:
the latest B wants D 1.6 and the latest C wants D before 1.6.
But does Dep give up? No.
It looks for alternate versions of B and C
that do agree on an acceptable D.
<p>
In this case, Dep decided to keep the latest B,
which means using D 1.6,
which means <i>not</i> using C 1.9.
Since Dep can’t use the latest C, it tries older versions of C.
C 1.8 looks good: it says it needs D 1.4 or later,
and that allows D 1.6.
So Dep uses C 1.8, and it breaks.
<p>
<i>We</i> know that C 1.8 and D 1.6 are incompatible,
but Dep does not.
Dep can’t know it, because C 1.8 was released before D 1.6:
C’s author couldn’t have predicted that D 1.6 would be a problem.
And all package management systems agree that
package contents must be immutable once
they are published,
which means there’s no way for C’s author
to retroactively document that C 1.8 doesn’t work with D 1.6.
(And if there were some way to change C 1.8’s requirements
retroactively, that would violate repeatability.)
Releasing C 1.9 with the updated requirement was the fix.
<p>
Because Dep prefers to use the latest C,
most of the time it will use C 1.9 and know to avoid D 1.6.
But if Dep can’t use the latest of everything,
it will start trying earlier versions of some things,
including maybe C 1.8.
And using C 1.8 makes it look like D 1.6 is okay—even though we know better—and the build breaks.
<p>
Or it might not break.
Strictly speaking, Dep didn’t have to make that decision.
When Dep realized that it couldn’t use both the latest B and the latest C,
it had many options for how it might proceed.
We assumed Dep kept the latest B.
But if instead Dep kept the latest C,
then it would need to use an older D and then an older B,
producing a working build, as shown in the third column of the diagram.
<p>
So maybe Dep’s builds are broken or maybe not,
depending on the arbitrary decisions it makes in its
<a href="https://research.swtch.com/version-sat">SAT-solver-based version selection</a>.
(Last I checked, given a choice between a newer version of one package
versus another, Dep prioritizes the one with the alphabetically earlier import path,
at least in small test cases.)
<p>
This example demonstrates another way that Dep and systems like it
(nearly all package managers besides Go modules)
can produce surprising results:
when the one most preferred answer (use the latest of everything)
does not apply, there are often many choices with no
clear preferences between them.
The exact answer depends on the details of the SAT solving algorithm,
heuristics, and often the order in which the packages are presented to the solver.
This underspecification and non-determinism in their solvers
is another reason these systems need lock files.
<p>
In any event, for the sake of Dep users, let’s assume Dep lucked
into the choice that keeps builds working.
After all, we’re still trying to break the Go modules users’ builds.
<p>
To break Go modules builds, let’s release a new version of A, version 1.21,
which requires the latest B,
which in turn requires the latest D.
Now, when the <code>go</code> command builds the latest A,
it is forced to use the latest B and the latest D.
In Go modules, there is no C 1.9,
so the <code>go</code> command uses C 1.8,
and the combination of C 1.8 and D 1.6 does not work.
Finally, we have broken the Go modules builds!
<p>
<img name="vgo-why-23" class="center pad" width=339 height=270 src="vgo-why-23.png" srcset="vgo-why-23.png 1x, vgo-why-23@1.5x.png 1.5x, vgo-why-23@2x.png 2x, vgo-why-23@3x.png 3x, vgo-why-23@4x.png 4x">
<p>
But look!
The Dep builds are using C 1.8 and D 1.6 too,
so they’re also broken.
Before, Dep had to make a choice between the latest B and the latest C.
If it chose the latest B, the build broke.
If it chose the latest C, the build worked.
The new requirement in A is forcing Dep to choose the latest B
and the latest D, taking away Dep’s choice of latest C.
So Dep uses the older C 1.8,
and the build breaks just like before.
<p>
What should we conclude from all this?
First of all, documenting an incompatibility for Dep
does not guarantee that the incompatibility is avoided.
Second, a repeatable build like in Go modules
also does not guarantee that the incompatibility is avoided.
Both tools can end up building the incompatible pair of packages.
But as we saw,
it takes multiple intentional steps
to lead Go modules to a broken build,
steps that lead Dep to the same broken build.
And along the way the Dep-based build broke two other times
when the Go modules build did not.
<p>
I’ve been using Dep in these examples because
it is the immediate predecessor of Go modules,
but I don’t mean to single out Dep.
In this respect, it works the same way as nearly every
other package manager in every other language.
They all have this problem.
They’re not even really broken or misbehaving so much as unfortunately designed.
They are designed to try to work around a lack of cooperation
among the various package maintainers,
and <i>tools cannot work around a lack of cooperation</i>.
<p>
The only real solution for the C versus D incompatibility
is to release a new, fixed version of either C or D.
Trying to avoid the incompatibility is useful
only because it creates more time for
C’s author and D’s author to cooperate on a fix.
Compared to the Dep approach of preferring latest versions
and documenting incompatibilities,
the Go modules approach of repeatable builds
with minimal version selection
and no documented incompatibilities
creates time for cooperation automatically,
with no build emergencies,
no declared incompatibilities,
and no explicit work by users.
<p>
Then we can rely on cooperation for the real fix.
<a class=anchor href="#conclusion"><h2 id="conclusion">Conclusion</h2></a>
<p>
These are the three principles of versioning in Go,
the reasons that the design of Go modules
deviates from the design of Dep, Cargo, Bundler, and others.
<ul>
<li>
<i>Compatibility.</i> The meaning of a name in a program should not change over time.
<li>
<i>Repeatability.</i> The result of a build of a given version of a package should not change over time.
<li>
<i>Cooperation.</i> To maintain the Go package ecosystem, we must all work together.
Tools cannot work around a lack of cooperation.</ul>
<p>
These principles are motivated by concerns about software engineering,
which is what happens to programming
when you add time and other programmers.
Compatibility eliminates the effects of time
on the meaning of a program.
Repeatability eliminates the effects of time
on the result of a build.
Cooperation is an explicit recognition that,
no matter how advanced our tools are,
we do have to work with the other programmers.
We can’t work around them.
<p>
The three principles also reinforce each other, in a virtuous cycle.
<p>
Compatibility enables a new version selection algorithm,
which provides repeatability.
Repeatability makes sure that buggy, new releases
are ignored until explicitly requested,
which creates more time to cooperate on fixes.
That cooperation in turn reestablishes compatibility.
And the cycle goes around.
<p>
As of Go 1.13, Go modules are ready for production use,
and many companies, including Google, have adopted them.
The Go 1.14 and Go 1.15 releases will bring additional
ergonomic improvements, toward eventually deprecating
and removing support for GOPATH.
For more about adopting modules, see the blog post
series on the Go blog, starting with
“<a href="https://blog.golang.org/using-go-modules">Using Go Modules</a>.”
How Many Go Developers Are There?
tag:research.swtch.com,2012:research.swtch.com/gophercount
2017-07-13T13:00:00-04:00
2019-11-01T14:02:00-04:00
At least a million, maybe two.
<p>
How many Go developers are there in the world? <i>At least a million, maybe two!</i>
<p>
As of November 2019, my best estimate is between 1.15 and 1.96 million.
<p>
Previously:<br>
In July 2017, I estimated between half a million and a million.<br>
In July 2018, I estimated between 0.8 and 1.6 million.
<a class=anchor href="#approach"><h2 id="approach">Approach</h2></a>
<p>
My approach is to compute:
<div class=fig>
<center>
<i>Number of Go Developers </i> = <i>Number of Software Developers</i> × <i>Fraction using Go</i>
</center>
</div>
<p>
Then we need to answer how many software developers there are in the world and what percentage of them are using Go.
<a class=anchor href="#number_of_software_developers"><h2 id="number_of_software_developers">Number of Software Developers</h2></a>
<p>
How many software developers are there in the world?
<p>
In January 2014, <a href="https://www.infoq.com/news/2014/01/IDC-software-developers">InfoQ reported</a> that IDC published a report (no longer available online, it would seem) estimating that there were 11,005,000 “professional software developers” and 7,534,500 “hobbyist software developers,” giving a total estimate of 18,539,500.
<p>
In October 2016, Evans Data Corporation issued a <a href="https://evansdata.com/press/viewRelease.php?pressID=244">press release</a> advertising their “<a href="https://web.archive.org/web/20160730050104/http://www.evansdata.com/reports/viewRelease.php?reportID=9">Global Developer Population and Demographic Study 2016</a>” in which they estimated the total worldwide population of software developers to be 21 million.
<p>
Maybe the Evans estimate is too high. The details of their methodology are key to their business and therefore not revealed publicly, so we can’t easily tell how strict or loose their definition of developer is.
In January 2017, PK of the DQYDJ blog posted an analysis titled “<a href="https://dqydj.com/number-of-developers-in-america-and-per-state/">How Many Developers are There in America, and Where Do They Live?</a>” That post, which includes an admirably detailed methodology section, used data from the 2016 American Community Survey (ACS) and included these employment categories as “strict” software developers:
<ul>
<li>
Computer Scientists and Systems Analysts / Network Systems Analysts / Web Developers
<li>
Computer Programmers
<li>
Software Developers, Applications and Systems Software
<li>
Database Administrators</ul>
<p>
Using that list, PK arrived at a total of 3,357,626 software developers in the United States.
The post then added two less strict categories,
which expanded the total to 4,185,114 software developers.
A conservative estimate of the number of software developers worldwide would probably
include only PK’s “strict” category, about 80% of the United States total.
If we assume conservatively that the Evans estimate was similarly loose
and that the 80% ratio holds worldwide,
we can make the Evans estimate stricter by multiplying it by the same 80%, arriving at 16.8 million.
<p>
Maybe the Evans estimate is too low.
In May 2017, RedMonk blogger <a href="http://redmonk.com/jgovernor/2017/05/26/just-how-many-darned-developers-are-there-in-the-world-github-is-puzzled/">James Governor reported</a> that, in a recent speech, GitHub CEO Chris Wanstrath claimed GitHub has 21 million active users and GitHub’s Atom editor has 2 million active users (who haven’t turned off metrics and tracking) and concluded that the IDC and Evans estimates are therefore too low. Governor went on to give a “wild assed guess” of 35 million developers worldwide.
<p>
<i>Based on all this (and ignoring wild-assed guesses), my estimate in July 2017 was that the number of software developers worldwide was likely to be in the range 16.8–21 million.</i>
<p>
The <a href="https://evansdata.com/press/viewRelease.php?pressID=268">2018 Evans Data Global Developer Population and Demographic Study</a> estimated 23 million developers worldwide, up from 21 million in the 2016 survey.
<p>
<i>Based on the confidence range in 2017 being 16.8–21 million, my estimate in July 2018 applied the Evans Data 10% growth to arrive at 18.4–23 million developers in 2018.</i>
<p>
The <a href="https://evansdata.com/press/viewRelease.php?pressID=278">2019 Evans Data Global Development Survey</a> estimated 23.9 million developers worldwide in May 2019, up from 23 million in the 2018 survey.
<p>
The <a href="https://www.idc.com/getdoc.jsp?containerId=US44363318">IDC Worldwide Developer Census, 2018</a> estimated 22.30 million developers worldwide, up from <a href="https://www.idc.com/getdoc.jsp?containerId=US44389218">21.25 million in 2017</a>.
<p>
SlashData’s <a href="https://slashdata-website-cms.s3.amazonaws.com/sample_reports/EiWEyM5bfZe1Kug_.pdf">Global Developer Population 2019</a> estimates 14.7 million developers in Q2 2017, 15.7 million in Q4 2017, 16.9 million in Q2 2018, and 18.9 million in Q4 2018, extrapolating to “at least 21M developers by the end of 2019 and possibly upwards of 23M.” Oddly, SlashData’s <a href="https://www.slashdata.co/free-resources/state-of-the-developer-nation-17th-edition">Developer Economics State of the Developer Nation 17th Edition</a> estimated “18 million active software developers in the world” as of Q2 2019. It is unclear if this was rounded down from the 18.9, if SlashData believes the number of developers decreased in the first half of 2019, or if the numbers are completely unrelated.
<p>
<i>Based on these, my estimate in November 2019 is that the number of developers worldwide is likely to be in the range 18.9–23.9 million developers.</i>
<a class=anchor href="#fraction_using_go"><h2 id="fraction_using_go">Fraction using Go</h2></a>
<p>
What fraction of software developers use Go?
<p>
Stack Overflow has been running an annual developer survey for the past few years. In their <a href="https://insights.stackoverflow.com/survey/2017#technology">2017 survey</a>, 4.2% of all respondents and 4.6% of professional developers reported using Go. Unfortunately, we cannot sanity check Go against the year before, because the <a href="https://insights.stackoverflow.com/survey/2016#technology">2016 survey report</a> cut off the list of popular technologies after Objective-C (6.5%).
<p>
O’Reilly has been running annual software developer salary surveys for the past few years as well, and their survey asks about language use.
The <a href="https://www.oreilly.com/ideas/2016-software-development-salary-survey-report#id-kAiqtmuE">2016 worldwide survey</a> reports that Go is used by 3.9% of respondents, while the <a href="https://www.oreilly.com/ideas/2016-european-software-development-salary-survey#id-4xiji1CA">2016 European survey</a> reports Go is used by 3.3% of respondents. (I derived both these numbers by measuring the bars in the graphs.) The <a href="https://www.oreilly.com/ideas/2017-software-development-salary-survey#tools">2017 worldwide survey</a> reports in commentary that 4.5% of respondents say they use Go.
<p>
Maybe the 4.2–4.6% estimate is too high.
Both of these are online surveys, with a danger of self-selection bias among respondents:
it may be that developers who don’t answer online surveys use Go much less than those who do.
For example, perhaps the surveys are skewed by geography or experience
in a way that affects the result.
Suppose, I think quite pessimistically, that the survey respondents
are only representative of half of the software developer population,
and that in the other half, Go is only half as popular.
Then if a survey found 4.2% of developers use Go, the real percentage would be three quarters as much, or 3.15%.
<p>
<i>Based on all this, my estimate in July 2017 was that the fraction of software developers using Go worldwide is at least 3% and possibly as high as 4.6%.</i>
<p>
The <a href="https://insights.stackoverflow.com/survey/2018/#most-popular-technologies">2018 Stack Overflow Developer Survey</a> reports that 7.1% of all developers and 7.2% of professional developers use Go. Applying the same pessimism as last year (multiplying by three quarters) suggests a lower bound of 5.3%, but I’ll be even more conservative and use last year’s 4.6% as a lower bound. (The O’Reilly surveys seem to have stopped being run.)
<p>
<i>Based on this, my estimate in July 2018 was that 4.6–7.1% of developers use Go.</i>
<p>
The <a href="https://insights.stackoverflow.com/survey/2019/#most-popular-technologies">2019 Stack Overflow Developer Survey</a> reports that 8.2% of all developers and 8.8% of professional developers use Go (compared with 7.1% and 7.2% in 2018).
<p>
The <a href="https://research.hackerrank.com/developer-skills/2019#skills2">HackerRank 2019 Developer Skills Report</a> reports that 8.8% of respondents know Go at the end of 2018 (compared with 6.08% at the end of 2017).
<p>
SlashData’s <a href="https://www.slashdata.co/free-resources/state-of-the-developer-nation-17th-edition">Developer Economics State of the Developer Nation 17th Edition</a> estimated 1.1 million active Go software developers, out of 18 million worldwide, or about 6.1%. (That is almost exactly my usual “three quarters of Stack Overflow” conservative lower bound, 0.75 × 8.2% = 6.15%.)
<p>
The <a href="https://www.jetbrains.com/lp/devecosystem-2019/">JetBrains Developer Ecosystem Survey 2019</a> reports that 18% of developers used Go in 2018. That number seems too high, so I will discount it for now.
<p>
Evans Data Corporation’s Global Development Survey reports even higher (and harder to believe) percentages for Go usage in the non-public part of the survey.
<p>
<i>Based on all this, my estimate in November 2019 is that 6.1–8.2% of developers use Go.</i>
<a class=anchor href="#number_of_go_developers"><h2 id="number_of_go_developers">Number of Go Developers</h2></a>
<p>
How many Go developers are there? Multiply the low end of the developer count by the low Go percentage, and the high end of the count by the high percentage.
<p>
In July 2017, 3–4.6% of 16.8–21 million yielded an estimate of 0.50–0.97 million Go developers.
<p>
In July 2018, 4.6–7.1% of 18.4–23 million yielded an estimate of 0.85–1.63 million Go developers.
<p>
In November 2019, 6.1–8.2% of 18.9–23.9 million yields an estimate of 1.15–1.96 million Go developers.
Go Proposal Process: Representation
tag:research.swtch.com,2012:research.swtch.com/proposals-representation
2019-10-03T11:45:00-04:00
2019-10-03T11:47:00-04:00
How do we increase user representation in the proposal process, and what does that mean? (Go Proposals, Part 6)
<p>
[<i>I’ve been thinking a lot recently about the
<a href="https://golang.org/s/proposal">Go proposal process</a>,
which is the way we propose, discuss, and decide
changes to Go itself.
Like <a href="https://blog.golang.org/experiment">nearly everything about Go</a>,
the proposal process is an experiment,
so it makes sense to reflect on what we’ve
learned and try to improve it.
This post is the sixth in <a href="https://research.swtch.com/proposals">a series of posts</a>
about what works well and,
more importantly,
what we might want to change.</i>]
<a class=anchor href="#who"><h2 id="who">Who is Represented?</h2></a>
<p>
At the contributor summit, we considered the question of who
is well represented or over-represented in the Go development
process and who is under-represented.
<p>
The question of who is well represented matters because
diverse representation produces diverse viewpoints
that can help us reach better overall decisions in the process.
We cannot possibly hear from every single person with relevant input;
instead we can try to hear from enough people that we still
gather all the important viewpoints and observations.
<p>
On the well represented or possibly over-represented list,
we have members of the Go team;
GitHub users who have time to keep up with discussions;
English-speaking users;
people who keep up with tech social media on sites like Hacker News and Twitter;
and people who attend and give talks at Go conferences.
On the under-represented list,
we have non-GitHub users or users who can’t keep up with GitHub discussions;
non-English-speaking users;
“heads down” users, meaning anyone who spends their time writing code
to the exclusion of engaging on social media or attending Go conferences;
business users in general;
users with non-computer science backgrounds;
users with non-programming language backgrounds;
and non-users, people not using Go at all.
As we consider ways to make the proposal process more accessible
to more users and potential users, it is worth keeping these lists in mind
to check whether new groups are being reached.
<p>
The <a href="https://golang.org/s/proposal-minutes">proposal minutes</a>
help reach users who are on GitHub but can’t keep up with all the discussions,
by providing a single issue to star and get roughly weekly updates about
which proposals are under active discussion and which are close to
being accepted or declined.
The minutes are an improvement for the “can’t keep up with GitHub” category,
but not for the other categories.
<p>
Reaching non-English-speaking users is one of the most difficult challenges.
We on the Go team have attended Go conferences around the world to meet users
in many countries; at many conferences there is simultaneous translation for
the talks, which is wonderful. But there isn’t simultaneous translation for our
proposal discussions. I don’t know whether the most significant proposals,
like the latest version of the generics proposal, have been translated by users
into their native languages, or if non-English-speakers muddle through with automatic
translation, or something else. Twice in the past we’ve had questions about
proposals that primarily affected Chinese users—specifically, whether to change
case-based export for uncased languages and whether to build a separate
Go distribution with different Go proxy defaults for China.
In both these cases, we asked Asta Xie, the organizer of GopherCon China,
to run a quick poll of users in a Chinese social media group of Go users.
That was very helpful, but that doesn’t scale to all proposals.
<p>
Reaching “heads down” programmers and business users,
those not attending Go conferences or engaging in Go-related social media,
probably means branching out to more non-Go-specific places
to publicize the most important Go changes, and possibly Go itself.
<p>
The final group is users with non-computer science or non-programming language backgrounds.
Go proposals are usually written assuming significant familiarity with Go
and often also familiarity with computer science or programming language concepts.
In general, that’s more efficient than the alternative.
But especially for large changes it would help to have alternate forms that are
accessible to a larger audience.
Ideas include longer tutorials or walkthroughs,
short video demos,
and recorded video-based Q&A sessions.
<a class=anchor href="#what"><h2 id="what">What Does Representation Mean?</h2></a>
<p>
Part of Go’s appeal for me as a user, and I think for many Go users,
is the fact that it feels like a coherent system in which the pieces
fit together and complement each other well.
In my 2015 talk and blog post, “<a href="https://blog.golang.org/open-source">Go, Open Source Community</a>,”
I said that “one of the most important things
we the original authors of Go can offer is consistency of vision, to help keep Go Go.”
I still believe that consistency of vision, to keep Go itself consistent, simple, and understandable,
remains critically important to Go’s success.
<p>
In an <a href="https://www.itworld.com/article/2826125/the-future-according-to-dennis-ritchie--a-2000-interview-.html?page=2">interview with IT World in 2000</a>, Dennis Ritchie (the creator of C) was asked about control over C’s development. He said:<blockquote>
<p>
On the other hand, the “open evolution” idea has its own drawbacks,
whether in official standards bodies or more informally,
say over the Web or mailing lists.
When I read commentary about suggestions for where C should go,
I often think back and give thanks that it wasn’t developed
under the advice of a worldwide crowd.
C is peculiar in a lot of ways, but it, like many other successful things,
has a certain unity of approach that stems from development in a small group.</blockquote>
<p>
I see much of that same unity of approach in Go’s design by a small group.
The flip side is that if you have a small number of people making decisions alone,
they can’t know enough about the million or more Go developers and their uses
to anticipate all the important use cases and needs.
That is, while a small number of ultimate designers provides
consistency of vision, it can just as well result in a failure of vision,
in which a design fundamentally fails to plan for or adapt to
a requirement that becomes important later but was not seen or well understood at the time.
<p>
A decision can be <i>both</i> elegant for its time and short-sighted for the future.
For example, see the recent critique of Unix’s <i>fork</i> system call, “<a href="https://www.microsoft.com/en-us/research/uploads/prod/2019/04/fork-hotos19.pdf">A <code>fork()</code> in the road</a>,” by Andrew Baumann <i>et</i> <i>al</i>., from HotOS ’19.
In fact, nearly all designs end up being short-sighted given a long enough time scale.
We’d be fortunate for all our designs to last the 50 years that <i>fork</i> has.
The most important part is to guard against short-term failures of vision, not very long-term ones.
<p>
To me, the most important place for broad representation and inclusion
is in the proposals and discussions, because, as I said above,
diverse representation produces diverse viewpoints
that can help us reach better overall decisions in the process.
It can help us avoid at least the short-term and hopefully medium-term failures of vision.
At the same time, I hope we can maintain a long-term consistency of vision
in the design of Go, by the continued active involvement of the original designers.
It seems to me that having both the original designers and a diverse set of other voices
in our proposal discussions and <a href="proposals-clarity#decisions">consistently working toward consensus decisions</a>
will lead to the best outcomes and balances the desire for a consistency of vision
against the need to avoid failures of vision.
<a class=anchor href="#next"><h2 id="next">Next</h2></a>
<p>
Again, this is the sixth post <a href="proposals">in a series of posts</a>
thinking and brainstorming about the Go proposal process.
Everything about these posts is very rough.
The point of posting this series—thinking out loud
instead of thinking quietly—is so that
anyone who is interested can join the thinking.
<p>
I encourage feedback, whether in the form of comments on these posts,
comments on the newly filed issues,
mail to rsc@golang.org,
or your own blog posts (please leave links in the comments).
Thanks for taking the time to read these and think with me.
<p>
The next post will be about how best to coordinate efforts
across the Go community and ecosystem.
Go Proposal Process: Enabling Experiments
tag:research.swtch.com,2012:research.swtch.com/proposals-experiment
2019-09-23T14:00:00-04:00
2019-09-23T14:02:00-04:00
How can we balance stability and experimentation? (Go Proposals, Part 5)
<p>
[<i>I’ve been thinking a lot recently about the
<a href="https://golang.org/s/proposal">Go proposal process</a>,
which is the way we propose, discuss, and decide
changes to Go itself.
Like <a href="https://blog.golang.org/experiment">nearly everything about Go</a>,
the proposal process is an experiment,
so it makes sense to reflect on what we’ve
learned and try to improve it.
This post is the fifth in <a href="https://research.swtch.com/proposals">a series of posts</a>
about what works well and,
more importantly,
what we might want to change.</i>]
<p>
Communicating a proposed change precisely and clearly is difficult,
on both sides of the communication.
Technical details
are easy to write and to read incorrectly without realizing it,
and implications are easy to misunderstand.
After all, this is why our programs are filled with bugs.
The best way I know to address this problem for a large change
is to implement it and try it.
(See my GopherCon talk, “<a href="https://blog.golang.org/experiment">Experiment, Simplify, Ship</a>.”)
<p>
Being able to try out a possible new feature,
whether it is in the design draft or the final proposal stage,
is extremely helpful for understanding the feature.
Understanding the feature is in turn critical for being able
to give meaningful, concrete feedback.
Anything we can do to help everyone (including the authors)
understand proposals better sooner is a way to improve the overall process.
<a class=anchor href="#proto"><h2 id="proto">Prototypes</h2></a>
<p>
Multiple contributors at the summit brought up the Go modules proposal
as an example of how much it helped to have a working prototype:
being able to learn about modules by trying <code>vgo</code>
was very helpful for them,
instead of having to imagine the experience by reading documents alone.
There is a balance to be struck here.
It certainly helped to have <code>vgo</code> available for the initial public discussion,
but other problems were caused by waiting until then to discuss
the ideas publicly.
We published the design drafts last summer without working
prototypes specifically to avoid that mistake, of discussing ideas too late.
But, when we get farther along,
especially with generics,
it will also be important to make working prototypes
available for experimentation well before we reach the final proposal decision.
<a class=anchor href="#short"><h2 id="short">Short Experiments</h2></a>
<p>
We had already recognized the need for experimenting
before making a final decision,
which motivated the
<a href="https://blog.golang.org/go2-here-we-come">procedure we introduced</a>
for language changes starting in Go 1.13.
In short, that procedure is:
have an initial discussion about whether to move forward;
if so, check in the implementation at the start of the three-month development cycle;
have a final discussion at the end of the development cycle;
if the feature is not ready yet, remove it for the freeze
and the release; repeat if needed.
This three-month window worked reasonably well for
small features like signed shift counts.
For larger features, it is clear that three months is too short
and a different approach providing a longer window is needed.
We spent a while at the summit talking about possible
ways to make features available on a longer-term experimental basis,
and the various concerns that must be balanced.
<a class=anchor href="#long"><h2 id="long">Longer, Opt-In Experiments</h2></a>
<p>
For <code>vgo</code> the way to opt in to experimenting
was to download and run a separate command,
not the <code>go</code> command.
And <code>vgo</code> could “compile out” use of modules by preparing a <code>vendor</code> directory.
For <code>try</code>, Robert Griesemer wrote a simple converter, <code>tryhard</code>,
that looked for opportunities to add <code>try</code> expressions;
we intended to have a <code>tryhard</code> <code>-u</code> that removed them as well,
so that people who wanted to experiment with <code>try</code> could write code
using it and
“compile” that down to pure Go when publishing it.
A separate command is heavy-weight but has the significant
benefit of being independent of the underlying toolchain,
the same as non-experimental tools like <code>goyacc</code>
and <code>stringer</code>.
<p>
There is also a mechanism for experiments within the main toolchain.
The environment variable
<code>GOEXPERIMENT</code> can be set during the toolchain build
(that is, during <code>all.bash</code> or <code>make.bash</code>)
to enable unfinished or experimental features
in that toolchain.
This mechanism restricts the use of these features
to developers who build the Go toolchain itself from source,
which is not most users.
Indeed, <code>GOEXPERIMENT</code> is intended mainly for
use by the developers of those in-progress features,
typically invisible implementation details,
not semantic language changes.
(For example, use of register <code>RBP</code> to hold a frame pointer
on x86-64 was added as an experiment flag until we were
sure it was robust enough to enable by default.)
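<p>
As a sketch of the mechanism, enabling such an experiment means setting the variable while building the toolchain from a source checkout; the <code>framepointer</code> value shown here is the frame-pointer experiment mentioned above, and the set of available values varies by release:

```shell
# Build the Go toolchain from source with an experiment enabled.
# GOEXPERIMENT takes effect only at toolchain build time, not at
# ordinary "go build" time, so only source builders can use it.
cd go/src
GOEXPERIMENT=framepointer ./make.bash
```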
<p>
As a lighter-weight mechanism,
people at the contributor summit raised the idea of
opting in to an experimental feature
with a line in <code>go.mod</code> or with a Python-like special import
(<code>import</code> <code>_</code> <code>"experimental/try"</code>).
<a class=anchor href="#restrict"><h2 id="restrict">Restricting Experiments</h2></a>
<p>
The biggest question about experimental
features is how to restrict them—that is, how to contain their impact—to
ensure the overall ecosystem does not
depend on them before they are stable.
On the one hand, you want to make it possible
for people to try the feature in real-world use cases,
meaning production uses.
So you want to be using an otherwise release-quality toolchain
where the only change is that the feature is added.
(Separate tools and the <code>GOEXPERIMENT</code> settings both make this possible.)
On the other hand, any production usage creates
a reliance on the feature that will translate
into breakage if the feature is changed or removed.
The ability to gather production experience with the feature
and the stability of the usual Go compatibility promise
are in direct conflict.
Even if users understand that there is no
compatibility for an experimental feature,
it still hurts them when their code breaks.
<p>
A critical aspect of containing experimental
features is thinking about how they interact with
dependencies.
Is it okay for a dependency to opt in to
using an experimental feature?
If the feature is removed, that might break
packages that didn't even realize they depended on it.
And what about tools?
Do editing environments or tools like <code>gopls</code> have to
understand the experimental feature as well?
It gets complicated fast.
<p>
We especially want to avoid breaking people who don't
even know they were using the feature.
Since the feature is experimental and <i>will</i> break,
that means trying to prevent a situation where
people are using it without knowing,
or where people are coerced into using it
by an important dependency.
Avoiding this problem is the main reason
that we have used heavier weight mechanisms
like separate tools or the <code>GOEXPERIMENT</code> flag
to limit the scope of experiments in Go.
<p>
At the contributor summit,
a few contributors with Rust experience said that a while back
many Rust crates simply required the use of
Rust's unstable toolchain, which has in-progress features enabled.
They also said that the situation has improved,
in part because of attention to the problem
and in part because some of the most important in-progress features
were completed and moved into the stable toolchain.
<p>
One problem we had in Go along similar lines was in
the introduction of experimental vendoring support in Go 1.5.
For that release, vendoring had to be enabled
using the <code>GO15VENDOREXPERIMENT=on</code>
environment variable. The Go 1.6 release changed
the default to be opt-out, and Go 1.7 removed the setting entirely.
But that meant projects with a significant number
of developers had to tell each developer to opt in
in order for the project to use it,
which made it harder to try and adopt than we realized.
Understanding this problem is one of the reasons that
modules have defaulted to an “automatic”
mode triggered by the presence of a <code>go.mod</code> file
for the past couple releases.
Although the <code>GO111MODULE</code> variable allows finer-grained control,
it can be ignored by most users.
Want to try modules? Create a <code>go.mod</code> (which you needed to do anyway).
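<p>
Concretely, the opt-in described above is a single command in the project root (the module path here is a placeholder):

```shell
# Creating a go.mod opts this directory tree into module mode
# (Go 1.11 and later); no environment variable is required.
go mod init example.com/myproject
```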
<a class=anchor href="#todo"><h2 id="todo">What To Do</h2></a>
<p>
I don't see a <a href="http://worrydream.com/refs/Brooks-NoSilverBullet.pdf">silver bullet</a> here.
We will probably have to decide for each change what the
appropriate experiment mechanism is.
<p>
It is critically important to allow users to experiment with
proposed features, to better understand them and to
find problems early.
Significant changes should continue to be backed by prototypes.
<p>
On the other hand, it is equally (if not more) important
that experimental features not become unchangeable
de facto features due to dependency network effects.
A lightweight mechanism
(like <code>import</code> <code>_</code> <code>"experimental/try"</code>)
may be appropriate when a feature is near final
and we are willing to support the current semantics in all future toolchains.
Before then, such a mechanism is inappropriate:
all it takes is one important dependency to make the
feature impossible to change.
<p>
The most likely answer to the right way to experiment is “it depends.”
A trivial, well understood change like binary number literals
is probably fine to do using the <a href="#short">“short” release cycle</a> mechanism.
Larger changes like modules or <code>try</code> (or generics!)
probably need external tools or very careful use of
<code>GOEXPERIMENT</code> to avoid unwanted network effects.
<p>
Even so, we should be careful to remember to provide for
some experimentation mechanism for any non-trivial change,
well before that change leaves the <a href="proposals-discuss">design draft stage</a>
and becomes a formal proposal.
<a class=anchor href="#next"><h2 id="next">Next</h2></a>
<p>
Again, this is the fifth post <a href="proposals">in a series of posts</a>
thinking and brainstorming about the Go proposal process.
Everything about these posts is very rough.
The point of posting this series—thinking out loud
instead of thinking quietly—is so that
anyone who is interested can join the thinking.
<p>
I encourage feedback, whether in the form of comments on these posts,
comments on the newly filed issues,
mail to rsc@golang.org,
or your own blog posts (please leave links in the comments).
Thanks for taking the time to read these and think with me.
<p>
It's been a month since the last post, because
I was away for three weeks with my kids before school started
and then spent last week catching up.
I have at least two more posts planned.
<p>
The <a href="proposals-representation">next post</a> is about overall representation in the
proposal process and the Go language effort more broadly:
who is here, who is missing, and what can we do about it?
Go Proposal Process: Scaling Discussions
tag:research.swtch.com,2012:research.swtch.com/proposals-discuss
2019-08-22T10:00:00-04:00
2019-08-22T10:02:00-04:00
How can we scale large discussions? (Go Proposals, Part 4)
<p>
[<i>I’ve been thinking a lot recently about the
<a href="https://golang.org/s/proposal">Go proposal process</a>,
which is the way we propose, discuss, and decide
changes to Go itself.
Like <a href="https://blog.golang.org/experiment">nearly everything about Go</a>,
the proposal process is an experiment,
so it makes sense to reflect on what we’ve
learned and try to improve it.
This post is the fourth in <a href="https://research.swtch.com/proposals">a series of posts</a>
about what works well and,
more importantly,
what we might want to change.</i>]
<p>
The <a href="https://golang.org/s/proposal">current proposal process</a>
refers to “discussion on the issue tracker” and, in the case of a design doc, a “final discussion,”
but we’ve never written more about how to make those discussions effective.
A discussion of a small change is typically a dozen messages or fewer; those are easy.
Discussion of <a href="proposals-large">large changes</a> is more difficult
and does not always work well.
<a class=anchor href="#one"><h2 id="one">Scaling One Discussion</h2></a>
<p>
As a forum for active discussion of large changes, the GitHub issue tracker has serious flaws.
As I write this, the “try” issue takes seven seconds to load,
and when it does it reports that there are 798 comments
but only displays the first 29, then a marker for “773 hidden items,”
and then the last 25.
Hiding most of the discussion makes it impossible to search the page,
which leads to people making points that have already been said.
(For comparison,
my <a href="https://swtch.com/try.html">outdated, manually curated summary page</a> loads and displays 611 comments in 300ms.
The problem is not the size of the raw data.)
Using a discussion forum that displayed the entire discussion
would be a step in the right direction.
After that, it would likely help to display a threaded tree of messages,
both to make clear what a reply is replying to
and also to allow skipping over subthreads that
are uninteresting for one reason or another.
After that, it would likely help to add comment ranking
that affects display order,
to help surface the most important comments.
One idea raised was to use <a href="https://golang.reddit.com">/r/golang</a>
or a new subreddit for the large proposal discussions,
and that seems worth considering further.
<p>
At the contributor summit, we asked that each discussion
have someone serve in the role of “facilitator.”
The facilitator tries to point out possible misunderstandings
as they happen, makes sure that a few people don’t dominate
the conversation, and tries to bring quiet people into the
conversation.
One point raised was whether it would make sense for
the online proposal discussion to have a clear facilitator
whose job is to keep a summary or decision document up-to-date
as well as to keep the discussion on topic and non-repeating
as much as possible.
(In the first few days of a particularly active discussion,
this might approach a full-time job.)
<p>
While better software and better process would help manage a large discussion,
there is a point of diminishing returns,
and we shouldn’t focus on this one discussion
to the exclusion of what precedes it.
It is possible that instead of trying to scale one discussion
we should scale by having many discussions.
<a class=anchor href="#many"><h2 id="many">Scaling with Many Discussions</h2></a>
<p>
As I mentioned
in the <a href="proposals-large#process">large changes</a> post,
one problem with “try” was simply that
it should have been a second design draft, not a proposal.
Making it a proposal with a timeline
made everyone feel like they had to rush to comment
before a decision was made.
In the more relaxed setting of evaluating a design draft,
the more distributed
“<a href="https://golang.org/s/go2design">post a thoughtful experience report somewhere and link it to the wiki</a>”
scales much better and seems to be working well.
<p>
Requiring large changes to start with a series of design drafts
creates space for a variety of different conversations about
the designs, in different forums and media.
Any important points discovered in those conversations can be
reported back to the proposal issue and influence future drafts,
without having to put every comment on the proposal issue.
These various discussions would also help impacted users
get up to speed on the details of the proposal,
again without having every comment on the issue itself.
<p>
Overall, having many discussions would in turn reduce the
criticality of the issue tracker discussion itself and therefore the
demands the discussion places on the discussion forum,
whether that’s GitHub, Reddit, or something else.
(As one data point, the <a href="https://golang.org/issue/24301">Go modules issue</a>
received only 242 comments, compared to the current 798 for “try”,
quite possibly because there had already been so much
community discussion in other forums before the issue was filed.)
<a class=anchor href="#offline"><h2 id="offline">Scaling with Offline Discussions</h2></a>
<p>
Another fascinating idea raised at the contributor summit
is to make use of the Go Meetup network in the discussions
of the most important, largest changes, such as generics.
The idea would be to prepare materials to help meetup organizers
(or others) lead and facilitate discussions at each local meetup.
Then, crucially, we could gather summaries of feedback from
each meetup and possibly even iterate this process.
What I like most about this idea is that it engages a portion
of the user community (at least potentially) different from
“people who have the time and energy to keep up with GitHub,”
by taking the discussion to them.
(I plan to write a future post about representation more broadly.)
<a class=anchor href="#doc"><h2 id="doc">Scaling with Decision Documents</h2></a>
<p>
Another way to reduce the load placed on the GitHub issue
discussion would be to shift the focus to writing a
“decision document” laying out the various points raised
and presenting as fairly as possible
both sides of the decision to be made.
Then the discussion would serve primarily
to suggest additions or changes to the decision document.
This would have the important effect that someone who
looked away for a week or two could catch up not by
reading every message that arrived in the interim
but instead by looking at what had changed in the
much shorter document.
We already do this informally for very large issues
by trying to post summary comments occasionally;
formalizing this in a separate document might help
encourage people to start at the document instead of the discussion.
The decision document could also be a new section
in the proposal design document,
but perhaps a separate document would be
easier to point at.
I filed <a href="https://golang.org/issue/33791">issue 33791</a> to track this idea.
<a class=anchor href="#summary"><h2 id="summary">Summary</h2></a>
<p>
Overall, it seems clear we can do better at scaling and managing a single large discussion,
but I think it is equally clear that process adjustments
such as having many discussions or producing a decision document
could make the final discussion easier and shorter,
which would be a complementary, and possibly larger, win.
If I had to choose one aspect of discussions to focus on,
I think I would focus on addressing the underlying social problem
by taking steps to turn down the importance and heat of
the final discussion,
such as creating space for multiple discussions
and introducing a decision document as a way to focus the energy of the discussion,
instead of falling into the engineer’s fallacy of “let’s build a better discussion forum.”
<a class=anchor href="#next"><h2 id="next">Next</h2></a>
<p>
Again, this is the fourth post <a href="proposals">in a series of posts</a>
thinking and brainstorming about the Go proposal process.
Everything about these posts is very rough.
The point of posting this series—thinking out loud
instead of thinking quietly—is so that
anyone who is interested can join the thinking.
<p>
I encourage feedback, whether in the form of comments on these posts,
comments on the newly filed issues,
mail to rsc@golang.org,
or your own blog posts (please leave links in the comments).
Thanks for taking the time to read these and think with me.
<p>
The next post will be about how the process for proposing large changes
could be made smoother with some kind of official mechanism
for experiments and prototypes.
Go Proposal Process: Large Changes
tag:research.swtch.com,2012:research.swtch.com/proposals-large
2019-08-15T12:00:00-04:00
2019-08-15T12:02:00-04:00
How can we identify and handle large changes better? (Go Proposals, Part 3)
<p>
[<i>I’ve been thinking a lot recently about the
<a href="https://golang.org/s/proposal">Go proposal process</a>,
which is the way we propose, discuss, and decide
changes to Go itself.
Like <a href="https://blog.golang.org/experiment">nearly everything about Go</a>,
the proposal process is an experiment,
so it makes sense to reflect on what we’ve
learned and try to improve it.
This post is the third in <a href="https://research.swtch.com/proposals">a series of posts</a>
about what works well and,
more importantly,
what we might want to change.</i>]
<p>
The proposal process we have today recognizes two kinds of proposal:
trivial (no design doc) and non-trivial (design doc).
An initial proposal issue can be just a few lines describing the idea,
which is then discussed on the GitHub issue.
Most proposals exit the process after this discussion.
Typical reasons for declining a proposal at this stage include some combination of:
<ul>
<li>
The proposal is a duplicate of a previous proposal.
<li>
The proposal is not specific enough to evaluate.
For example, <a href="https://golang.org/issue/20142">issue 20142</a> suggests
to do something—but not something specific—about accidental <code>nil</code> dereferences.
<li>
The proposal is not <a href="https://golang.org/doc/go1compat">backwards compatible</a>.
For example, <a href="https://golang.org/issue/33454#issue-476523778">issue 33454</a>
proposed first to change the type signature of <a href="https://golang.org/pkg/log/#Logger.SetOutput"><code>log.(*Logger).SetOutput</code></a>
and then was revised to change only the semantics instead.
Both would be backwards-incompatible changes.
<li>
The proposal can already be implemented as a package outside the standard library.
For example, the problem solved by <a href="https://golang.org/issue/33454">issue 33454</a>—logging to multiple writers—is easily
solved with <a href="https://golang.org/pkg/io/#MultiWriter"><code>io.MultiWriter</code></a>.
<li>
The proposal does not address a common enough problem to merit a change or addition
to the language or standard library.
For example, <a href="https://golang.org/issue/26262">issue 26262</a> suggested
a new <code>%*T</code> print verb that would have little applicability and
is easily implemented as a one-line function if needed.
<li>
The proposal violates a core design principle or goal of the package.
For example, <a href="https://golang.org/issue/33449">issue 33449</a> suggested adding indirect template calls
to <a href="https://golang.org/pkg/text/template"><code>text/template</code></a>,
but that would invalidate the safety analysis in <a href="https://golang.org/pkg/html/template"><code>html/template</code></a>.</ul>
<p>
The typical reason for accepting a proposal at this stage is that
the details are simple and straightforward enough that they can
be stated crisply and clearly without a design doc, and there is
general agreement about moving forward.
The specifics can range from significant to trivial,
from
<a href="https://golang.org/issue/19069">issue 19069</a> (extend release support timeline)
to
<a href="https://golang.org/issue/18086">issue 18086</a> (add <a href="https://golang.org/pkg/encoding/json/#Valid"><code>json.Valid</code></a>)
to
<a href="https://golang.org/issue/20023">issue 20023</a> (document that <a href="https://golang.org/pkg/os/#NewFile"><code>os.NewFile</code></a> returns an error when passed a negative file descriptor).
<p>
Other proposals are intricate enough or have subtle enough implications
to merit <a href="https://twitter.com/andybons/status/1159573179775508480">writing a design document</a>,
which we <a href="https://golang.org/design/">save in the proposal design repo</a>.
For example, here are the design documents for
<a href="https://golang.org/design/12166-subtests">testing subtests</a>,
<a href="https://golang.org/design/16339-alias-decls">alias declarations</a> (abandoned),
<a href="https://golang.org/design/18130-type-alias">type aliases</a>,
and
<a href="https://golang.org/design/19348-midstack-inlining">mid-stack inlining</a>.
<p>
The brief issue tracker discussion contemplated in the proposal process
works well for small changes, even those with subtle implications
requiring a design document.
But it breaks down for what we might think of as large changes.
Larger changes don’t necessarily have longer design docs,
but they tend to have larger issue tracker discussions,
because more people are affected by the result.
Recent examples include the discussion of the
<a href="https://golang.org/issue/29934">Go 1.13 error value changes</a> (404 comments as of August 1)
and of course <a href="https://golang.org/issue/32437">the try builtin</a> (798 comments).
<a class=anchor href="#checklist"><h2 id="checklist">Large Checklist</h2></a>
<p>
One idea raised was to leave the process for small changes unaffected
but add more process for large changes.
That in turn requires identifying large changes.
One suggestion was to create a rubric based on
a checklist of how much of the Go ecosystem a change affects.
For example:
<ul>
<li>
Is the change user-visible at all?
<li>
Does it require any changes to any documentation?
<li>
Are its effects confined to a single package?
<li>
Does it require changes to the language spec?
<li>
Does it require users to change existing scripts or workflows?
<li>
Does it require updating introductory materials?
<li>
And so on.</ul>
<p>
If the score is high, the proposal could require more process or at least more care.
And if the proposal is accepted,
the checklist answers would help plan the rollout.
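<p>
To make the idea concrete, a rubric like this could reduce to a simple score: the count of “yes” answers. Here is a minimal Go sketch; the struct fields and the notion of a single numeric score are hypothetical, not part of any actual proposal tooling:

```go
package main

import "fmt"

// Checklist records yes/no answers to the sizing questions above.
// The field names are hypothetical; the post does not define a schema.
type Checklist struct {
	UserVisible     bool // is the change user-visible at all?
	DocChanges      bool // does it require changes to any documentation?
	CrossesPackages bool // do its effects reach beyond a single package?
	SpecChanges     bool // does it require changes to the language spec?
	BreaksWorkflows bool // must users change existing scripts or workflows?
	IntroMaterials  bool // does it require updating introductory materials?
}

// Score counts the yes answers; a higher score suggests a larger change
// that merits more process and more care.
func (c Checklist) Score() int {
	n := 0
	for _, yes := range []bool{
		c.UserVisible, c.DocChanges, c.CrossesPackages,
		c.SpecChanges, c.BreaksWorkflows, c.IntroMaterials,
	} {
		if yes {
			n++
		}
	}
	return n
}

func main() {
	// A language change like “try” would answer yes to everything.
	try := Checklist{
		UserVisible: true, DocChanges: true, CrossesPackages: true,
		SpecChanges: true, BreaksWorkflows: true, IntroMaterials: true,
	}
	fmt.Println(try.Score()) // scores high, so more process
}
```

A real rubric would likely weight the questions differently (a spec change matters more than a doc tweak), but even an unweighted count separates the trivial from the sweeping.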
<a class=anchor href="#process"><h2 id="process">Large Process</h2></a>
<p>
One piece of added process for large changes
could be a pre-proposal stage,
when the design is simply a draft being explored,
not a change ready to be proposed and decided on.
We consciously avoided the word “proposal” when we published the
<a href="https://golang.org/s/go2design">Go 2 draft designs last summer</a>,
to make clear that those designs were still in progress
and not being proposed as-is.
That clear framing served everyone well,
and, in retrospect, “try” was a large enough
change that the new design we published last month
should have been a second draft design,
not a proposal with an implementation timeline.
(Contrast the urgency many felt during the “try” discussion
with the relative calm around last week’s
second draft design for generics.)
<p>
Another piece of added process for large changes
could be a pre-proposal experimentation period of some kind
(more on that in a future post).
<p>
And when a large change does migrate out of draft design
to be formally proposed for acceptance,
another piece of added process could be a different
kind of discussion,
or more discussions
(more on that in a future post too).
<p>
I filed <a href="https://golang.org/issue/33670">issue 33670</a> to track the idea of identifying large changes
and adding more process for them.
<a class=anchor href="#next"><h2 id="next">Next</h2></a>
<p>
Again, this is the third post <a href="proposals">in a series of posts</a>
thinking and brainstorming about the Go proposal process.
Everything about these posts is very rough.
The point of posting this series—thinking out loud
instead of thinking quietly—is so that
anyone who is interested can join the thinking.
<p>
I encourage feedback, whether in the form of comments on these posts,
comments on the newly filed issues,
mail to rsc@golang.org,
or your own blog posts (please leave links in the comments).
Thanks for taking the time to read these and think with me.
<p>
The <a href="proposals-discuss">next post</a> is about how proposal discussions might be scaled
to handle large changes with many impacted users giving feedback.
Go Proposal Process: Clarity & Transparency
tag:research.swtch.com,2012:research.swtch.com/proposals-clarity
2019-08-07T14:00:00-04:00
2019-08-07T14:02:00-04:00
What would improve everyone’s understanding of the process? (Go Proposals, Part 2)
<p>
[<i>I’ve been thinking a lot recently about the
<a href="https://golang.org/s/proposal">Go proposal process</a>,
which is the way we propose, discuss, and decide
changes to Go itself.
Like <a href="https://blog.golang.org/experiment">nearly everything about Go</a>,
the proposal process is an experiment,
so it makes sense to reflect on what we’ve
learned and try to improve it.
This post is the second in <a href="https://research.swtch.com/proposals">a series of posts</a>
about what works well and,
more importantly,
what we might want to change.</i>]
<p>
In the discussions I’ve had with contributors recently,
the most important takeaway for me was that there needs to be
more clarity about the overall process for making decisions
and also more transparency, to make it easier to observe and follow along.
The introduction of the proposal process was an important step
toward those concerns, but more is needed.
<a class=anchor href="#documentation"><h2 id="documentation">Documentation</h2></a>
<p>
One of the explicit goals we listed in 2015
was to make the process clear to new contributors,
but many people are unaware that the proposal process is documented at all.
It is linked on the <a href="https://golang.org/doc/contribute.html">contributor guidelines</a>,
but many people who participate in the process
do not send code changes and thus have no reason to read that doc.
One idea raised was to have gopherbot comment with a link whenever the
<a href="https://github.com/golang/go/labels/Proposal">proposal label</a> is added to a GitHub issue.
<p>
The <a href="https://golang.org/s/proposal">proposal process description</a> is also missing some details.
It links to a few important talks from 2015
but does not explicitly say what the relevant
material is; not many people are going to watch
the entire talk to find out.
And in some discussions people
who have read the proposal process doc
have told us they didn’t even know the talks existed,
despite being linked in the doc.
Obviously those links are not working.
<p>
Also, we introduced a longer process,
with two rounds of review,
for Go 2 language changes,
but it is only <a href="https://blog.golang.org/go2-here-we-come">documented in a blog post</a>.
That blog post is linked from the proposal process doc,
but having everything on one page would reach more readers.
<p>
I filed <a href="https://golang.org/issue/33524">issue 33524</a>
to make sure we update the README to stand alone.
<a class=anchor href="#status"><h2 id="status">Issue Status</h2></a>
<p>
Once you know about the proposal process,
there is still not much clarity about the state of any particular proposal.
The only labels we have are Proposal and Proposal-Accepted.
(Proposals that are closed without the Proposal-Accepted label
are the ones that have been declined.)
We’ve already added
<a href="https://blog.golang.org/go2-here-we-come">two rounds of review for Go 2 changes</a>,
and we may want to formalize the idea of design drafts
for large changes, prior to the usual proposal process
(a topic for a future post).
As the process gets more formalized, it would help if there was a
clear answer to “where is this specific issue in the process?”
Other bug trackers allow defining custom metadata fields
that we could use to store this information.
GitHub being GitHub, there is no direct support for any of this.
<p>
A possible solution to both of these issues—not knowing about
the process and not knowing where the issue is in the process—would
be for gopherbot to insert and maintain a block of text at the top of the
issue description with a link to the process,
information about the state of the issue,
and any other information we might identify
that would help people arriving at the issue learn
where things stand.
I filed <a href="https://golang.org/issue/33522">issue 33522</a> for that idea.
<a class=anchor href="#minutes"><h2 id="minutes">Review Minutes</h2></a>
<p>
One of the explicit goals of the process
was to “commit to timely evaluations of proposals”
(quoting <a href="https://youtu.be/0ht89TxZZnk?t=1737">Andrew Gerrand’s talk</a>).
Initially, proposals piled up without timely evaluation.
In late 2015, Andrew started a regular meeting to help make sure
that core Go team members delivered on our promise
of timely evaluation.
The primary activity of this meeting is to make sure
proposals are moving along in the process, receiving feedback,
and not being forgotten.
<p>
We <a href="https://go.googlesource.com/proposal/+/c69968cf9f3547f276d07a78421bf153936238b2/README.md#proposal-review">documented the meetings explicitly</a> in 2018:<blockquote>
<p>
<b>Proposal Review</b>
<p>
A group of Go team members holds “proposal review meetings”
approximately weekly to review pending proposals.
<p>
The principal goal of the review meeting is to make sure that proposals
are receiving attention from the right people,
by cc’ing relevant developers, raising important questions,
pinging lapsed discussions, and generally trying to guide discussion
toward agreement about the outcome.
The discussion itself is expected to happen on the issue tracker,
so that anyone can take part.
<p>
The proposal review meetings also identify issues where
consensus has been reached and the process can be
advanced to the next step (by marking the proposal accepted
or declined or by asking for a design doc).</blockquote>
<p>
Even so, it has come up a few times in online discussions
and also at contributor day that people don’t understand
who the proposal review group is or what it does.
More transparency here would help as well.
<p>
At about the same time as we documented the meetings
(at least as best I remember),
I created a “team” named @golang/proposal-review
on GitHub to try to make it clear who was in the meetings.
Unfortunately, I didn’t know at the time that GitHub never
allows non-members of an organization to view team membership lists,
even when the group is “public.”
So while nearly all Go project contributors can see the list,
everyone else cannot.
For the record, today that group is
Andy Bonventre, Brad Fitzpatrick, Robert Griesemer,
Ian Lance Taylor, Rob Pike, Steve Francia, and me,
although not everyone attends every meeting.
<p>
One suggestion made at contributor day
was to publish minutes of the review meetings.
All the actions we take are visible on the GitHub issues themselves,
but there was no easy way to aggregate them
and see the review work as a whole.
<p>
As of yesterday’s meeting,
we have started collecting minutes in
<a href="https://golang.org/issue/33502">issue 33502</a>.
<p>
Keeping the minutes in an issue enables easy cross-referencing
in both directions between the issues and the minutes;
seeing the links to the minutes appear in the proposal issues
should help with understanding the process.
The minutes also record who participated in each meeting,
so that membership is clear even without visibility of
the GitHub team, which we could probably now delete.
<p>
Writing those minutes yesterday forced us to be a bit
more careful about explaining the reasons why we did
things, which should be helpful in making the process
clearer and will likely result in clarifications in the
proposal process document itself, once we have a few
more meetings with minutes under our belts.
<a class=anchor href="#decisions"><h2 id="decisions">Decisions</h2></a>
<p>
The goal of the proposal discussions is to reach clear agreement
on whether to accept or decline a proposal.
What happens when there is not clear agreement?
<p>
The <a href="https://go.googlesource.com/proposal/+/01029b917fbbfcf1cbf53df42c8c3f48da9ffe7d/README.md#process">original proposal process document (2015)</a> said:<blockquote>
<p>
In Go development historically, a lack of agreement means decline.
If there is disagreement about whether there is agreement,
adg@ is the arbiter.</blockquote>
<p>
In 2016, we realized that the Go user community was large enough
that most important decisions would not reach complete agreement.
After discussion on <a href="https://golang.org/issue/17129">issue 17129</a>,
we <a href="https://go.googlesource.com/proposal/+/8300d7937d4bd7401aec456a354580421d5d3e98%5E%21/#F0">updated the doc</a> to explain what happens in that case.
At the same time, adg@ moved on to <a href="https://upspin.io">other work</a> and <a href="https://go.googlesource.com/proposal/+/dbc2ccebb4cfa2e6b566645aaea14389e1a351d7%5E%21/#F0">I took on the arbiter role</a>. The document now said:<blockquote>
<p>
The goal of the final discussion is to reach agreement on the next step: (1) accept or (2) decline.
The discussion is expected to be resolved in a timely manner.
If clear agreement cannot be reached, the arbiter (rsc@) reviews the discussion
and makes the decision to accept or decline.</blockquote>
<p>
This version made it seem like, in the absence of clear agreement,
the arbiter could make up any answer,
which of course is not the case.
<p>
In the 2018 revisions that documented the review meetings, I <a href="https://go.googlesource.com/proposal/+/c69968cf9f3547f276d07a78421bf153936238b2/README.md#consensus-and-disagreement">tried to add more detail about that process</a>,
but in what turned out to be a differently misleading way,
unfortunately conflating proposal review with arbitration of proposal decisions:<blockquote>
<p>
<b>Consensus and Disagreement</b>
<p>
The goal of the proposal process is to reach general consensus about the outcome
in a timely manner.
<p>
If general consensus cannot be reached,
the proposal review group decides the next step
by reviewing and discussing the issue and
reaching a consensus among themselves.
If even consensus among the proposal review group
cannot be reached (which would be exceedingly unusual),
the arbiter (<a href="mailto:rsc@golang.org">rsc@</a>)
reviews the discussion and
decides the next step.</blockquote>
<p>
In retrospect, this conflation of proposal review and
contended proposal decisions has been an unhelpful source
of confusion about the weekly proposal review meetings,
which are nearly entirely concerned with the review/triage/gardening
described in the previous section.
In contrast, the handling of contended decisions happens very rarely,
maybe once a year.
These two activities—review and deciding contended proposals—have historically
been done by the same people, but that is not a fundamental requirement.
It probably makes sense to separate these two activities
so that they can be handled by different groups.
I filed <a href="https://golang.org/issue/33528">issue 33528</a> for this.
<a class=anchor href="#next"><h2 id="next">Next</h2></a>
<p>
Again, this is the second post <a href="proposals">in a series of posts</a>
thinking and brainstorming about the Go proposal process.
Everything about these posts is very rough.
The point of posting this series—thinking out loud
instead of thinking quietly—is so that
anyone who is interested can join the thinking.
<p>
I encourage feedback, whether in the form of comments on these posts,
comments on the newly filed issues,
mail to rsc@golang.org,
or your own blog posts (please leave links in the comments).
Thanks for taking the time to read these and think with me.
<p>
The <a href="proposals-large">next post</a> is about how the proposal process
should scale down to tiny changes and up to large ones.
Thinking about the Go Proposal Process
tag:research.swtch.com,2012:research.swtch.com/proposals-intro
2019-08-05T12:02:00-04:00
2019-08-05T12:04:00-04:00
What works about the Go proposal process? What doesn’t? (Go Proposals, Part 1)
<p>
I’ve been thinking a lot recently about the
<a href="https://golang.org/s/proposal">Go proposal process</a>,
which is the way we propose, discuss, and decide
changes to Go itself.
Like <a href="https://blog.golang.org/experiment">nearly everything about Go</a>,
the proposal process is an experiment,
so it makes sense to reflect on what we’ve
learned and try to improve it.
This post is the first in <a href="https://research.swtch.com/proposals">a series of posts</a>
about what works well and,
more importantly,
what we might want to change.
<p>
This post is focused on where we are now and how we got here.
Later posts will focus on specific areas where we might improve.
At our annual contributor summit held at Gophercon last week,
about thirty or so contributors from outside Google attended
and helped us on the Go team think through many of these areas.
Their suggestions figure prominently in the posts to follow,
as, I hope, will yours.
<p>
(If you are curious what the contributor summit is,
here’s
<a href="https://blog.golang.org/contributors-summit">Sam Whited’s recap of our first such summit in 2017</a>.
This year’s was pretty similar in spirit, although with different discussion topics.
This year there were about 60 people in the room,
roughly evenly split between members of the Go team at Google
and contributors from outside Google.)
<a class=anchor href="#proposal_process"><h2 id="proposal_process">Proposal Process</h2></a>
<p>
When we started Go, we launched with instructions on day one
detailing how to send us code changes,
which are of course the core of any open source project.
Around five years ago we noticed that despite the many
code contributions,
most change proposals
were made by the Go team at Google,
even when motivated and driven by feedback from the broader Go user community.
One reason, we realized, was that the process for proposing changes
was nearly completely undocumented.
To try to address this, we introduced a formal change proposal process in 2015,
now documented at <a href="https://golang.org/s/proposal">golang.org/s/proposal</a>.
For more background,
see <a href="https://youtu.be/0ht89TxZZnk?t=1637">Andrew Gerrand’s 2015 GopherCon talk, starting at 27m17s</a>
(only a few minutes).
In that talk, Andrew said, “It’s important to note that this process is an experiment.
We’re still kind of discussing exactly how that process should work.”
<p>
I talked at GopherCon this year about the
<a href="https://blog.golang.org/experiment">experiment, simplify, ship</a> cycle we use for just about everything.
Like with other parts of Go,
we’ve learned from our experiments using the proposal process
and made adjustments in the past,
and of course we intend to keep doing that.
<p>
The current proposal process is documented as four steps:
<ol>
<li>
The author creates a brief issue clearly describing the proposal. (No need for a design document just yet.)
<li>
Discussion on the GitHub issue aims to triage the proposal into one of three buckets: accept; decline; or ask for a detailed design doc addressing an identified list of concerns.
<li>
If the previous step ended at accept/decline, we’re done. Otherwise, the author writes a design doc, which is discussed on the GitHub issue.
<li>
Once comments and design doc revisions wind down, a final discussion aims to reach a final accept or decline decision.</ol>
<p>
See the <a href="https://golang.org/s/proposal">full document</a> for details.
<a class=anchor href="#random_sample"><h2 id="random_sample">A Random Sample</h2></a>
<p>
As I write this, there are 1633 GitHub issues labeled Proposal.
Of the 1187 that have been closed, 170 were accepted, about 14%.
<p>
To get a sense of the submissions, discussions, and outcomes,
here are twenty selected at random (by a Perl script).
<ul>
<li>
<a href="https://golang.org/issue/11502">#11502 A security response policy for Go</a> (39 comments, accepted)
<li>
<a href="https://golang.org/issue/14991">#14991 add a builtin splice function for easier slice handling</a> (7 comments)
<li>
<a href="https://golang.org/issue/16844">#16844 freeze net/rpc</a> (20 comments, accepted)
<li>
<a href="https://golang.org/issue/17672">#17672 remote runtime</a> (9 comments)
<li>
<a href="https://golang.org/issue/18303">#18303 flag.failf should return an error with Cause() error</a> (15 comments)
<li>
<a href="https://golang.org/issue/18662">#18662 Struct field tags for directional Marshal/Unmarshal</a> (6 comments)
<li>
<a href="https://golang.org/issue/21360">#21360 add a build tag "test"</a> (14 comments)
<li>
<a href="https://golang.org/issue/21592">#21592 add an in-memory writer/seeker to io package</a> (9 comments)
<li>
<a href="https://golang.org/issue/22247">#22247 Add sync.Map.Len() method</a> (5 comments)
<li>
<a href="https://golang.org/issue/22918">#22918 go/doc: consts/vars should be grouped with types by their computed type</a> (31 comments)
<li>
<a href="https://golang.org/issue/23331">#23331 encoding/json: export the offset method of the Decoder</a> (7 comments)
<li>
<a href="https://golang.org/issue/23789">#23789 add uuid generator to stdlib</a> (13 comments)
<li>
<a href="https://golang.org/issue/24410">#24410 Add some way to explore packages and structs inside.</a> (6 comments)
<li>
<a href="https://golang.org/issue/25273">#25273 add token to syntax to reduce error boilerplate</a> (2 comments)
<li>
<a href="https://golang.org/issue/25518">#25518 x/vgo: allow aliases in go.mod</a> (10 comments)
<li>
<a href="https://golang.org/issue/25670">#25670 don’t include cgo with net on Unix by default</a> (7 comments)
<li>
<a href="https://golang.org/issue/26803">#26803 mime/multipart: add (*Reader).NextRawPart to avoid quoted-printable decoding</a> (9 comments, accepted)
<li>
<a href="https://golang.org/issue/26822">#26822 flag: clean up error message</a> (20 comments, accepted)
<li>
<a href="https://golang.org/issue/30886">#30886 cmd/go: allow replacing a subdirectory within a package</a> (4 comments)
<li>
<a href="https://golang.org/issue/31041">#31041 package and file organisation</a> (7 comments)</ul>
<p>
One is significant.
Most are small.
A few are not well-defined.
Most had fewer than ten comments.
This is typical.
<a class=anchor href="#process_evolution"><h2 id="process_evolution">Process Evolution</h2></a>
<p>
For many people, the proposal process represents not those small changes and suggestions
in the random sample but instead larger proposed changes, like type aliases (2016),
monotonic time (2017), Go modules (2018), new number literals (2019),
and the abandoned “try” proposal (2019).
<p>
These large changes are of course important,
and in each of these we’ve learned a bit more
about what works and what doesn’t for making
successful changes.
<p>
The discussion of the original aliases proposal
helped us understand the importance of
motivating changes, like my
<a href="https://talks.golang.org/2016/refactor.article">codebase refactoring talk and article</a>
motivated type aliases.
During that experience,
someone introduced me to
Rust’s “<a href="http://aturon.github.io/tech/2018/05/25/listening-part-1/">no new rationale</a>” rule,
which we have tried to follow when making difficult decisions since then.
I reflected more about motivation for changes,
using both aliases and monotonic time as examples,
in <a href="https://blog.golang.org/toward-go2">my GopherCon 2017 talk kicking off Go 2</a>.
<p>
Although there were parts of the Go modules proposal
that did not go well,
the general approach of spending significant time
discussing the ideas before starting the formal proposal process
did seem to help:
at the time that we accepted the modules proposal,
the <a href="https://github.com/golang/go/issues/24301">GitHub conversation and reactions</a>
were overwhelmingly in favor.
<p>
One thing we learned from aliases and then from modules
was the importance of having an implementation people can try
and also the importance of having the changes ready
to be committed at the start of a development cycle.
We <a href="https://blog.golang.org/go2-here-we-come">adopted this idea explicitly</a>
for Go 2 language changes,
and it was successful for the smaller Go 1.13 changes
like the <a href="https://tip.golang.org/doc/go1.13#language">new number literal syntax</a>.
<p>
For the <a href="https://golang.org/issue/32437">recent “try” proposal</a>,
we followed the evolving process,
including making changes available for people to use early in a cycle,
and the discussion and reactions were much more heated than we expected.
We abandoned the proposal only a week before the summit,
and one thing I was eager to discuss with contributors was
what had been different about “try” and how to continue to
improve the process to make future changes smoother.
(If the discussion about “try” was difficult,
what will happen when we discuss <a href="https://blog.golang.org/why-generics">generics?</a>)
<a class=anchor href="#improvement_areas"><h2 id="improvement_areas">Improvement Areas</h2></a>
<p>
The discussions we had over six hours or so at the contributor summit
surfaced at least six different areas where we might improve
the proposal process specifically and the Go project’s community engagement
more generally.
I plan to write a post about each of these themes,
but I’ll summarize them briefly here too.
I’d be happy to hear suggestions for any other important areas
that I’ve missed.
<p>
<b>Clarity & Transparency</b>.
Adding clarity and transparency about how we make changes to Go—making that process easier
to follow and to participate in—was the original motivation for creating the proposal process.
There’s more we could do, including publishing a record of
proposal decisions.
(Most of the proposal review group’s time is spent not on decisions
but on adding people to issues, pinging requests for more information, and so on.)
<i>Update</i>: See the “<a href="proposals-clarity">Clarity & Transparency</a>” post for more thoughts.
<p>
<b>Scaling the Process</b>.
The proposal process is meant to be lightweight enough to apply to very small changes,
such as the recently accepted
<a href="https://golang.org/issue/32420">proposal to add a <code>SubexpIndex</code> method to <code>regexp.Regexp</code></a>.
As the proposed change gets bigger,
it may make sense to introduce additional process.
For example, we were careful at Gophercon 2018
to publish our thoughts about
error handling and generics as
“design drafts” not proposals.
For a large enough change, perhaps publishing and iterating on
design drafts should formally be the first step of the process.
<i>Update</i>: See the “<a href="proposals-size">Sizing Changes</a>” post for more thoughts.
<p>
<b>Scaling Discussions</b>.
GitHub’s issue tracker is not particularly effective at large discussions.
For large changes, we may want to investigate alternatives,
and we certainly want to make sure to have more discussion
long before it is time to make any decisions.
For example, publishing multiple design drafts,
giving talks, and publishing articles are all ways to
engage helpful discussion before reaching the point in
the proposal process where decisions are being made.
<p>
<b>Prototypes & Experiments</b>.
For most non-trivial changes it is helpful to understand
them by trying them out before making a decision.
We do this as a matter of course for small changes:
we always have at least a few months between when
a change is made and the corresponding release,
during which we can reconsider, adjust, or remove it.
We arrange to
<a href="https://blog.golang.org/go2-here-we-come">land language changes on day 1</a>
of a development cycle to maximize that window.
But for large changes we probably need a way to make
prototypes available separately, to give even more time,
a bit like the <code>vgo</code> prototype for Go modules.
<p>
<b>Community Representation</b>.
Andrew said in 2015 that he hoped
the proposal process would
“make the process more accessible to anybody
who really wants to get involved in the design of Go.”
We definitely get many more proposals from
outside the Go team now than we did in 2015,
so in that sense it has succeeded.
On the other hand, we believe there are
<a href="https://research.swtch.com/gophercount">over a million Go programmers</a>,
but only 2300 different GitHub accounts have commented on
proposal issues, or a quarter of one percent of users.
If this were a random sample of our users, that might be fine,
but we know the participation is skewed to
English-speaking GitHub users
who can take the time to keep up with the Go issue tracker.
To make the best possible decisions we need to
gather input from more sources,
from a broader cross-section of the population of the Go community,
by which I mean all Go users.
(On a related note, anyone who describes “the Go community”
as having a clear opinion about anything must have in mind a
much narrower definition of that group:
a million or more people can’t be painted with a single brush.)
<p>
<b>Community Coordination</b>.
We have had mixed results attempting to engage
the broader Go community in the work of developing Go.
The clearest success is the technical development of
the Go source code itself.
Today, I count exactly 2,000 email addresses
in the Go CONTRIBUTORS file,
and only 310 from google.com or golang.org.
The next biggest success is probably the proposal process itself:
I estimate that the Go team accounts for about 15% of proposals overall
and about 30% of accepted proposals.
We also created a few working groups, most notably the
<a href="https://groups.google.com/d/msg/go-package-management/P8TehVoFLjg/Ni6VRyOjEAAJ">package management committee</a> in 2016
and the
<a href="https://blog.golang.org/developer-experience">developer experience</a> and
<a href="https://blog.golang.org/community-outreach-working-group">community outreach</a>
working groups in 2017.
Each one had aspects that worked well and aspects that didn’t.
More recently, the
<a href="https://golang.org/wiki/golang-tools">golang-tools</a> group,
started in 2018, is coming up on its first birthday
and seems to be operating well.
We should try to learn from the successful and unsuccessful
aspects of all these groups and try to
create new, successful, sustainable groups.
<a class=anchor href="#next"><h2 id="next">Next</h2></a>
<p>
I plan to post about a new theme every day or two,
starting with the ones in the previous section,
until I run out of interesting thoughts.
<p>
Please remember as you read these posts that the goal here is
thinking, brainstorming, looking for good ideas.
There are almost certainly bad ideas in these posts too.
Don’t assume that everything I mention will happen,
especially not in the exact form described.
Everything about these posts is very rough.
The point of posting this series—thinking out loud
instead of thinking quietly—is so that
anyone who is interested can join the thinking.
<p>
I encourage feedback, whether in the form of comments on the posts,
mail to rsc@golang.org,
or your own blog posts (please leave links in the comments).
Thanks for taking the time to read these and think with me.
Go Proposals
tag:research.swtch.com,2012:research.swtch.com/proposals
2019-08-05T12:01:00-04:00
2019-08-05T12:03:00-04:00
Topic Index
<p>
These are the posts in the “Go Proposals” series that began in August 2019:
<ul>
<li>
“<a href="proposals-intro">Thinking about the Go Proposal Process</a>” [<a href="proposals-intro.pdf">PDF</a>].
<li>
“<a href="proposals-clarity">Go Proposal Process: Clarity & Transparency</a>” [<a href="proposals-clarity.pdf">PDF</a>].
<li>
“<a href="proposals-large">Large Changes</a>” [<a href="proposals-large.pdf">PDF</a>].
<li>
“<a href="proposals-discuss">Scaling Discussions</a>” [<a href="proposals-discuss.pdf">PDF</a>].
<li>
“<a href="proposals-experiment">Enabling Experiments</a>” [<a href="proposals-experiment.pdf">PDF</a>].
<li>
“<a href="proposals-representation">Representation</a>” [<a href="proposals-representation.pdf">PDF</a>].</ul>
<p>
These are Go issues filed to track (or in one case implement) ideas discussed in the series:
<ul>
<li>
<a href="https://golang.org/issue/33502">#33502 <b>proposal: meeting minutes</b></a> <br>
This issue collects the minutes of each proposal review meeting.
Star the issue to follow along with what happens each week.
<li>
<a href="https://golang.org/issue/33522">#33522 <b>proposal: gopherbot: add proposal process status box to each proposal issue</b></a> <br>
Discussed in “<a href="proposals-clarity">Clarity & Transparency</a>.”
<li>
<a href="https://golang.org/issue/33524">#33524 <b>proposal: update proposal/README.md to stand alone</b></a> <br>
Discussed in “<a href="proposals-clarity">Clarity & Transparency</a>.”
<li>
<a href="https://golang.org/issue/33528">#33528 <b>proposal: separate proposal review from contended decision-making</b></a> <br>
Discussed in “<a href="proposals-clarity">Clarity & Transparency</a>.”
<li>
<a href="https://golang.org/issue/33670">#33670 <b>proposal: identify large changes & add more process</b></a> <br>
Discussed in “<a href="proposals-large">Large Changes</a>.”
<li>
<a href="https://golang.org/issue/33791">#33791 <b>proposal: add explicit decision doc for large changes</b></a> <br>
Discussed in “<a href="proposals-discuss">Scaling Discussions</a>.”</ul>
Transparent Logs for Skeptical Clients
tag:research.swtch.com,2012:research.swtch.com/tlog
2019-03-01T11:00:00-05:00
2019-03-01T11:02:00-05:00
How an untrusted server can publish a verifiably append-only log.
<p>
Suppose we want to maintain and publish a public, append-only log of data.
Suppose also that clients are skeptical about our correct implementation
and operation of the log:
it might be to our advantage to leave things out of the log,
or to enter something in the log today and then remove it tomorrow.
How can we convince the client we are behaving?
<p>
This post is about an elegant data structure we can use to publish
a log of <i>N</i> records with these three properties:
<ol>
<li>
For any specific record <i>R</i> in a log of length <i>N</i>,
we can construct a proof of length
<i>O</i>(lg <i>N</i>) allowing the client to verify that <i>R</i> is in the log.
<li>
For any earlier log observed and remembered by the client,
we can construct a proof of length
<i>O</i>(lg <i>N</i>) allowing the client to verify that the earlier log
is a prefix of the current log.
<li>
An auditor can efficiently iterate over the records in the log.</ol>
<p>
(In this post, “lg <i>N</i>” denotes the base-2 logarithm of <i>N</i>,
reserving the word “log” to mean only “a sequence of records.”)
<p>
The
<a href="https://www.certificate-transparency.org/">Certificate Transparency</a>
project publishes TLS certificates in this kind of log.
Google Chrome uses property (1) to verify that
an <a href="https://en.wikipedia.org/wiki/Extended_Validation_Certificate">extended validation certificate</a>
is recorded in a known log before accepting the certificate.
Property (2) ensures that an accepted certificate cannot later disappear from the log undetected.
Property (3) allows an auditor to scan the entire certificate log
at any later time to detect misissued or stolen certificates.
All this happens without blindly trusting that
the log itself is operating correctly.
Instead, the clients of the log—Chrome and any auditors—verify
correct operation of the log as part of accessing it.
<p>
This post explains the design and implementation
of this verifiably tamper-evident log,
also called a <i>transparent log</i>.
To start, we need some cryptographic building blocks.
<a class=anchor href="#cryptographic_hashes_authentication_and_commitments"><h2 id="cryptographic_hashes_authentication_and_commitments">Cryptographic Hashes, Authentication, and Commitments</h2></a>
<p>
A <i>cryptographic hash function</i> is a deterministic
function H that maps an arbitrary-size message <i>M</i>
to a small fixed-size output H(<i>M</i>),
with the property that it is infeasible in practice to produce
any pair of distinct messages <i>M<sub>1</sub></i> ≠ <i>M<sub>2</sub></i> with
identical hashes H(<i>M<sub>1</sub></i>) = H(<i>M<sub>2</sub></i>).
Of course, what is feasible in practice changes.
In 1995, SHA-1 was a reasonable cryptographic hash function.
In 2017, SHA-1 became a <i>broken</i> cryptographic hash function,
when researchers identified and demonstrated
a <a href="https://shattered.io/">practical way to generate colliding messages</a>.
Today, SHA-256 is believed to be a reasonable cryptographic hash function.
Eventually it too will be broken.
<p>
A (non-broken) cryptographic hash function provides
a way to bootstrap a small amount of trusted data into
a much larger amount of data.
Suppose I want to share a very large file with you,
but I am concerned that the data may not arrive intact,
whether due to random corruption or a
<a href="https://en.wikipedia.org/wiki/Man-in-the-middle_attack">man-in-the-middle attack</a>.
I can meet you in person and hand you,
written on a piece of paper,
the SHA-256 hash of the file.
Then, no matter what unreliable path the bits take,
you can check whether you got the right ones by
recomputing the SHA-256 hash of the download.
If it matches, then you can be certain,
assuming SHA-256 has not been broken,
that you downloaded the exact bits I intended.
The SHA-256 hash <i>authenticates</i>—that is,
it proves the authenticity of—the downloaded bits,
even though it is only 256 bits and the download
is far larger.
<p>
We can also turn the scenario around,
so that, instead of distrusting the network,
you distrust me.
If I tell you the SHA-256 of a file I promise to send,
the SHA-256 serves as a verifiable <i>commitment</i> to a particular sequence of bits.
I cannot later send a different bit sequence and convince you
it is the file I promised.
<p>
A single hash can be an authentication or commitment
of an arbitrarily large amount of data,
but verification then requires hashing the entire data set.
To allow selective verification of subsets of the data,
we can use not just a single hash
but instead a balanced binary tree of hashes,
known as a Merkle tree.
<a class=anchor href="#merkle_trees"><h2 id="merkle_trees">Merkle Trees</h2></a>
<p>
A Merkle tree is constructed from <i>N</i> records,
where <i>N</i> is a power of two.
First, each record is hashed independently, producing <i>N</i> hashes.
Then pairs of hashes are themselves hashed,
producing <i>N</i>/2 new hashes.
Then pairs of those hashes are hashed,
to produce <i>N</i>/4 hashes,
and so on, until a single hash remains.
This diagram shows the Merkle tree of size <i>N</i> = 16:
<p>
<img name="tlog-16" class="center pad" width=518 height=193 src="tlog-16.png" srcset="tlog-16.png 1x, tlog-16@1.5x.png 1.5x, tlog-16@2x.png 2x, tlog-16@3x.png 3x, tlog-16@4x.png 4x">
<p>
The boxes across the bottom represent the 16 records.
Each number in the tree denotes a single hash,
with inputs connected by downward lines.
We can refer to any hash by its coordinates:
level <i>L</i> hash number <i>K</i>, which we will abbreviate h(<i>L</i>, <i>K</i>).
At level 0, each hash’s input is a single record;
at higher levels, each hash’s input is a pair of hashes from the level below.<blockquote>
<p>
h(0, <i>K</i>) = H(record <i>K</i>)<br>
h(<i>L</i>+1, <i>K</i>) = H(h(<i>L</i>, 2 <i>K</i>), h(<i>L</i>, 2 <i>K</i>+1))</blockquote>
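<p>
These two equations translate directly into Go. The sketch below, for a power-of-two number of records, omits the leaf/interior domain-separation prefixes that a production log (such as RFC 6962's) adds before hashing:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// hashRecord computes h(0, K) = H(record K).
func hashRecord(record []byte) [32]byte {
	return sha256.Sum256(record)
}

// hashPair computes h(L+1, K) = H(h(L, 2K), h(L, 2K+1)).
func hashPair(left, right [32]byte) [32]byte {
	return sha256.Sum256(append(left[:], right[:]...))
}

// treeHash computes the top-level Merkle tree hash of the records,
// whose count must be a power of two.
func treeHash(records [][]byte) [32]byte {
	level := make([][32]byte, len(records))
	for i, r := range records {
		level[i] = hashRecord(r)
	}
	for len(level) > 1 {
		next := make([][32]byte, len(level)/2)
		for i := range next {
			next[i] = hashPair(level[2*i], level[2*i+1])
		}
		level = next
	}
	return level[0]
}

func main() {
	records := make([][]byte, 16)
	for i := range records {
		records[i] = []byte(fmt.Sprintf("record %d", i))
	}
	t := treeHash(records)
	fmt.Printf("T = h(4, 0) = %x\n", t[:8])
}
```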
<p>
To prove that a particular record is contained in the tree
represented by a given top-level hash
(that is, to allow the client to authenticate a record, or verify a prior
commitment, or both),
it suffices to provide the hashes needed to recompute
the overall top-level hash from the record’s hash.
For example, suppose we want to prove that a certain bit string <i>B</i>
is in fact record 9 in a tree of 16 records with top-level hash <i>T</i>.
We can provide those bits along with the other hash inputs
needed to reconstruct the overall tree hash using those bits.
Specifically, the client can derive as well as we can that:<blockquote>
<p>
T = h(4, 0)<br>
= H(h(3, 0), h(3, 1))<br>
= H(h(3, 0), H(h(2, 2), h(2, 3)))<br>
= H(h(3, 0), H(H(h(1, 4), h(1, 5)), h(2, 3)))<br>
= H(h(3, 0), H(H(H(h(0, 8), h(0, 9)), h(1, 5)), h(2, 3)))<br>
= H(h(3, 0), H(H(H(h(0, 8), H(record 9)), h(1, 5)), h(2, 3)))<br>
= H(h(3, 0), H(H(H(h(0, 8), H(<i>B</i>)), h(1, 5)), h(2, 3)))</blockquote>
<p>
If we give the client the values [h(3, 0), h(0, 8), h(1, 5), h(2, 3)],
the client can calculate H(<i>B</i>) and then combine all those hashes
using the formula and check whether the result matches <i>T</i>.
If so, the client can be cryptographically certain
that <i>B</i> really is record 9 in the tree with top-level hash <i>T</i>.
In effect, proving that <i>B</i> is a record in the Merkle tree with hash <i>T</i>
is done by giving a verifiable computation of <i>T</i> with H(<i>B</i>) as an input.
<p>
Graphically, the proof consists of the sibling hashes (circled in blue)
of nodes along the path (highlighted in yellow)
from the record being proved up to the tree root.
<p>
<img name="tlog-r9-16" class="center pad" width=518 height=202 src="tlog-r9-16.png" srcset="tlog-r9-16.png 1x, tlog-r9-16@1.5x.png 1.5x, tlog-r9-16@2x.png 2x, tlog-r9-16@3x.png 3x, tlog-r9-16@4x.png 4x">
<p>
In general, the proof that a given record is contained
in the tree requires lg <i>N</i> hashes, one for each level
below the root: one sibling hash per level.
<p>
Building our log as a sequence of records
hashed in a Merkle tree would give us a way to write
an efficient (lg <i>N</i>-length) proof that a particular record
is in the log.
But there are two related problems to solve:
our log needs to be defined for any length <i>N</i>,
not just powers of two,
and we need to be able to write
an efficient proof that one log is a prefix of another.
<a class=anchor href="#merkle_tree-structured_log"><h2 id="merkle_tree-structured_log">A Merkle Tree-Structured Log</h2></a>
<p>
To generalize the Merkle tree
to non-power-of-two sizes, we can
write <i>N</i> as a sum of decreasing powers of two,
then build complete Merkle trees of those sizes
for successive sections of the input,
and finally hash the at-most-lg <i>N</i> complete trees together
to produce a single top-level hash.
For example, 13 = 8 + 4 + 1:
<p>
<img name="tlog-13" class="center pad" width=434 height=193 src="tlog-13.png" srcset="tlog-13.png 1x, tlog-13@1.5x.png 1.5x, tlog-13@2x.png 2x, tlog-13@3x.png 3x, tlog-13@4x.png 4x">
<p>
The new hashes marked “x” combine the complete trees,
building up from right to left, to produce the overall tree hash.
Note that these hashes necessarily combine trees of
different sizes and therefore hashes
from different levels;
for example, h(3, x) = H(h(2, 2), h(0, 12)).
<p>
The proof strategy for complete Merkle trees applies
equally well to these incomplete trees.
For example, the proof that
record 9 is in the tree of size 13
is [h(3, 0), h(0, 8), h(1, 5), h(0, 12)]:
<p>
<img name="tlog-r9-13" class="center pad" width=437 height=202 src="tlog-r9-13.png" srcset="tlog-r9-13.png 1x, tlog-r9-13@1.5x.png 1.5x, tlog-r9-13@2x.png 2x, tlog-r9-13@3x.png 3x, tlog-r9-13@4x.png 4x">
<p>
Note that h(0, 12) is included in the proof because
it is the sibling of h(2, 2) in the computation of h(3, x).
<p>
We still need to be able to write an efficient proof
that the log of size <i>N</i> with tree hash <i>T</i>
is a prefix of the log of size <i>N</i>′ (> <i>N</i>) with tree hash <i>T</i>′.
Earlier, proving that <i>B</i> is a record in the Merkle tree with hash <i>T</i>
was done by giving a verifiable computation of <i>T</i> using H(<i>B</i>) as an input.
To prove that the log with tree hash <i>T</i>
is included in the log with tree hash <i>T</i>′,
we can follow the same idea:
give verifiable computations of <i>T</i> and <i>T</i>′,
in which all the inputs to the computation of <i>T</i>
are also inputs to the computation of <i>T</i>′.
For example, consider the trees of size 7 and 13:
<p>
<img name="tlog-o7-13" class="center pad" width=437 height=193 src="tlog-o7-13.png" srcset="tlog-o7-13.png 1x, tlog-o7-13@1.5x.png 1.5x, tlog-o7-13@2x.png 2x, tlog-o7-13@3x.png 3x, tlog-o7-13@4x.png 4x">
<p>
In the diagram, the “x” nodes complete the tree of size 13 with hash <i>T</i><sub>13</sub>,
while the “y” nodes complete the tree of size 7 with hash <i>T</i><sub>7</sub>.
To prove that <i>T</i><sub>7</sub>’s leaves are included in <i>T</i><sub>13</sub>,
we first give the computation of <i>T</i><sub>7</sub> in terms of complete subtrees
(circled in blue):<blockquote>
<p>
<i>T</i><sub>7</sub> = H(h(2, 0), H(h(1, 2), h(0, 6)))</blockquote>
<p>
Then we give the computation of <i>T</i><sub>13</sub>,
expanding hashes as needed to expose
the same subtrees.
Doing so exposes sibling subtrees (circled in red):<blockquote>
<p>
<i>T</i><sub>13</sub> = H(h(3, 0), H(h(2, 2), h(0, 12)))<br>
= H(H(h(2, 0), h(2, 1)), H(h(2, 2), h(0, 12)))<br>
= H(H(h(2, 0), H(h(1, 2), h(1, 3))), H(h(2, 2), h(0, 12)))<br>
= H(H(h(2, 0), H(h(1, 2), H(h(0, 6), h(0, 7)))), H(h(2, 2), h(0, 12)))</blockquote>
<p>
Assuming the client knows the trees have sizes 7 and 13,
it can derive the required decomposition itself.
We need only supply the hashes [h(2, 0), h(1, 2), h(0, 6), h(0, 7), h(2, 2), h(0, 12)].
The client recalculates the <i>T</i><sub>7</sub> and <i>T</i><sub>13</sub>
implied by the hashes and checks that they match the originals.
<p>
Note that these proofs only use hashes for completed subtrees—that is, numbered hashes,
never the “x” or “y” hashes that combine differently-sized subtrees.
The numbered hashes are <i>permanent</i>,
in the sense that once such a hash appears in a
tree of a given size, that same hash will appear in all
trees of larger sizes.
In contrast, the “x” and “y” hashes are <i>ephemeral</i>—computed
for a single tree and never seen again.
The hashes common to the decomposition of two different-sized trees
therefore must always be permanent hashes.
The decomposition of the larger tree could make use of ephemeral
hashes for the exposed siblings,
but we can easily use only
permanent hashes instead.
In the example above,
the reconstruction of <i>T</i><sub>13</sub>
from the parts of <i>T</i><sub>7</sub>
uses h(2, 2) and h(0, 12)
instead of assuming access to <i>T</i><sub>13</sub>’s h(3, x).
Avoiding the ephemeral hashes extends the maximum
record proof size from lg <i>N</i> hashes to 2 lg <i>N</i> hashes
and the maximum tree proof size from 2 lg <i>N</i> hashes
to 3 lg <i>N</i> hashes.
Note that most top-level hashes,
including <i>T</i><sub>7</sub> and <i>T</i><sub>13</sub>,
are themselves ephemeral hashes,
requiring up to lg <i>N</i> permanent hashes to compute.
The exceptions are the power-of-two-sized trees
<i>T</i><sub>1</sub>, <i>T</i><sub>2</sub>, <i>T</i><sub>4</sub>, <i>T</i><sub>8</sub>, and so on.
<a class=anchor href="#storing_a_log"><h2 id="storing_a_log">Storing a Log</h2></a>
<p>
Storing the log requires only a few append-only files.
The first file holds the log record data, concatenated.
The second file is an index of the first,
holding a sequence of int64 values giving the start offset
of each record in the first file.
This index allows efficient random access to any record
by its record number.
While we could recompute any hash tree from the record data alone,
doing so would require <i>N</i>–1 hash operations
for a tree of size <i>N</i>.
Efficient generation of proofs therefore requires
precomputing and storing the hash trees
in some more accessible form.
<p>
As we noted in the previous section,
there is significant commonality between trees.
In particular, the latest hash tree includes all the
permanent hashes from all earlier hash trees,
so it is enough to store “only” the latest hash tree.
A straightforward way to do this is to
maintain lg <i>N</i> append-only files, each holding
the sequence of hashes at one level of the tree.
Because hashes are fixed size,
any particular hash can be read efficiently
by reading from the file at the appropriate offset.
<p>
To write a new log record, we must
append the record data to the data file,
append the offset of that data to the index file,
and
append the hash of the data to the level-0 hash file.
Then, if we completed a pair of hashes in the level-0 hash file,
we append the hash of the pair to the level-1 hash file;
if that completed a pair of hashes in the level-1 hash file,
we append the hash of that pair to the level-2 hash file;
and so on up the tree.
Each log record write will append a hash to at least one and
at most lg <i>N</i> hash files,
with an average of just under two new hashes per write.
(A binary tree with <i>N</i> leaves has <i>N</i>–1 interior nodes.)
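<p>
The append logic falls out of the binary representation of the record number: appending record <i>k</i> (0-indexed) always writes to the level-0 hash file, plus one more level for each pair the append completes, which is the number of trailing zero bits of <i>k</i>+1. A small Go sketch of just this bookkeeping:

```go
package main

import (
	"fmt"
	"math/bits"
)

// levelsWritten reports which hash files receive a new hash when
// record k (0-indexed) is appended: level 0 always, then one more
// level for each pair the append completes.
func levelsWritten(k int) []int {
	n := bits.TrailingZeros(uint(k + 1))
	levels := make([]int, n+1)
	for i := range levels {
		levels[i] = i
	}
	return levels
}

func main() {
	total := 0
	for k := 0; k < 16; k++ {
		w := levelsWritten(k)
		total += len(w)
		fmt.Printf("append record %2d: write levels %v\n", k, w)
	}
	// 16 leaf hashes plus 15 interior hashes.
	fmt.Println("total hashes written:", total) // 31
}
```

<p>
Record 7, for example, completes pairs at levels 0, 1, and 2, so it writes to the hash files for levels 0 through 3.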
<p>
It is also possible to interlace lg <i>N</i> append-only hash files
into a single append-only file,
so that the log can be stored in only three files:
record data, record index, and hashes.
See Appendix A for details.
Another possibility is to store the log in a pair of database tables,
one for record data and one for hashes
(the database can provide the record index itself).
<p>
Whether in files or in database tables, the stored form
of the log is append-only, so cached data never goes stale,
making it trivial to have parallel, read-only replicas of a log.
In contrast, writing to the log is inherently centralized,
requiring a dense sequence numbering of all records
(and in many cases also duplicate suppression).
An implementation using the two-table database representation
can delegate both replication and coordination of writes
to the underlying database,
especially if the underlying database is globally-replicated and consistent,
like
<a href="https://ai.google/research/pubs/pub39966">Google Cloud Spanner</a>
or <a href="https://www.cockroachlabs.com/docs/stable/architecture/overview.html">CockroachDB</a>.
<p>
It is of course not enough just to store the log.
We must also make it available to clients.
<a class=anchor href="#serving_a_log"><h2 id="serving_a_log">Serving a Log</h2></a>
<p>
Remember that each client consuming the log is skeptical about the log’s
correct operation.
The log server must make it easy for the client to
verify two things: first, that any particular record is in the log,
and second, that the current log is an append-only
extension of a previously-observed earlier log.
<p>
To be useful, the log server must also make it easy to find a record
given some kind of lookup key,
and it must allow an auditor to iterate
over the entire log looking for entries that don’t belong.
<p>
To do all this, the log server must answer five queries:
<ol>
<li>
<p>
<i>Latest</i>() returns the current log size and top-level hash, cryptographically signed by the server for non-repudiation.
<li>
<p>
<i>RecordProof</i>(<i>R</i>, <i>N</i>) returns the proof that record <i>R</i> is contained in the tree of size <i>N</i>.
<li>
<p>
<i>TreeProof</i>(<i>N</i>, <i>N</i>′) returns the proof that the tree of size <i>N</i> is a prefix of the tree of size <i>N</i>′.
<li>
<p>
<i>Lookup</i>(<i>K</i>) returns the record index <i>R</i> matching lookup key <i>K</i>, if any.
<li>
<p>
<i>Data</i>(<i>R</i>) returns the data associated with record <i>R</i>.</ol>
<a class=anchor href="#verifying_a_log"><h2 id="verifying_a_log">Verifying a Log</h2></a>
<p>
The client uses the first three queries
to maintain a cached copy of the most recent log it has observed
and make sure that the server never removes anything
from an observed log.
To do this, the client caches the most recently observed
log size <i>N</i> and top-level hash <i>T</i>.
Then, before accepting data bits <i>B</i> as record number <i>R</i>,
the client verifies that <i>R</i> is included in that log.
If necessary (that is, if <i>R</i> ≥ its cached <i>N</i>),
the client updates its cached <i>N</i>, <i>T</i>
to those of the latest log, but only after verifying
that the latest log includes everything from the current cached log.
In pseudocode:
<pre>validate(bits B as record R):
if R ≥ cached.N:
N, T = server.Latest()
if server.TreeProof(cached.N, N) cannot be verified:
fail loudly
cached.N, cached.T = N, T
if server.RecordProof(R, cached.N) cannot be verified using B:
fail loudly
accept B as record R
</pre>
<p>
The client’s proof verification ensures that the log server is behaving
correctly, at least as observed by the client.
If a devious server can distinguish individual clients,
it can still serve different logs to different clients,
so that a victim client sees invalid entries
never exposed to other clients or auditors.
But if the server does lie to a victim,
the fact that the victim requires any later log to
include what it has seen before
means the server must keep up the lie,
forever serving an alternate log containing the lie.
This makes eventual detection more likely.
For example, if the victim ever arrived through a proxy
or compared its cached log against another client,
or if the server ever made a mistake about
which clients to lie to,
the inconsistency would be readily exposed.
Requiring the server to sign the <i>Latest</i>() response
makes it impossible for the server to disavow the inconsistency,
except by claiming to have been compromised entirely.
<p>
The client-side checks are a little bit like how
a Git client maintains its own
cached copy of a remote repository and then,
before accepting an update during <code>git</code> <code>pull</code>,
verifies that the remote repository includes all local commits.
But the transparent log client only needs to
download lg <i>N</i> hashes for the verification,
while Git downloads all <i>N</i> – <i>cached</i>.<i>N</i> new data records,
and more generally, the transparent log client
can selectively read and authenticate individual
entries from the log,
without being required to download and store
a full copy of the entire log.
<a class=anchor href="#tiling_a_log"><h2 id="tiling_a_log">Tiling a Log</h2></a>
<p>
As described above,
storing the log requires simple, append-only storage
linear in the total log size,
and serving or accessing the log requires
network traffic only logarithmic in the total log size.
This would be a completely reasonable place to stop
(and is where Certificate Transparency as defined in
<a href="https://tools.ietf.org/html/rfc6962">RFC 6962</a> stops).
However, one useful optimization can both cut the hash storage in half
and make the network traffic more cache-friendly,
with only a minor increase in implementation complexity.
That optimization is based on splitting the hash tree into tiles,
like
<a href="https://medium.com/google-design/google-maps-cb0326d165f5#ccfa">Google Maps splits the globe into tiles</a>.
<p>
A binary tree can be split into tiles of fixed height <i>H</i>
and width 2<sup><i>H</i></sup>.
For example, here is the permanent hash tree for the log with 27 records,
split into tiles of height 2:
<p>
<img name="tlog-tile-27" class="center pad" width=847 height=236 src="tlog-tile-27.png" srcset="tlog-tile-27.png 1x, tlog-tile-27@1.5x.png 1.5x, tlog-tile-27@2x.png 2x, tlog-tile-27@3x.png 3x, tlog-tile-27@4x.png 4x">
<p>
We can assign each tile a two-dimensional coordinate, analogous to the hash coordinates we’ve been using:
tile(<i>L</i>, <i>K</i>) denotes the tile at tile level <i>L</i>
(hash levels <i>H</i>·<i>L</i> up to <i>H</i>·(<i>L</i>+1)), <i>K</i>th from the left.
For any given log size, the rightmost tile at each level
may not yet be complete:
the bottom row of hashes may contain only <i>W</i> < 2<sup><i>H</i></sup> hashes.
In that case we will write tile(<i>L</i>, <i>K</i>)/<i>W</i>.
(When the tile is complete, the “/<i>W</i>” is omitted, understood to be 2<sup><i>H</i></sup>.)
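<p>
The mapping from hash coordinates to tile coordinates is simple arithmetic. This sketch, using the height-2 tiles from the diagram, locates the tile holding any h(<i>l</i>, <i>k</i>) and the range of bottom-row hashes needed to recompute it:

```go
package main

import "fmt"

const H = 2 // tile height, matching the diagram

// tileFor maps hash coordinate h(l, k) to its tile coordinate
// tile(tl, tk), plus the range [lo, lo+w) of bottom-row hashes
// within that tile needed to recompute h(l, k).
func tileFor(l, k int) (tl, tk, lo, w int) {
	tl = l / H
	r := l - tl*H              // rows above the tile's bottom row
	tk = k >> (H - r)          // which tile, left to right
	lo = (k << r) & (1<<H - 1) // first bottom-row hash needed
	w = 1 << r                 // number of bottom-row hashes needed
	return
}

func main() {
	for _, c := range [][2]int{{0, 12}, {1, 5}, {2, 2}} {
		tl, tk, lo, w := tileFor(c[0], c[1])
		fmt.Printf("h(%d, %d) lives in tile(%d, %d), bottom hashes [%d, %d)\n",
			c[0], c[1], tl, tk, lo, lo+w)
	}
}
```

<p>
A hash on a tile’s bottom row is stored directly (<i>w</i> = 1); a hash one row up must be recomputed from two stored hashes, and so on.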
<a class=anchor href="#storing_tiles"><h2 id="storing_tiles">Storing Tiles</h2></a>
<p>
Only the bottom row of each tile needs to be stored:
the upper rows can be recomputed by hashing lower ones.
In our example, a tile of height two stores 4 hashes instead of 6,
a 33% storage reduction.
For tiles of greater heights,
the storage reduction asymptotically approaches 50%.
The cost is that reading a hash that has been optimized
away may require reading as much as half a tile,
increasing I/O requirements.
For a real system, height four seems like a reasonable balance between storage costs and increased I/O overhead.
It stores 16 hashes instead of 30—a 47% storage reduction—and
(assuming SHA-256) a single 16-hash tile is only 512 bytes
(a single disk sector!).
<p>
The file storage described earlier maintained lg <i>N</i> hash files,
one for each level.
Using tiled storage,
we only write the hash files
for levels that are a multiple of the tile height.
For tiles of height 4, we’d only write the hash files
for levels 0, 4, 8, 12, 16, and so on.
When we need a hash at another level,
we can read its tile and recompute the hash.
<a class=anchor href="#serving_tiles"><h2 id="serving_tiles">Serving Tiles</h2></a>
<p>
The proof-serving requests
<i>RecordProof</i>(<i>R</i>, <i>N</i>) and
<i>TreeProof</i>(<i>N</i>, <i>N</i>′)
are not particularly cache-friendly.
For example, although <i>RecordProof</i>(<i>R</i>, <i>N</i>)
often shares many hashes
with both <i>RecordProof</i>(<i>R</i>+1, <i>N</i>)
and <i>RecordProof</i>(<i>R</i>, <i>N</i>+1),
the three are distinct requests that must be cached independently.
<p>
A more cache-friendly approach would be
to replace <i>RecordProof</i> and <i>TreeProof</i> by a general
request <i>Hash</i>(<i>L</i>, <i>K</i>),
serving a single permanent hash.
The client can easily compute which specific hashes it needs,
and there are many fewer individual hashes
than whole proofs (2 <i>N</i> vs <i>N</i><sup>2</sup>/2),
which will help the cache hit rate.
Unfortunately, switching to <i>Hash</i> requests is inefficient:
obtaining a record proof used to take one request
and now takes up to 2 lg <i>N</i> requests, while
tree proofs take up to 3 lg <i>N</i> requests.
Also, each request delivers only a single hash (32 bytes):
the request overhead is likely significantly larger
than the payload.
<p>
We can stay cache-friendly
while reducing the number of requests
and the relative request overhead,
at a small cost in bandwidth,
by adding a request <i>Tile</i>(<i>L</i>, <i>K</i>)
that returns the requested tile.
The client can request the tiles it needs for a given proof,
and it can cache tiles, especially those higher in the tree,
for use in future proofs.
<p>
For a real system using SHA-256, a tile of height 8 would be 8 kB.
A typical proof in a large log of, say, 100 million records would
require only three complete tiles, or 24 kB downloaded,
plus one incomplete tile (192 bytes) for the top of the tree.
And tiles of height 8 can be served directly from
stored tiles of height 4 (the size suggested in the previous section).
Another reasonable choice would be to both store and serve tiles of height 6 (2 kB each) or 7 (4 kB each).
<p>
If there are caches in front of the server,
each differently-sized partial tile must be given
a different name,
so that a client that needs a larger partial tile
is not given a stale smaller one.
Even though the tile height is conceptually constant for a given system,
it is probably helpful to be explicit about the tile height
in the request, so that a system can transition from one
fixed tile height to another without ambiguity.
For example, in a simple GET-based HTTP API,
we could use <code>/tile/H/L/K</code> to name a complete tile
and <code>/tile/H/L/K.W</code> to name a partial tile with only
<i>W</i> hashes.
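<p>
A sketch of that naming scheme in Go (the path layout is the one described above, offered as an illustration rather than a fixed API):

```go
package main

import "fmt"

// tilePath formats the GET path for tile(l, k) of height h.
// A partial tile with only w < 2**h bottom-row hashes gets a
// distinct name, so a cache never serves a stale smaller tile
// to a client that needs a larger one.
func tilePath(h, l, k, w int) string {
	if w == 1<<h {
		return fmt.Sprintf("/tile/%d/%d/%d", h, l, k)
	}
	return fmt.Sprintf("/tile/%d/%d/%d.%d", h, l, k, w)
}

func main() {
	fmt.Println(tilePath(8, 0, 6, 256)) // complete: /tile/8/0/6
	fmt.Println(tilePath(8, 1, 1, 2))   // partial:  /tile/8/1/1.2
}
```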
<a class=anchor href="#authenticating_tiles"><h2 id="authenticating_tiles">Authenticating Tiles</h2></a>
<p>
One potential problem with downloading and caching tiles
is not being sure that they are correct.
An attacker might be able to modify downloaded tiles
and cause proofs to fail unexpectedly.
We can avoid this problem by authenticating the tiles
against the signed top-level tree hash after downloading them.
Specifically, if we have a signed top-level tree hash <i>T</i>,
we first download the at most (lg <i>N</i>)/<i>H</i> tiles storing
the hashes for the complete subtrees that make up <i>T</i>.
In the diagram of <i>T</i><sub>27</sub> earlier, that would be tile(2, 0)/1,
tile(1, 1)/2, and tile(0, 6)/3.
Computing <i>T</i> will use every hash in these tiles;
if we get the right <i>T</i>, the hashes are all correct.
These tiles make up the top and right sides
of the tile tree for the given hash tree, and now we know they are correct.
To authenticate any other tile,
we first authenticate its parent tile
(the topmost parents are all authenticated already)
and then check that the result of hashing all the hashes
in the tile produces the corresponding entry in the parent tile.
Using the <i>T</i><sub>27</sub> example again,
given a downloaded tile purporting to be tile(0, 1), we can compute<blockquote>
<p>
h(2, 1) = H(H(h(0, 4), h(0, 5)), H(h(0, 6), h(0, 7)))</blockquote>
<p>
and check whether that value matches the h(2, 1)
recorded directly in an already-authenticated tile(1, 0).
If so, that authenticates the downloaded tile.
<a class=anchor href="#summary"><h2 id="summary">Summary</h2></a>
<p>
Putting this all together, we’ve seen how to publish a
transparent (tamper-evident, immutable, append-only) log
with the following properties:
<ul>
<li>
A client can verify any particular record using <i>O</i>(lg <i>N</i>) downloaded bytes.
<li>
A client can verify any new log contains an older log using <i>O</i>(lg <i>N</i>) downloaded bytes.
<li>
For even a large log, these verifications can be done in 3 RPCs of about 8 kB each.
<li>
The RPCs used for verification can be made to proxy and cache well, whether for network efficiency or possibly for privacy.
<li>
Auditors can iterate over the entire log looking for bad entries.
<li>
Writing <i>N</i> records defines a sequence of <i>N</i> hash trees, in which the <i>n</i>th tree contains 2 <i>n</i> – 1 hashes, a total of <i>N</i><sup>2</sup> hashes. But instead of needing to store <i>N</i><sup>2</sup> hashes, the entire sequence can be compacted into at most 2 <i>N</i> hashes, with at most lg <i>N</i> reads required to reconstruct a specific hash from a specific tree.
<li>
Those 2 <i>N</i> hashes can themselves be compacted down to 1.06 <i>N</i> hashes, at a cost of potentially reading 8 adjacent hashes to reconstruct any one hash from the 2 <i>N</i>.</ul>
<p>
Overall, this structure makes the log server itself essentially untrusted.
It can’t remove an observed record without detection.
It can’t lie to one client without keeping the client on an alternate timeline forever,
making detection easy by comparing against another client.
The log itself is also easily proxied and cached,
so that even if the main server disappeared,
replicas could keep serving the cached log.
Finally, auditors can check the log for entries that should not be there,
so that the actual content of the log can be verified asynchronously
from its use.
<a class=anchor href="#further_reading"><h2 id="further_reading">Further Reading</h2></a>
<p>
The original sources needed to understand this data structure are
all quite readable and repay careful study.
Ralph Merkle introduced Merkle trees in his
Ph.D. thesis,
“<a href="http://www.merkle.com/papers/Thesis1979.pdf">Secrecy, authentication, and public-key systems</a>” (1979),
using them to convert a digital signature scheme
with single-use public keys into one with multiple-use keys.
The multiple-use key was the top-level hash of a tree of 2<sup><i>L</i></sup> pseudorandomly
generated single-use keys.
Each signature began with a specific single-use key,
its index <i>K</i> in the tree,
and a proof (consisting of <i>L</i> hashes)
authenticating the key as record <i>K</i> in the tree.
Adam Langley’s blog post
“<a href="https://www.imperialviolet.org/2013/07/18/hashsig.html">Hash based signatures</a>” (2013)
gives a short introduction to
the single-use signature scheme and
how Merkle’s tree helped.
<p>
Scott Crosby and Dan Wallach
introduced the idea of using a Merkle tree to store a verifiably append-only log
in their paper,
“<a href="http://static.usenix.org/event/sec09/tech/full_papers/crosby.pdf">Efficient Data Structures for Tamper-Evident Logging</a>” (2009).
The key advance was the efficient proof
that one tree’s log is contained as a prefix of a larger tree’s log.
<p>
Ben Laurie, Adam Langley, and Emilia Kasper
adopted this verifiable, transparent log
in the design of the
<a href="https://www.certificate-transparency.org/">Certificate Transparency (CT) system</a> (2012),
detailed in
<a href="https://tools.ietf.org/html/rfc6962">RFC 6962</a> (2013).
CT’s computation of the top-level hashes
for non-power-of-two-sized logs differs
in minor ways from Crosby and Wallach’s paper;
this post used the CT definitions.
Ben Laurie’s ACM Queue article, “<a href="https://queue.acm.org/detail.cfm?id=2668154">Certificate Transparency: Public, verifiable, append-only logs</a>” (2014),
presents a high-level overview and additional motivation and context.
<p>
Adam Eijdenberg, Ben Laurie, and Al Cutter’s paper
“<a href="https://github.com/google/trillian/blob/master/docs/papers/VerifiableDataStructures.pdf">Verifiable Data Structures</a>” (2015),
presents Certificate Transparency’s log
as a general building block—a transparent log—for use in a variety of systems.
It also introduces an analogous transparent map
from arbitrary keys to arbitrary values,
perhaps a topic for a future post.
<p>
Google’s “General Transparency” server, <a href="https://github.com/google/trillian/blob/master/README.md">Trillian</a>,
is a production-quality storage implementation
for both transparent logs and transparent maps.
The RPC service serves proofs, not hashes or tiles,
but the server <a href="https://github.com/google/trillian/blob/master/docs/storage/storage.md">uses tiles in its internal storage</a>.
<p>
To authenticate modules (software packages)
in the Go language ecosystem, we are
<a href="https://blog.golang.org/modules2019">planning to use a transparent log</a>
to store the expected cryptographic hashes of specific module versions,
so that a client can be cryptographically certain
that it will download the same software tomorrow
that it downloaded today.
For that system’s network service,
we plan to serve tiles directly, not proofs.
This post effectively serves as an extended explanation
of the transparent log, for reference from
<a href="https://golang.org/design/25530-notary">the Go-specific design</a>.
<a class=anchor href="#appendix_a"><h2 id="appendix_a">Appendix A: Postorder Storage Layout</h2></a>
<p>
The file-based storage described earlier held the permanent hash tree
in lg <i>N</i> append-only files, one for each level of the tree.
The hash h(<i>L</i>, <i>K</i>) would be stored in the <i>L</i>th hash file
at offset <i>K</i> · <i>HashSize</i>.
<p>
Crosby and Wallach pointed out that it is easy to merge the lg <i>N</i> hash tree levels
into a single, append-only hash file by using the postorder numbering of
the binary tree, in which a parent hash is stored immediately after its
rightmost child.
For example, the permanent hash tree after writing <i>N</i> = 13 records is laid out like:
<p>
<img name="tlog-post-13" class="center pad" width=560 height=157 src="tlog-post-13.png" srcset="tlog-post-13.png 1x, tlog-post-13@1.5x.png 1.5x, tlog-post-13@2x.png 2x, tlog-post-13@3x.png 3x, tlog-post-13@4x.png 4x">
<p>
In the diagram, each hash is numbered
and aligned horizontally according to its location in the interlaced file.
<p>
The postorder numbering makes the hash file append-only:
each new record completes between 1 and lg <i>N</i> new hashes
(on average 2),
which are simply appended to the file,
lower levels first.
<p>
Reading a specific hash from the file can still be done with a single read
at a computable offset, but the calculation is no longer completely trivial.
Hashes at level 0 are placed by adding in gaps for
completed higher-level hashes,
and a hash at any higher level immediately
follows its right child hash:<blockquote>
<p>
seq(0, <i>K</i>) = <i>K</i> + <i>K</i>/2 + <i>K</i>/4 + <i>K</i>/8 + ... <br>
seq(<i>L</i>, <i>K</i>) = seq(<i>L</i>–1, 2 <i>K</i> + 1) + 1 = seq(0, 2<sup><i>L</i></sup> (<i>K</i>+1) – 1) + <i>L</i></blockquote>
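<p>
In code, the offset computation is only a few lines. This Go sketch follows the formulas above directly; it is illustrative, not taken from any particular implementation:

```go
package main

import "fmt"

// seqPostorder returns the index of hash h(level, k) in the single
// postorder-interlaced hash file: each parent hash is stored
// immediately after its rightmost child.
func seqPostorder(level uint, k int64) int64 {
	// h(L, K) is stored just after its rightmost level-0 descendant,
	// h(0, 2^L·(K+1) − 1), so first reduce to a level-0 position.
	k = (k+1)<<level - 1
	// seq(0, K) = K + K/2 + K/4 + ...: leaf K is pushed right one
	// slot for each higher-level hash completed before it.
	n := k
	for k > 0 {
		k >>= 1
		n += k
	}
	return n + int64(level)
}

func main() {
	// The permanent tree for N = 4 records is laid out as
	// h(0,0) h(0,1) h(1,0) h(0,2) h(0,3) h(1,1) h(2,0).
	fmt.Println(seqPostorder(1, 0), seqPostorder(0, 2), seqPostorder(1, 1), seqPostorder(2, 0))
	// prints: 2 3 5 6
}
```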
<p>
The interlaced layout also improves locality of access.
Reading a proof typically means reading one hash
from each level,
all clustered around a particular leaf in the tree.
If each tree level is stored separately,
each hash is in a different file and there is no possibility of I/O overlap.
But when the tree is stored in interlaced form,
the accesses at the bottom levels will all be near each other,
making it possible to fetch many of the needed hashes
with a single disk read.
<a class=anchor href="#appendix_b"><h2 id="appendix_b">Appendix B: Inorder Storage Layout</h2></a>
<p>
A different way to interlace the lg <i>N</i> hash files
would be to use an inorder tree numbering,
in which each parent hash is stored between its left
and right subtrees:
<p>
<img name="tlog-in-13" class="center pad" width=602 height=157 src="tlog-in-13.png" srcset="tlog-in-13.png 1x, tlog-in-13@1.5x.png 1.5x, tlog-in-13@2x.png 2x, tlog-in-13@3x.png 3x, tlog-in-13@4x.png 4x">
<p>
This storage order does not correspond to append-only writes to the file,
but each hash entry is still write-once.
For example, with 13 records written, as in the diagram,
hashes have been stored at indexes 0–14, 16–22, and 24,
but not yet at indexes 15 and 23,
which will eventually hold
h(4, 0) and h(3, 1).
In effect, the space for a parent hash is reserved
when its left subtree has been completed,
but it can only be filled in later, once its right subtree has also been completed.
<p>
Although the file is no longer append-only, the inorder numbering
has other useful properties.
First, the offset math is simpler:<blockquote>
<p>
seq(0, <i>K</i>) = 2 <i>K</i> <br>
seq(<i>L</i>, <i>K</i>) = 2<sup><i>L</i>+1</sup> <i>K</i> + 2<sup><i>L</i></sup> – 1</blockquote>
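<p>
The simpler math is visible in code. This Go sketch evaluates the in-order formulas above against the 13-record example; again it is illustrative only:

```go
package main

import "fmt"

// seqInorder returns the index of hash h(level, k) when the hash
// tree is interlaced in-order: each parent hash is stored between
// its left and right subtrees.
func seqInorder(level uint, k int64) int64 {
	// seq(L, K) = 2^(L+1)·K + 2^L − 1; for leaves (L = 0) this is 2K.
	return k<<(level+1) + 1<<level - 1
}

func main() {
	// With 13 records written, the still-unwritten hashes h(4, 0)
	// and h(3, 1) are reserved at indexes 15 and 23.
	fmt.Println(seqInorder(4, 0), seqInorder(3, 1))
	// prints: 15 23
}
```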
<p>
Second, locality is improved.
Now each parent hash sits exactly in the middle of its
child subtrees,
instead of on the far right side.
<a class=anchor href="#appendix_c"><h2 id="appendix_c">Appendix C: Tile Storage Layout</h2></a>
<p>
Storing the hash tree in lg <i>N</i> separate levels made
converting to tile storage very simple: just don’t write (<i>H</i>–1)/<i>H</i> of the files.
The simplest tile implementation is probably to use separate files,
but it is worth examining what it would take to convert an
interlaced hash storage file to tile storage.
Unlike with separate files, it’s not enough to just omit the hashes at certain levels:
we also want each tile to appear contiguously in the file.
For example, for tiles of height 2,
the first tile at tile level 1 stores hashes h(2, 0)–h(2, 3),
but neither the postorder nor inorder interlacing
would place those four hashes next to each other.
<p>
Instead, we must simply define that tiles are stored contiguously
and then decide a linear tile layout order.
For tiles of height 2, the tiles form a 4-ary tree,
and in general, the tiles form a 2<sup><i>H</i></sup>-ary tree.
We could use a postorder layout, as in Appendix A:<blockquote>
<p>
seq(0, <i>K</i>) = <i>K</i> + <i>K</i>/2<sup><i>H</i></sup> + <i>K</i>/2<sup>2<i>H</i></sup> + <i>K</i>/2<sup>3<i>H</i></sup> + ... <br>
seq(<i>L</i>, <i>K</i>) = seq(<i>L</i>–1, 2<sup><i>H</i></sup> <i>K</i> + 2<sup><i>H</i></sup> – 1) + 1 = seq(0, 2<sup><i>H</i>·<i>L</i></sup> (<i>K</i>+1) – 1) + <i>L</i></blockquote>
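<p>
This is the binary postorder computation from Appendix A with the shift amount generalized from 1 to <i>H</i>. A Go sketch, for illustration:

```go
package main

import "fmt"

// seqTilePostorder returns the index of tile(level, k) in a postorder
// layout of the 2^h-ary tile tree, where h is the tile height.
func seqTilePostorder(h, level uint, k int64) int64 {
	// tile(L, K) follows its rightmost tile-level-0 descendant,
	// tile(0, 2^(h·L)·(K+1) − 1), so reduce to tile level 0.
	k = (k+1)<<(h*level) - 1
	// seq(0, K) = K + K/2^h + K/2^(2h) + ...
	n := k
	for k > 0 {
		k >>= h
		n += k
	}
	return n + int64(level)
}

func main() {
	// With tiles of height 2, tile(1, 0) lands at postorder index 4
	// and tile(2, 0) at postorder index 20.
	fmt.Println(seqTilePostorder(2, 1, 0), seqTilePostorder(2, 2, 0))
	// prints: 4 20
}
```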
<p>
The postorder tile sequence places a parent tile
immediately after its rightmost child tile,
but the parent tile begins to be written
after the leftmost child tile is completed.
This means writing increasingly far ahead of the
filled part of the hash file.
For example, with tiles of height 2,
the first hash of
tile(2, 0) (postorder index 20) is written after filling tile(1, 0) (postorder index 4):
<p>
<img name="tlog-tile-post-16" class="center pad" width=498 height=126 src="tlog-tile-post-16.png" srcset="tlog-tile-post-16.png 1x, tlog-tile-post-16@1.5x.png 1.5x, tlog-tile-post-16@2x.png 2x, tlog-tile-post-16@3x.png 3x, tlog-tile-post-16@4x.png 4x">
<p>
The hash file catches up—there are no tiles written after index 20 until the hash file fills in entirely behind it—but
then it jumps ahead again: finishing tile 20 triggers writing the first hash into tile 84.
In general only the first 1/2<sup><i>H</i></sup> or so of the hash file is guaranteed to be densely packed.
Most file systems efficiently support files with large holes,
but not all do:
we may want to use a different tile layout to avoid arbitrarily large holes.
<p>
Placing a parent tile immediately after its leftmost child’s completed subtree
would eliminate all holes (other than incomplete tiles) and would seem to correspond to
the inorder layout of Appendix B:
<p>
<img name="tlog-tile-in1-16" class="center pad" width=498 height=126 src="tlog-tile-in1-16.png" srcset="tlog-tile-in1-16.png 1x, tlog-tile-in1-16@1.5x.png 1.5x, tlog-tile-in1-16@2x.png 2x, tlog-tile-in1-16@3x.png 3x, tlog-tile-in1-16@4x.png 4x">
<p>
But while the tree structure is regular,
the numbering is not.
Instead, the offset math is more like the postorder traversal.
A simpler but far less obvious alternative is to vary the exact
placement of the parent tiles relative to the subtrees:
<p>
<img name="tlog-tile-code-16" class="center pad" width=498 height=126 src="tlog-tile-code-16.png" srcset="tlog-tile-code-16.png 1x, tlog-tile-code-16@1.5x.png 1.5x, tlog-tile-code-16@2x.png 2x, tlog-tile-code-16@3x.png 3x, tlog-tile-code-16@4x.png 4x"><blockquote>
<p>
seq(<i>L</i>, <i>K</i>) = ((<i>K</i> + <i>B</i> – 2)/(<i>B</i> – 1))<sub><i>B</i></sub> || (1)<sub><i>B</i></sub><sup><i>L</i></sup></blockquote>
<p>
Here, (<i>X</i>)<sub><i>B</i></sub> means <i>X</i> written as a base-<i>B</i> number,
|| denotes concatenation of base-<i>B</i> numbers,
(1)<sub><i>B</i></sub><sup><i>L</i></sup>
means the base-<i>B</i> digit 1 repeated <i>L</i> times,
and the base is <i>B</i> = 2<sup><i>H</i></sup>.
<p>
This encoding generalizes the inorder binary-tree traversal (<i>H</i> = 1, <i>B</i> = 2),
preserving its regular offset math
at the cost of losing its regular tree structure.
Since we only care about doing the math,
not exactly what the tree looks like,
this is probably a reasonable tradeoff.
For more about this surprising ordering,
see my blog post,
“<a href="https://research.swtch.com/treenum">An Encoded Tree Traversal</a>.”
Our Software Dependency Problem
tag:research.swtch.com,2012:research.swtch.com/deps
2019-01-23T11:00:00-05:00
2019-01-23T11:02:00-05:00
Download and run code from strangers on the internet. What could go wrong?
<p>
For decades, discussion of software reuse was far more common than actual software reuse.
Today, the situation is reversed: developers reuse software written by others every day,
in the form of software dependencies,
and the situation goes mostly unexamined.
<p>
My own background includes a decade of working with
Google’s internal source code system,
which treats software dependencies as a first-class concept,<a class=footnote id=body1 href="#note1"><sup>1</sup></a>
and also developing support for
dependencies in the Go programming language.<a class=footnote id=body2 href="#note2"><sup>2</sup></a>
<p>
Software dependencies carry with them
serious risks that are too often overlooked.
The shift to easy, fine-grained software reuse has happened so quickly
that we do not yet understand the best practices for choosing
and using dependencies effectively,
or even for deciding when they are appropriate and when not.
My purpose in writing this article is to raise awareness of the risks
and encourage more investigation of solutions.
<a class=anchor href="#what_is_a_dependency"><h2 id="what_is_a_dependency">What is a dependency?</h2></a>
<p>
In today’s software development world,
a <i>dependency</i> is additional code that you want to call from your program.
Adding a dependency avoids repeating work already done:
designing, writing, testing, debugging, and maintaining a specific
unit of code.
In this article we’ll call that unit of code a <i>package</i>;
some systems use terms like library or module instead of package.
<p>
Taking on externally-written dependencies is an old practice:
most programmers have at one point in their careers
had to go through the steps of manually downloading and installing
a required library, like C’s PCRE or zlib, or C++’s Boost or Qt,
or Java’s JodaTime or JUnit.
These packages contain high-quality, debugged code
that required significant expertise to develop.
For a program that needs the functionality provided by one of these packages,
the tedious work of manually downloading, installing, and updating
the package
is easier than the work of redeveloping that functionality from scratch.
But the high fixed costs of reuse
mean that manually-reused packages tend to be big:
a tiny package would be easier to reimplement.
<p>
A <i>dependency manager</i>
(sometimes called a package manager)
automates the downloading and installation of dependency packages.
As dependency managers
make individual packages easier to download and install,
the lower fixed costs make
smaller packages economical to publish and reuse.
<p>
For example, the Node.js dependency manager NPM provides
access to over 750,000 packages.
One of them, <code>escape-string-regexp</code>,
provides a single function that escapes regular expression
operators in its input.
The entire implementation is:
<pre>var matchOperatorsRe = /[|\\{}()[\]^$+*?.]/g;
module.exports = function (str) {
	if (typeof str !== 'string') {
		throw new TypeError('Expected a string');
	}
	return str.replace(matchOperatorsRe, '\\$&');
};
</pre>
<p>
Before dependency managers, publishing an eight-line code library
would have been unthinkable: too much overhead for too little benefit.
But NPM has driven the overhead approximately to zero,
with the result that nearly-trivial functionality
can be packaged and reused.
As of late January 2019, the <code>escape-string-regexp</code> package
is explicitly depended upon by almost a thousand
other NPM packages,
not to mention all the packages developers write for their own use
and don’t share.
<p>
Dependency managers now exist for essentially every programming language.
Maven Central (Java),
Nuget (.NET),
Packagist (PHP),
PyPI (Python),
and RubyGems (Ruby)
each host over 100,000 packages.
The arrival of this kind of fine-grained, widespread software reuse
is one of the most consequential shifts in software development
over the past two decades.
And if we’re not more careful, it will lead to serious problems.
<a class=anchor href="#what_could_go_wrong"><h2 id="what_could_go_wrong">What could go wrong?</h2></a>
<p>
A package, for this discussion, is code you download from the internet.
Adding a package as a dependency outsources the work of developing that
code—designing, writing, testing, debugging, and maintaining—to
someone else on the internet,
someone you often don’t know.
By using that code, you are exposing your own program
to all the failures and flaws in the dependency.
Your program’s execution now literally <i>depends</i>
on code downloaded from this stranger on the internet.
Presented this way, it sounds incredibly unsafe.
Why would anyone do this?
<p>
We do this because it’s easy,
because it seems to work,
because everyone else is doing it too,
and, most importantly, because
it seems like a natural continuation of
age-old established practice.
But there are important differences we’re ignoring.
<p>
Decades ago, most developers already
trusted others to write software they depended on,
such as operating systems and compilers.
That software was bought from known sources,
often with some kind of support agreement.
There was still a potential for bugs or outright mischief,<a class=footnote id=body3 href="#note3"><sup>3</sup></a>
but at least we knew who we were dealing with and usually
had commercial or legal recourses available.
<p>
The phenomenon of open-source software,
distributed at no cost over the internet,
has displaced many of those earlier software purchases.
When reuse was difficult, there were fewer projects publishing reusable code packages.
Even though their licenses typically disclaimed, among other things,
any “implied warranties of merchantability and fitness for
a particular purpose,”
the projects built up well-known reputations
that often factored heavily into people’s decisions about which to use.
The commercial and legal support for trusting our software sources
was replaced by reputational support.
Many common early packages still enjoy good reputations:
consider BLAS (published 1979), Netlib (1987), libjpeg (1991),
LAPACK (1992), HP STL (1994), and zlib (1995).
<p>
Dependency managers have scaled this open-source code reuse model down:
now, developers can share code at the granularity of
individual functions of tens of lines.
This is a major technical accomplishment.
There are myriad available packages,
and writing code can involve such a large number of them,
but the commercial, legal, and reputational support mechanisms
for trusting the code have not carried over.
We are trusting more code with less justification for doing so.
<p>
The cost of adopting a bad dependency can be viewed
as the sum, over all possible bad outcomes,
of the cost of each bad outcome
multiplied by its probability of happening (risk).
<p>
<img name="deps-cost" class="center pad" width=383 height=95 src="deps-cost.png" srcset="deps-cost.png 1x, deps-cost@1.5x.png 1.5x, deps-cost@2x.png 2x, deps-cost@3x.png 3x, deps-cost@4x.png 4x">
<p>
The context where a dependency will be used
determines the cost of a bad outcome.
At one end of the spectrum is a personal hobby project,
where the cost of most bad outcomes
is near zero:
you’re just having fun, bugs have no real impact other than
wasting some time, and even debugging them can be fun.
So the risk probability almost doesn’t matter: it’s being multiplied by zero.
At the other end of the spectrum is production software
that must be maintained for years.
Here, the cost of a bug in
a dependency can be very high:
servers may go down,
sensitive data may be divulged,
customers may be harmed,
companies may fail.
High failure costs make it much more important
to estimate and then reduce any risk of a serious failure.
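<p>
As a back-of-the-envelope illustration of the formula above, with entirely made-up numbers:

```go
package main

import "fmt"

// expectedCost multiplies each bad outcome's cost by its probability
// of happening and sums the results.
func expectedCost(costs, probs []float64) float64 {
	total := 0.0
	for i, c := range costs {
		total += c * probs[i]
	}
	return total
}

func main() {
	// Hypothetical numbers: a serious outage costing $10,000 with a
	// 1% chance, and a minor bug costing $100 with a 20% chance.
	fmt.Println(expectedCost([]float64{10000, 100}, []float64{0.01, 0.20}))
	// prints: 120
}
```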
<p>
No matter what the expected cost,
experiences with larger dependencies
suggest some approaches for
estimating and reducing the risks of adding a software dependency.
It is likely that better tooling is needed to help reduce
the costs of these approaches,
much as dependency managers have focused to date on
reducing the costs of download and installation.
<a class=anchor href="#inspect_the_dependency"><h2 id="inspect_the_dependency">Inspect the dependency</h2></a>
<p>
You would not hire a software developer you’ve never heard of
and know nothing about.
You would learn more about them first:
check references, conduct a job interview,
run background checks, and so on.
Before you depend on a package you found on the internet,
it is similarly prudent
to learn a bit about it first.
<p>
A basic inspection can give you a sense
of how likely you are to run into problems trying to use this code.
If the inspection reveals likely minor problems,
you can take steps to prepare for or maybe avoid them.
If the inspection reveals major problems,
it may be best not to use the package:
maybe you’ll find a more suitable one,
or maybe you need to develop one yourself.
Remember that open-source packages are published
by their authors in the hope that they will be useful
but with no guarantee of usability or support.
In the middle of a production outage, you’ll be the one debugging it.
As the original GNU General Public License warned,
“The entire risk as to the quality and performance of the
program is with you.
Should the program prove defective, you assume the cost of all
necessary servicing, repair or correction.”<a class=footnote id=body4 href="#note4"><sup>4</sup></a>
<p>
The rest of this section outlines some considerations when inspecting a package
and deciding whether to depend on it.
<a class=anchor href="#design"><h3 id="design">Design</h3></a>
<p>
Is the package’s documentation clear? Does the API have a clear design?
If the authors can explain the package’s API and its design well to you, the user,
in the documentation,
that increases the likelihood they have explained the implementation well to the computer, in the source code.
Writing code for a clear, well-designed API is also easier, faster, and hopefully less error-prone.
Have the authors documented what they expect from client code
in order to make future upgrades compatible?
(Examples include the C++<a class=footnote id=body5 href="#note5"><sup>5</sup></a> and Go<a class=footnote id=body6 href="#note6"><sup>6</sup></a> compatibility documents.)
<a class=anchor href="#code_quality"><h3 id="code_quality">Code Quality</h3></a>
<p>
Is the code well-written?
Read some of it.
Does it look like the authors have been careful, conscientious, and consistent?
Does it look like code you’d want to debug? You may need to.
<p>
Develop your own systematic ways to check code quality.
For example, something as simple as compiling a C or C++ program with
important compiler warnings enabled (for example, <code>-Wall</code>)
can give you a sense of how seriously the developers work to avoid
various undefined behaviors.
Recent languages like Go, Rust, and Swift use an <code>unsafe</code> keyword to mark
code that violates the type system; look to see how much unsafe code there is.
More advanced semantic tools like Infer<a class=footnote id=body7 href="#note7"><sup>7</sup></a> or SpotBugs<a class=footnote id=body8 href="#note8"><sup>8</sup></a> are helpful too.
Linters are less helpful: you should ignore rote suggestions
about topics like brace style and focus instead on semantic problems.
<p>
Keep an open mind to development practices you may not be familiar with.
For example, the SQLite library ships as a single 200,000-line C source file
and a single 11,000-line header, the “amalgamation.”
The sheer size of these files should raise an initial red flag,
but closer investigation would turn up the
actual development source code, a traditional file tree with
over a hundred C source files, tests, and support scripts.
It turns out that the single-file distribution is built automatically from the original sources
and is easier for end users, especially those without dependency managers.
(The compiled code also runs faster, because the compiler can see more optimization opportunities.)
<a class=anchor href="#testing"><h3 id="testing">Testing</h3></a>
<p>
Does the code have tests?
Can you run them?
Do they pass?
Tests establish that the code’s basic functionality is correct,
and they signal that the developer is serious about keeping it correct.
For example, the SQLite development tree has an incredibly thorough test suite
with over 30,000 individual test cases
as well as developer documentation explaining the testing strategy.<a class=footnote id=body9 href="#note9"><sup>9</sup></a>
On the other hand,
if there are few tests or no tests, or if the tests fail, that’s a serious red flag:
future changes to the package
are likely to introduce regressions that could easily have been caught.
If you insist on tests in code you write yourself (you do, right?),
you should insist on tests in code you outsource to others.
<p>
Assuming the tests exist, run, and pass, you can gather more
information by running them with run-time instrumentation
like code coverage analysis, race detection,<a class=footnote id=body10 href="#note10"><sup>10</sup></a>
memory allocation checking,
and memory leak detection.
<a class=anchor href="#debugging"><h3 id="debugging">Debugging</h3></a>
<p>
Find the package’s issue tracker.
Are there many open bug reports? How long have they been open?
Are there many fixed bugs? Have any bugs been fixed recently?
If you see lots of open issues about what look like real bugs,
especially if they have been open for a long time,
that’s not a good sign.
On the other hand, if the closed issues show that bugs are
rarely found and promptly fixed,
that’s great.
<a class=anchor href="#maintenance"><h3 id="maintenance">Maintenance</h3></a>
<p>
Look at the package’s commit history.
How long has the code been actively maintained?
Is it actively maintained now?
Packages that have been actively maintained for an extended
amount of time are more likely to continue to be maintained.
How many people work on the package?
Many packages are personal projects that developers
create and share for fun in their spare time.
Others are the result of thousands of hours of work
by a group of paid developers.
In general, the latter kind of package is more likely to have
prompt bug fixes, steady improvements, and general upkeep.
<p>
On the other hand, some code really is “done.”
For example, NPM’s <code>escape-string-regexp</code>,
shown earlier, may never need to be modified again.
<a class=anchor href="#usage"><h3 id="usage">Usage</h3></a>
<p>
Do many other packages depend on this code?
Dependency managers can often provide statistics about usage,
or you can use a web search to estimate how often
others write about using the package.
More users should at least mean more people for whom
the code works well enough,
along with faster detection of new bugs.
Widespread usage is also a hedge against the question of continued maintenance:
if a widely-used package loses its maintainer,
an interested user is likely to step forward.
<p>
For example, libraries like PCRE or Boost or JUnit
are incredibly widely used.
That makes it more likely—although certainly not guaranteed—that
bugs you might otherwise run into have already been fixed,
because others ran into them first.
<a class=anchor href="#security"><h3 id="security">Security</h3></a>
<p>
Will you be processing untrusted inputs with the package?
If so, does it seem to be robust against malicious inputs?
Does it have a history of security problems
listed in the National Vulnerability Database (NVD)?<a class=footnote id=body11 href="#note11"><sup>11</sup></a>
<p>
For example, when Jeff Dean and I started work on
Google Code Search<a class=footnote id=body12 href="#note12"><sup>12</sup></a>—<code>grep</code> over public source code—in 2006,
the popular PCRE regular expression library seemed like an obvious choice.
In an early discussion with Google’s security team, however,
we learned that PCRE had a history of problems like buffer overflows,
especially in its parser.
We could have learned the same by searching for PCRE in the NVD.
That discovery didn’t immediately cause us to abandon PCRE,
but it did make us think more carefully about testing and isolation.
<a class=anchor href="#licensing"><h3 id="licensing">Licensing</h3></a>
<p>
Is the code properly licensed?
Does it have a license at all?
Is the license acceptable for your project or company?
A surprising fraction of projects on GitHub have no clear license.
Your project or company may impose further restrictions on the
allowed licenses of dependencies.
For example, Google disallows the use of code licensed under
AGPL-like licenses (too onerous) as well as WTFPL-like licenses (too vague).<a class=footnote id=body13 href="#note13"><sup>13</sup></a>
<a class=anchor href="#dependencies"><h3 id="dependencies">Dependencies</h3></a>
<p>
Does the code have dependencies of its own?
Flaws in indirect dependencies are just as bad for your program
as flaws in direct dependencies.
Dependency managers can list all the transitive dependencies
of a given package, and each of them should ideally be inspected as
described in this section.
A package with many dependencies incurs additional inspection work,
because those same dependencies incur additional risk
that needs to be evaluated.
<p>
Many developers have never looked at the full list of transitive
dependencies of their code and don’t know what they depend on.
For example, in March 2016 the NPM user community discovered
that many popular projects—including Babel, Ember, and React—all depended
indirectly on a tiny package called <code>left-pad</code>,
consisting of a single 8-line function body.
They discovered this when
the author of <code>left-pad</code> deleted that package from NPM,
inadvertently breaking most Node.js users’ builds.<a class=footnote id=body14 href="#note14"><sup>14</sup></a>
And <code>left-pad</code> is hardly exceptional in this regard.
For example, 30% of the
750,000 packages published on NPM
depend—at least indirectly—on <code>escape-string-regexp</code>.
Adapting Leslie Lamport’s observation about distributed systems,
a dependency manager can easily
create a situation in which the failure of a package you didn’t
even know existed can render your own code unusable.
<a class=anchor href="#test_the_dependency"><h2 id="test_the_dependency">Test the dependency</h2></a>
<p>
The inspection process should include running a package’s own tests.
If the package passes the inspection and you decide to make your
project depend on it,
the next step should be to write new tests focused on the functionality
needed by your application.
These tests often start out as short standalone programs
written to make sure you can understand the package’s API
and that it does what you think it does.
(If you can’t or it doesn’t, turn back now!)
It is worth then taking the extra effort to turn those programs
into automated tests that can be run against newer versions of the package.
If you find a bug and have a potential fix,
you’ll want to be able to rerun these project-specific tests
easily, to make sure that the fix did not break anything else.
<p>
It is especially worth exercising the likely problem areas
identified by the
basic inspection.
For Code Search, we knew from past experience
that PCRE sometimes took
a long time to execute certain regular expression searches.
Our initial plan was to have separate thread pools for
“simple” and “complicated” regular expression searches.
One of the first tests we ran was a benchmark,
comparing <code>pcregrep</code> with a few other <code>grep</code> implementations.
When we found that, for one basic test case,
<code>pcregrep</code> was 70X slower than the
fastest <code>grep</code> available,
we started to rethink our plan to use PCRE.
Even though we eventually dropped PCRE entirely,
that benchmark remains in our code base today.
<a class=anchor href="#abstract_the_dependency"><h2 id="abstract_the_dependency">Abstract the dependency</h2></a>
<p>
Depending on a package is a decision that you are likely to
revisit later.
Perhaps updates will take the package in a new direction.
Perhaps serious security problems will be found.
Perhaps a better option will come along.
For all these reasons, it is worth the effort
to make it easy to migrate your project to a new dependency.
<p>
If the package will be used from many places in your project’s source code,
migrating to a new dependency would require making
changes to all those different source locations.
Worse, if the package will be exposed in your own project’s API,
migrating to a new dependency would require making
changes in all the code calling your API,
which you might not control.
To avoid these costs, it makes sense to
define an interface of your own,
along with a thin wrapper implementing that
interface using the dependency.
Note that the wrapper should include only
what your project needs from the dependency,
not everything the dependency offers.
Ideally, that allows you to
substitute a different, equally appropriate dependency later,
by changing only the wrapper.
Migrating your per-project tests to use the new interface
tests the interface and wrapper implementation
and also makes it easy to test any potential replacements
for the dependency.
<p>
For Code Search, we developed an abstract <code>Regexp</code> class
that defined the interface Code Search needed from any
regular expression engine.
Then we wrote a thin wrapper around PCRE
implementing that interface.
The indirection made it easy to test alternate libraries,
and it kept us from accidentally introducing knowledge
of PCRE internals into the rest of the source tree.
That in turn ensured that it would be easy to switch
to a different dependency if needed.
<a class=anchor href="#isolate_the_dependency"><h2 id="isolate_the_dependency">Isolate the dependency</h2></a>
<p>
It may also be appropriate to isolate a dependency
at run-time, to limit the possible damage caused by bugs in it.
For example, Google Chrome allows users to add dependencies—extension code—to the browser.
When Chrome launched in 2008, it introduced
the critical feature (now standard in all browsers)
of isolating each extension in a sandbox running in a separate
operating-system process.<a class=footnote id=body15 href="#note15"><sup>15</sup></a>
An exploitable bug in a badly written extension
therefore did not automatically have access to the entire memory
of the browser itself
and could be stopped from making inappropriate system calls.<a class=footnote id=body16 href="#note16"><sup>16</sup></a>
For Code Search, until we dropped PCRE entirely,
our plan was to isolate at least the PCRE parser
in a similar sandbox.
Today,
another option would be a lightweight hypervisor-based sandbox
like gVisor.<a class=footnote id=body17 href="#note17"><sup>17</sup></a>
Isolating dependencies
reduces the associated risks of running that code.
<p>
Even with these examples and other off-the-shelf options,
run-time isolation of suspect code is still too difficult and rarely done.
True isolation would require a completely memory-safe language,
with no escape hatch into untyped code.
That’s challenging not just in entirely unsafe languages like C and C++
but also in languages that provide restricted unsafe operations,
like Java when including JNI, or like Go, Rust, and Swift
when including their “unsafe” features.
Even in a memory-safe language like JavaScript,
code often has access to far more than it needs.
In November 2018, the latest version of the NPM package <code>event-stream</code>,
which provided a functional streaming API for JavaScript events,
was discovered to contain obfuscated malicious code that had been
added two and a half months earlier.
The code, which harvested large Bitcoin wallets from users of the Copay mobile app,
was accessing system resources entirely unrelated to processing
event streams.<a class=footnote id=body18 href="#note18"><sup>18</sup></a>
One of many possible defenses to this kind of problem
would be to better restrict what dependencies can access.
<a class=anchor href="#avoid_the_dependency"><h2 id="avoid_the_dependency">Avoid the dependency</h2></a>
<p>
If a dependency seems too risky and you can’t find
a way to isolate it, the best answer may be to avoid it entirely,
or at least to avoid the parts you’ve identified as most problematic.
<p>
For example, as we better understood the risks and costs associated
with PCRE, our plan for Google Code Search evolved
from “use PCRE directly,” to “use PCRE but sandbox the parser,”
to “write a new regular expression parser but keep the PCRE execution engine,”
to “write a new parser and connect it to a different, more efficient open-source execution engine.”
Later we rewrote the execution engine as well,
so that no dependencies were left,
and we open-sourced the result: RE2.<a class=footnote id=body19 href="#note19"><sup>19</sup></a>
<p>
If you only need a
tiny fraction of a dependency,
it may be simplest to make a copy of what you need
(preserving appropriate copyright and other legal notices, of course).
You are taking on responsibility for fixing bugs, maintenance, and so on,
but you’re also completely isolated from the larger risks.
The Go developer community has a proverb about this:
“A little copying is better than a little dependency.”<a class=footnote id=body20 href="#note20"><sup>20</sup></a>
<a class=anchor href="#upgrade_the_dependency"><h2 id="upgrade_the_dependency">Upgrade the dependency</h2></a>
<p>
For a long time, the conventional wisdom about software was “if it ain’t broke, don’t fix it.”
Upgrading carries a chance of introducing new bugs;
without a corresponding reward—like a new feature you need—why take the risk?
This analysis ignores two costs.
The first is the cost of the eventual upgrade.
In software, the difficulty of making code changes does not scale linearly:
making ten small changes is less work and easier to get right
than making one equivalent large change.
The second is the cost of discovering already-fixed bugs the hard way.
Especially in a security context, where known bugs are actively exploited,
every day you wait is another day that attackers can break in.
<p>
For example, consider the year 2017 at Equifax, as recounted by executives
in detailed congressional testimony.<a class=footnote id=body21 href="#note21"><sup>21</sup></a>
On March 7, a new vulnerability in Apache Struts was disclosed, and a patched version was released.
On March 8, Equifax received a notice from US-CERT about the need to update
any uses of Apache Struts.
Equifax ran source code and network scans on March 9 and March 15, respectively;
neither scan turned up a particular group of public-facing web servers.
On May 13, attackers found the servers that Equifax’s security teams could not.
They used the Apache Struts vulnerability to breach Equifax’s network
and then steal detailed personal and financial information
about 148 million people
over the next two months.
Equifax finally noticed the breach on July 29
and publicly disclosed it on September 4.
By the end of September, Equifax’s CEO, CIO, and CSO had all resigned,
and a congressional investigation was underway.
<p>
Equifax’s experience drives home the point that
although dependency managers know the versions they are using at build time,
you need other arrangements to track that information
through your production deployment process.
For the Go language, we are experimenting with automatically
including a version manifest in every binary, so that deployment
processes can scan binaries for dependencies that need upgrading.
Go also makes that information available at run-time, so that
servers can consult databases of known bugs and self-report to
monitoring software when they are in need of upgrades.
<p>
Upgrading promptly is important, but upgrading means
adding new code to your project,
which should mean updating your evaluation of the risks
of using the dependency based on the new version.
At a minimum, you’d want to skim the diffs showing the
changes being made from the current version to the
upgraded version,
or at least read the release notes,
to identify the most likely areas of concern in the upgraded code.
If a lot of code is changing, so that the diffs are difficult to digest,
that is also information you can incorporate into your
risk assessment update.
<p>
You’ll also want to re-run the tests you’ve written
that are specific to your project,
to make sure the upgraded package is at least as suitable
for the project as the earlier version.
It also makes sense to re-run the package’s own tests.
If the package has its own dependencies,
it is entirely possible that your project’s configuration
uses different versions of those dependencies
(either older or newer ones) than the package’s authors use.
Running the package’s own tests can quickly identify problems
specific to your configuration.
<p>
Again, upgrades should not be completely automatic.
You need to verify that the upgraded versions are appropriate for
your environment before deploying them.<a class=footnote id=body22 href="#note22"><sup>22</sup></a>
<p>
If your upgrade process includes re-running the
integration and qualification tests you’ve already written for the dependency,
so that you are likely to identify new problems before they reach production,
then, in most cases, delaying an upgrade is riskier than upgrading quickly.
<p>
The window for security-critical upgrades is especially short.
In the aftermath of the Equifax breach, forensic security teams found
evidence that attackers (perhaps different ones)
had successfully exploited the Apache Struts
vulnerability on the affected servers on March 10, only three days
after it was publicly disclosed, but they’d only run a single <code>whoami</code> command.
<a class=anchor href="#watch_your_dependencies"><h2 id="watch_your_dependencies">Watch your dependencies</h2></a>
<p>
Even after all that work, you’re not done tending your dependencies.
It’s important to continue to monitor them and perhaps even
re-evaluate your decision to use them.
<p>
First, make sure that you keep using the
specific package versions you think you are.
Most dependency managers now make it easy or even automatic
to record the cryptographic hash of the expected source code
for a given package version
and then to check that hash when re-downloading the package
on another computer or in a test environment.
This ensures that your builds use
the same dependency source code you inspected and tested.
These kinds of checks
prevented the <code>event-stream</code> attacker,
described earlier, from silently inserting
malicious code in the already-released version 3.3.5.
Instead, the attacker had to create a new version, 3.3.6,
and wait for people to upgrade (without looking closely at the changes).
<p>
It is also important to watch for new indirect dependencies creeping in:
upgrades can easily introduce new packages
upon which the success of your project now depends.
They deserve your attention as well.
In the case of <code>event-stream</code>, the malicious code was
hidden in a different package, <code>flatmap-stream</code>,
which the new <code>event-stream</code> release added as a
new dependency.
<p>
Creeping dependencies can also affect the size of your project.
During the development of Google’s Sawzall<a class=footnote id=body23 href="#note23"><sup>23</sup></a>—a JIT’ed
logs processing language—the authors discovered at various times that
the main interpreter binary contained not just Sawzall’s JIT
but also (unused) PostScript, Python, and JavaScript interpreters.
Each time, the culprit turned out to be unused dependencies
declared by some library Sawzall did depend on,
combined with the fact that Google’s build system
eliminated any manual effort needed to start using a new dependency.
This kind of error is the reason that the Go language
makes importing an unused package a compile-time error.
<p>
Upgrading is a natural time to revisit the decision to use a dependency that’s changing.
It’s also important to periodically revisit any dependency that <i>isn’t</i> changing.
Does it seem plausible that there are no security problems or other bugs to fix?
Has the project been abandoned?
Maybe it’s time to start planning to replace that dependency.
<p>
It’s also important to recheck the security history of each dependency.
For example, Apache Struts disclosed different major remote code execution
vulnerabilities in 2016, 2017, and 2018.
Even if you have a list of all the servers that run it and
update them promptly, that track record might make you rethink using it at all.
<a class=anchor href="#conclusion"><h2 id="conclusion">Conclusion</h2></a>
<p>
Software reuse is finally here,
and I don’t mean to understate its benefits:
it has brought an enormously positive transformation
for software developers.
Even so, we’ve accepted this transformation without
completely thinking through the potential consequences.
The old reasons for trusting dependencies are becoming less valid
at exactly the same time we have more dependencies than ever.
<p>
The kind of critical examination of specific dependencies that
I outlined in this article is a significant amount of work
and remains the exception rather than the rule.
But I doubt there are any developers who actually
make the effort to do this for every possible new dependency.
I have only done a subset of them for a subset of my own dependencies.
Most of the time the entirety of the decision is “let’s see what happens.”
Too often, anything more than that seems like too much effort.
<p>
But the Copay and Equifax attacks are clear warnings of
real problems in the way we consume software dependencies today.
We should not ignore the warnings.
I offer three broad recommendations.
<ol>
<li>
<p>
<i>Recognize the problem.</i>
If nothing else, I hope this article has convinced
you that there is a problem here worth addressing.
We need many people to focus significant effort on solving it.
<li>
<p>
<i>Establish best practices for today.</i>
We need to establish best practices for managing dependencies
using what’s available today.
This means working out processes that evaluate, reduce, and track risk,
from the original adoption decision through to production use.
In fact, just as some engineers specialize in testing,
it may be that we need engineers who specialize in managing dependencies.
<li>
<p>
<i>Develop better dependency technology for tomorrow.</i>
Dependency managers have essentially eliminated the cost of
downloading and installing a dependency.
Future development effort should focus on reducing the cost of
the kind of evaluation and maintenance necessary to use
a dependency.
For example, package discovery sites might work to find
more ways to allow developers to share their findings.
Build tools should, at the least, make it easy to run a package’s own tests.
More aggressively,
build tools and package management systems could also work together
to allow package authors to test new changes against all public clients
of their APIs.
Languages should also provide easy ways to isolate a suspect package.</ol>
<p>
There’s a lot of good software out there.
Let’s work together to find out how to reuse it safely.
<p>
<a class=anchor href="#references"><h2 id="references">References</h2></a>
<ol>
<li><a name=note1></a>
Rachel Potvin and Josh Levenberg, “Why Google Stores Billions of Lines of Code in a Single Repository,” <i>Communications of the ACM</i> 59(7) (July 2016), pp. 78-87. <a href="https://doi.org/10.1145/2854146">https://doi.org/10.1145/2854146</a> <a class=back href="#body1">(⇡)</a>
<li><a name=note2></a>
Russ Cox, “Go & Versioning,” February 2018. <a href="https://research.swtch.com/vgo">https://research.swtch.com/vgo</a> <a class=back href="#body2">(⇡)</a>
<li><a name=note3></a>
Ken Thompson, “Reflections on Trusting Trust,” <i>Communications of the ACM</i> 27(8) (August 1984), pp. 761–763. <a href="https://doi.org/10.1145/358198.358210">https://doi.org/10.1145/358198.358210</a> <a class=back href="#body3">(⇡)</a>
<li><a name=note4></a>
GNU Project, “GNU General Public License, version 1,” February 1989. <a href="https://www.gnu.org/licenses/old-licenses/gpl-1.0.html">https://www.gnu.org/licenses/old-licenses/gpl-1.0.html</a> <a class=back href="#body4">(⇡)</a>
<li><a name=note5></a>
Titus Winters, “SD-8: Standard Library Compatibility,” C++ Standing Document, August 2018. <a href="https://isocpp.org/std/standing-documents/sd-8-standard-library-compatibility">https://isocpp.org/std/standing-documents/sd-8-standard-library-compatibility</a> <a class=back href="#body5">(⇡)</a>
<li><a name=note6></a>
Go Project, “Go 1 and the Future of Go Programs,” September 2013. <a href="https://golang.org/doc/go1compat">https://golang.org/doc/go1compat</a> <a class=back href="#body6">(⇡)</a>
<li><a name=note7></a>
Facebook, “Infer: A tool to detect bugs in Java and C/C++/Objective-C code before it ships.” <a href="https://fbinfer.com/">https://fbinfer.com/</a> <a class=back href="#body7">(⇡)</a>
<li><a name=note8></a>
“SpotBugs: Find bugs in Java Programs.” <a href="https://spotbugs.github.io/">https://spotbugs.github.io/</a> <a class=back href="#body8">(⇡)</a>
<li><a name=note9></a>
D. Richard Hipp, “How SQLite is Tested.” <a href="https://www.sqlite.org/testing.html">https://www.sqlite.org/testing.html</a> <a class=back href="#body9">(⇡)</a>
<li><a name=note10></a>
Alexander Potapenko, “Testing Chromium: ThreadSanitizer v2, a next-gen data race detector,” April 2014. <a href="https://blog.chromium.org/2014/04/testing-chromium-threadsanitizer-v2.html">https://blog.chromium.org/2014/04/testing-chromium-threadsanitizer-v2.html</a> <a class=back href="#body10">(⇡)</a>
<li><a name=note11></a>
NIST, “National Vulnerability Database – Search and Statistics.” <a href="https://nvd.nist.gov/vuln/search">https://nvd.nist.gov/vuln/search</a> <a class=back href="#body11">(⇡)</a>
<li><a name=note12></a>
Russ Cox, “Regular Expression Matching with a Trigram Index, or How Google Code Search Worked,” January 2012. <a href="https://swtch.com/~rsc/regexp/regexp4.html">https://swtch.com/~rsc/regexp/regexp4.html</a> <a class=back href="#body12">(⇡)</a>
<li><a name=note13></a>
Google, “Google Open Source: Using Third-Party Licenses.” <a href="https://opensource.google.com/docs/thirdparty/licenses/#banned">https://opensource.google.com/docs/thirdparty/licenses/#banned</a> <a class=back href="#body13">(⇡)</a>
<li><a name=note14></a>
Nathan Willis, “A single Node of failure,” LWN, March 2016. <a href="https://lwn.net/Articles/681410/">https://lwn.net/Articles/681410/</a> <a class=back href="#body14">(⇡)</a>
<li><a name=note15></a>
Charlie Reis, “Multi-process Architecture,” September 2008. <a href="https://blog.chromium.org/2008/09/multi-process-architecture.html">https://blog.chromium.org/2008/09/multi-process-architecture.html</a> <a class=back href="#body15">(⇡)</a>
<li><a name=note16></a>
Adam Langley, “Chromium’s seccomp Sandbox,” August 2009. <a href="https://www.imperialviolet.org/2009/08/26/seccomp.html">https://www.imperialviolet.org/2009/08/26/seccomp.html</a> <a class=back href="#body16">(⇡)</a>
<li><a name=note17></a>
Nicolas Lacasse, “Open-sourcing gVisor, a sandboxed container runtime,” May 2018. <a href="https://cloud.google.com/blog/products/gcp/open-sourcing-gvisor-a-sandboxed-container-runtime">https://cloud.google.com/blog/products/gcp/open-sourcing-gvisor-a-sandboxed-container-runtime</a> <a class=back href="#body17">(⇡)</a>
<li><a name=note18></a>
Adam Baldwin, “Details about the event-stream incident,” November 2018. <a href="https://blog.npmjs.org/post/180565383195/details-about-the-event-stream-incident">https://blog.npmjs.org/post/180565383195/details-about-the-event-stream-incident</a> <a class=back href="#body18">(⇡)</a>
<li><a name=note19></a>
Russ Cox, “RE2: a principled approach to regular expression matching,” March 2010. <a href="https://opensource.googleblog.com/2010/03/re2-principled-approach-to-regular.html">https://opensource.googleblog.com/2010/03/re2-principled-approach-to-regular.html</a> <a class=back href="#body19">(⇡)</a>
<li><a name=note20></a>
Rob Pike, “Go Proverbs,” November 2015. <a href="https://go-proverbs.github.io/">https://go-proverbs.github.io/</a> <a class=back href="#body20">(⇡)</a>
<li><a name=note21></a>
U.S. House of Representatives Committee on Oversight and Government Reform, “The Equifax Data Breach,” Majority Staff Report, 115th Congress, December 2018. <a href="https://oversight.house.gov/report/committee-releases-report-revealing-new-information-on-equifax-data-breach/">https://oversight.house.gov/report/committee-releases-report-revealing-new-information-on-equifax-data-breach/</a> <a class=back href="#body21">(⇡)</a>
<li><a name=note22></a>
Russ Cox, “The Principles of Versioning in Go,” GopherCon Singapore, May 2018. <a href="https://www.youtube.com/watch?v=F8nrpe0XWRg">https://www.youtube.com/watch?v=F8nrpe0XWRg</a> <a class=back href="#body22">(⇡)</a>
<li><a name=note23></a>
Rob Pike, Sean Dorward, Robert Griesemer, and Sean Quinlan, “Interpreting the Data: Parallel Analysis with Sawzall,” <i>Scientific Programming Journal</i>, vol. 13 (2005). <a href="https://doi.org/10.1155/2005/962135">https://doi.org/10.1155/2005/962135</a> <a class=back href="#body23">(⇡)</a></ol>
<a class=anchor href="#coda"><h2 id="coda">Coda</h2></a>
<p>
This post is a draft of my current thinking on this topic.
I hope that sharing it will provoke productive discussion,
attract more attention to the general problem,
and help me refine my own thoughts.
I also intend to publish a revised copy of this as an article elsewhere.
For both these reasons, unlike most of my blog posts,
<i>this post is not Creative Commons-licensed</i>.
Please link people to this post instead of making a copy.
When a more final version is published, I will link to it here.
<p class=copyright>
© 2019 Russ Cox. All Rights Reserved.
What is Software Engineering?
tag:research.swtch.com,2012:research.swtch.com/vgo-eng
2018-05-30T10:00:00-04:00
2018-05-30T10:02:00-04:00
What is software engineering and what does Go mean by it? (Go & Versioning, Part 9)
<p>
Nearly all of Go’s distinctive design decisions
were aimed at making software engineering simpler and easier.
We've said this often.
The canonical reference is Rob Pike's 2012 article,
“<a href="https://talks.golang.org/2012/splash.article">Go at Google: Language Design in the Service of Software Engineering</a>.”
But what is software engineering?<blockquote>
<p>
<i>Software engineering is what happens to programming
<br>when you add time and other programmers.</i></blockquote>
<p>
Programming means getting a program working.
You have a problem to solve, you write some Go code,
you run it, you get your answer, you’re done.
That’s programming,
and that's difficult enough by itself.
But what if that code has to keep working, day after day?
What if five other programmers need to work on the code too?
Then you start to think about version control systems,
to track how the code changes over time
and to coordinate with the other programmers.
You add unit tests,
to make sure bugs you fix are not reintroduced over time,
not by you six months from now,
and not by that new team member who’s unfamiliar with the code.
You think about modularity and design patterns,
to divide the program into parts that team members
can work on mostly independently.
You use tools to help you find bugs earlier.
You look for ways to make programs as clear as possible,
so that bugs are less likely.
You make sure that small changes can be tested quickly,
even in large programs.
You're doing all of this because your programming
has turned into software engineering.
<p>
(This definition and explanation of software engineering
is my riff on an original theme by my Google colleague Titus Winters,
whose preferred phrasing is “software engineering is programming integrated over time.”
It's worth seven minutes of your time to see
<a href="https://www.youtube.com/watch?v=tISy7EJQPzI&t=8m17s">his presentation of this idea at CppCon 2017</a>,
from 8:17 to 15:00 in the video.)
<p>
As I said earlier,
nearly all of Go’s distinctive design decisions
have been motivated by concerns about software engineering,
by trying to accommodate time and other programmers
into the daily practice of programming.
<p>
For example, most people think that we format Go code with <code>gofmt</code>
to make code look nicer or to end debates among
team members about program layout.
But the <a href="https://groups.google.com/forum/#!msg/golang-nuts/HC2sDhrZW5Y/7iuKxdbLExkJ">most important reason for <code>gofmt</code></a>
is that if an algorithm defines how Go source code is formatted,
then programs, like <code>goimports</code> or <code>gorename</code> or <code>go</code> <code>fix</code>,
can edit the source code more easily,
without introducing spurious formatting changes when writing the code back.
This helps you maintain code over time.
<p>
As another example, Go import paths are URLs.
If code said <code>import</code> <code>"uuid"</code>,
you’d have to ask which <code>uuid</code> package.
Searching for <code>uuid</code> on <a href="https://godoc.org">godoc.org</a> turns up dozens of packages.
If instead the code says <code>import</code> <code>"github.com/pborman/uuid"</code>,
now it’s clear which package we mean.
Using URLs avoids ambiguity
and also reuses an existing mechanism for giving out names,
making it simpler and easier to coordinate with other programmers.
<p>
Continuing the example,
Go import paths are written in Go source files,
not in a separate build configuration file.
This makes Go source files self-contained,
which makes it easier to understand, modify, and copy them.
These decisions, and more, were all made with the goal of
simplifying software engineering.
<p>
In later posts I will talk specifically about why
versions are important for software engineering
and how software engineering concerns motivate
the design changes from dep to vgo.
Go and Dogma
tag:research.swtch.com,2012:research.swtch.com/dogma
2017-01-09T09:00:00-05:00
2017-01-09T09:02:00-05:00
Programming language dogmatics.
<p>
[<i>Cross-posting from last year’s <a href="https://www.reddit.com/r/golang/comments/46bd5h/ama_we_are_the_go_contributors_ask_us_anything/d05yyde/?context=3&st=ixq5hjko&sh=7affd469">Go contributors AMA</a> on Reddit, because it’s still important to remember.</i>]
<p>
One of the perks of working on Go these past years has been the chance to have many great discussions with other language designers and implementers, for example about how well various design decisions worked out or the common problems of implementing what look like very different languages (for example both Go and Haskell need some kind of “green threads”, so there are more shared runtime challenges than you might expect). In one such conversation, when I was talking to a group of early Lisp hackers, one of them pointed out that these discussions are basically never dogmatic. Designers and implementers remember working through the good arguments on both sides of a particular decision, and they’re often eager to hear about someone else’s experience with what happens when you make that decision differently. Contrast that kind of discussion with the heated arguments or overly zealous statements you sometimes see from users of the same languages. There’s a real disconnect, possibly because the users don’t have the experience of weighing the arguments on both sides and don’t realize how easily a particular decision might have gone the other way.
<p>
Language design and implementation is engineering. We make decisions using evaluations of costs and benefits or, if we must, using predictions of those based on past experience. I think we have an important responsibility to explain both sides of a particular decision, to make clear that the arguments for an alternate decision are actually good ones that we weighed and balanced, and to avoid the suggestion that particular design decisions approach dogma. I hope <a href="https://www.reddit.com/r/golang/comments/46bd5h/ama_we_are_the_go_contributors_ask_us_anything/d05yyde/?context=3&st=ixq5hjko&sh=7affd469">the Reddit AMA</a> as well as discussion on <a href="https://groups.google.com/group/golang-nuts">golang-nuts</a> or <a href="http://stackoverflow.com/questions/tagged/go">StackOverflow</a> or the <a href="https://forum.golangbridge.org/">Go Forum</a> or at <a href="https://golang.org/wiki/Conferences">conferences</a> help with that.
<p>
But we need help from everyone. Remember that none of the decisions in Go are infallible; they’re just our best attempts at the time we made them, not wisdom received on stone tablets. If someone asks why Go does X instead of Y, please try to present the engineering reasons fairly, including for Y, and avoid argument solely by appeal to authority. It’s too easy to fall into the “well that’s just not how it’s done here” trap. And now that I know about and watch for that trap, I see it in nearly every technical community, although some more than others.
A Tour of Acme
tag:research.swtch.com,2012:research.swtch.com/acme
2012-09-17T11:00:00-04:00
2012-09-17T11:00:00-04:00
A video introduction to Acme, the Plan 9 text editor
<p class="lp">
People I work with recognize my computer easily:
it's the one with nothing but yellow windows and blue bars on the screen.
That's the text editor acme, written by Rob Pike for Plan 9 in the early 1990s.
Acme focuses entirely on the idea of text as user interface.
It's difficult to explain acme without seeing it, though, so I've put together
a screencast explaining the basics of acme and showing a brief programming session.
Remember as you watch the video that the 854x480 screen is quite cramped.
Usually you'd run acme on a larger screen: even my MacBook Air has almost four times
as much screen real estate.
</p>
<center>
<div style="border: 1px solid black; width: 853px; height: 480px;"><iframe width="853" height="480" src="https://www.youtube.com/embed/dP1xVpMPn8M?rel=0" frameborder="0" allowfullscreen></iframe></div>
</center>
<p class=pp>
The video doesn't show everything acme can do, nor does it show all the ways you can use it.
Even small idioms like where you type text to be loaded or executed vary from user to user.
To learn more about acme, read Rob Pike's paper “<a href="/acme.pdf">Acme: A User Interface for Programmers</a>” and then try it.
</p>
<p class=pp>
Acme runs on most operating systems.
If you use <a href="http://plan9.bell-labs.com/plan9/">Plan 9 from Bell Labs</a>, you already have it.
If you use FreeBSD, Linux, OS X, or most other Unix clones, you can get it as part of <a href="http://swtch.com/plan9port/">Plan 9 from User Space</a>.
If you use Windows, I suggest trying acme as packaged in <a href="http://code.google.com/p/acme-sac/">acme stand alone complex</a>, which is based on the Inferno programming environment.
</p>
<p class=lp><b>Mini-FAQ</b>:
<ul>
<li><i>Q. Can I use scalable fonts?</i> A. On the Mac, yes. If you run <code>acme -f /mnt/font/Monaco/16a/font</code> you get 16-point anti-aliased Monaco as your font, served via <a href="http://swtch.com/plan9port/man/man4/fontsrv.html">fontsrv</a>. If you'd like to add X11 support to fontsrv, I'd be happy to apply the patch.
<li><i>Q. Do I need X11 to build on the Mac?</i> A. No. The build will complain that it cannot build ‘snarfer’ but it should complete otherwise. You probably don't need snarfer.
</ul>
<p class=pp>
If you're interested in history, the predecessor to acme was called help. Rob Pike's paper “<a href="/help.pdf">A Minimalist Global User Interface</a>” describes it. See also “<a href="/sam.pdf">The Text Editor sam</a>.”
</p>
<p class=pp>
<i>Correction</i>: the smiley program in the video was written by Ken Thompson.
I got it from Dennis Ritchie, the more meticulous archivist of the pair.
</p>
Minimal Boolean Formulas
tag:research.swtch.com,2012:research.swtch.com/boolean
2011-05-18T00:00:00-04:00
2011-05-18T00:00:00-04:00
Simplify equations with God
<p><style type="text/css">
p { line-height: 150%; }
blockquote { text-align: left; }
pre.alg { font-family: sans-serif; font-size: 100%; margin-left: 60px; }
td, th { padding-left: 5px; padding-right: 5px; vertical-align: top; }
#times td { text-align: right; }
table { padding-top: 1em; padding-bottom: 1em; }
#find td { text-align: center; }
</style>
<p class=lp>
<a href="http://oeis.org/A056287">28</a>.
That's the minimum number of AND or OR operators
you need in order to write any Boolean function of five variables.
<a href="http://alexhealy.net/">Alex Healy</a> and I computed that in April 2010. Until then,
I believe no one had ever known that little fact.
This post describes how we computed it
and how we almost got scooped by <a href="http://research.swtch.com/2011/01/knuth-volume-4a.html">Knuth's Volume 4A</a>,
which considers the problem for AND, OR, and XOR.
</p>
<h3>A Naive Brute Force Approach</h3>
<p class=pp>
Any Boolean function of two variables
can be written with at most 3 AND or OR operators: the parity function
on two variables X XOR Y is (X AND Y') OR (X' AND Y), where X' denotes
“not X.” We can shorten the notation by writing AND and OR
like multiplication and addition: X XOR Y = X*Y' + X'*Y.
</p>
<p class=pp>
For three variables, parity is also a hardest function, requiring 9 operators:
X XOR Y XOR Z = (X*Z'+X'*Z+Y')*(X*Z+X'*Z'+Y).
</p>
<p class=pp>
For four variables, parity is still a hardest function, requiring 15 operators:
W XOR X XOR Y XOR Z = (X*Z'+X'*Z+W'*Y+W*Y')*(X*Z+X'*Z'+W*Y+W'*Y').
</p>
<p class=pp>
The sequence so far prompts a few questions. Is parity always a hardest function?
Does the minimum number of operators alternate between 2<sup>n</sup>−1 and 2<sup>n</sup>+1?
</p>
<p class=pp>
I computed these results in January 2001 after hearing
the problem from Neil Sloane, who suggested it as a variant
of a similar problem first studied by Claude Shannon.
</p>
<p class=pp>
The program I wrote to compute a(4) computes the minimum number of
operators for every Boolean function of n variables
in order to find the largest minimum over all functions.
There are 2<sup>4</sup> = 16 settings of four variables, and each function
can pick its own value for each setting, so there are 2<sup>16</sup> different
functions. To make matters worse, you build new functions
by taking pairs of old functions and joining them with AND or OR.
2<sup>16</sup> different functions means 2<sup>16</sup>·2<sup>16</sup> = 2<sup>32</sup> pairs of functions.
</p>
<p class=pp>
The program I wrote was a mangling of the Floyd-Warshall
all-pairs shortest paths algorithm. That algorithm is:
</p>
<pre class="indent alg">
// Floyd-Warshall all pairs shortest path
func compute():
	for each node i
		for each node j
			dist[i][j] = direct distance, or ∞
	for each node k
		for each node i
			for each node j
				d = dist[i][k] + dist[k][j]
				if d < dist[i][j]
					dist[i][j] = d
	return
</pre>
<p class=lp>
The algorithm begins with the distance table dist[i][j] set to
an actual distance if i is connected to j and infinity otherwise.
Then each round updates the table to account for paths
going through the node k: if it's shorter to go from i to k to j,
it saves that shorter distance in the table. The nodes are
numbered from 0 to n, so the variables i, j, k are simply integers.
Because there are only n nodes, we know we'll be done after
the outer loop finishes.
</p>
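<p class=pp>
The pseudocode translates directly into Go. This is an illustrative sketch, not code from the original program; the three-node example graph and the <code>inf</code> sentinel are inventions of the example.
</p>
<pre class="indent alg">
package main

import "fmt"

const inf = 1 &lt;&lt; 29 // stands in for ∞; small enough that inf+inf doesn't overflow

// floydWarshall updates dist in place so that dist[i][j] ends up
// holding the length of the shortest path from i to j.
func floydWarshall(dist [][]int) {
	for k := range dist {
		for i := range dist {
			for j := range dist {
				if d := dist[i][k] + dist[k][j]; d &lt; dist[i][j] {
					dist[i][j] = d
				}
			}
		}
	}
}

func main() {
	// Three nodes: the direct edge 0→2 costs 5, but going 0→1→2 costs 2.
	dist := [][]int{
		{0, 1, 5},
		{inf, 0, 1},
		{inf, inf, 0},
	}
	floydWarshall(dist)
	fmt.Println(dist[0][2]) // 2
}
</pre>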
<p class=pp>
The program I wrote to find minimum Boolean formula sizes is
an adaptation, substituting formula sizes for distance.
</p>
<pre class="indent alg">
// Algorithm 1
func compute()
	for each function f
		size[f] = ∞
	for each single variable function f = v
		size[f] = 0
	loop
		changed = false
		for each function f
			for each function g
				d = size[f] + 1 + size[g]
				if d < size[f OR g]
					size[f OR g] = d
					changed = true
				if d < size[f AND g]
					size[f AND g] = d
					changed = true
		if not changed
			return
</pre>
<p class=lp>
Algorithm 1 runs the same kind of iterative update loop as the Floyd-Warshall algorithm,
but it isn't as obvious when you can stop, because you don't
know the maximum formula size beforehand.
So it runs until a round doesn't find any new functions to make,
iterating until it finds a fixed point.
</p>
<p class=pp>
The pseudocode above glosses over some details, such as
the fact that the per-function loops can iterate over a
queue of functions known to have finite size, so that each
loop omits the functions that aren't
yet known. That's only a constant factor improvement,
but it's a useful one.
</p>
<p class=pp>
Another important detail missing above
is the representation of functions. The most convenient
representation is a binary truth table.
For example,
if we are computing the complexity of two-variable functions,
there are four possible inputs, which we can number as follows.
</p>
<center>
<table>
<tr><th>X <th>Y <th>Value
<tr><td>false <td>false <td>00<sub>2</sub> = 0
<tr><td>false <td>true <td>01<sub>2</sub> = 1
<tr><td>true <td>false <td>10<sub>2</sub> = 2
<tr><td>true <td>true <td>11<sub>2</sub> = 3
</table>
</center>
<p class=pp>
The functions are then the 4-bit numbers giving the value of the
function for each input. For example, function 13 = 1101<sub>2</sub>
is true for all inputs except X=false Y=true.
Three-variable functions correspond to 3-bit inputs generating 8-bit truth tables,
and so on.
</p>
<p class=pp>
This representation has two key advantages. The first is that
the numbering is dense, so that you can implement a map keyed
by function using a simple array. The second is that the operations
“f AND g” and “f OR g” can be implemented using
bitwise operators: the truth table for “f AND g” is the bitwise
AND of the truth tables for f and g.
</p>
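<p class=pp>
To make the representation concrete, here is a small Go sketch (not the original 2001 program) that runs Algorithm 1 on the sixteen two-variable functions, using 4-bit truth tables, bitwise OR for “f OR g,” and bitwise AND for “f AND g”:
</p>
<pre class="indent alg">
package main

import "fmt"

// minSizes runs Algorithm 1 on the two-variable Boolean functions,
// represented as 4-bit truth tables (0..15). It returns the minimum
// number of AND/OR operators needed to write each function over the
// literals X, X', Y, Y'.
func minSizes() []int {
	const nf = 16 // 2^(2^2) functions of two variables
	const mask = nf - 1
	const inf = 1 &lt;&lt; 30
	size := make([]int, nf)
	for i := range size {
		size[i] = inf
	}
	// Literals cost zero operators: X = 1010₂, Y = 1100₂, plus their negations.
	for _, v := range []int{0b1010, 0b1100} {
		size[v] = 0
		size[^v&mask] = 0
	}
	// Iterate to a fixed point, as in Algorithm 1.
	for changed := true; changed; {
		changed = false
		for f := 0; f &lt; nf; f++ {
			if size[f] == inf {
				continue
			}
			for g := 0; g &lt; nf; g++ {
				if size[g] == inf {
					continue
				}
				d := size[f] + 1 + size[g]
				if d &lt; size[f|g] { // truth table of "f OR g" is bitwise OR
					size[f|g], changed = d, true
				}
				if d &lt; size[f&g] { // truth table of "f AND g" is bitwise AND
					size[f&g], changed = d, true
				}
			}
		}
	}
	return size
}

func main() {
	size := minSizes()
	fmt.Println(size[0b0110]) // parity X XOR Y, a hardest 2-variable function: 3
}
</pre>
<p class=lp>
It confirms that parity (truth table 0110<sub>2</sub>) needs 3 operators and that no two-variable function needs more.
</p>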
<p class=pp>
That program worked well enough in 2001 to compute the
minimum number of operators necessary to write any
1-, 2-, 3-, and 4-variable Boolean function. Each round
takes asymptotically O(2<sup>2<sup>n</sup></sup>·2<sup>2<sup>n</sup></sup>) = O(2<sup>2<sup>n+1</sup></sup>) time, and the number of
rounds needed is O(the final answer). The answer for n=4
is 15, so the computation required on the order of
15·2<sup>2<sup>5</sup></sup> = 15·2<sup>32</sup> iterations of the innermost loop.
That was plausible on the computer I was using at
the time, but the answer for n=5, likely around 30,
would need 30·2<sup>64</sup> iterations to compute, which
seemed well out of reach.
At the time, it seemed plausible that parity was always
a hardest function and that the minimum size would
continue to alternate between 2<sup>n</sup>−1 and 2<sup>n</sup>+1.
It's a nice pattern.
</p>
<h3>Exploiting Symmetry</h3>
<p class=pp>
Five years later, though, Alex Healy and I got to talking about this sequence,
and Alex shot down both conjectures using results from the theory
of circuit complexity. (Theorists!) Neil Sloane added this note to
the <a href="http://oeis.org/history?seq=A056287">entry for the sequence</a> in his Online Encyclopedia of Integer Sequences:
</p>
<blockquote>
<tt>
%E A056287 Russ Cox conjectures that X<sub>1</sub> XOR ... XOR X<sub>n</sub> is always a worst f and that a(5) = 33 and a(6) = 63. But (Jan 27 2006) Alex Healy points out that this conjecture is definitely false for large n. So what is a(5)?
</tt>
</blockquote>
<p class=lp>
Indeed. What is a(5)? No one knew, and it wasn't obvious how to find out.
</p>
<p class=pp>
In January 2010, Alex and I started looking into ways to
speed up the computation for a(5). 30·2<sup>64</sup> is too many
iterations but maybe we could find ways to cut that number.
</p>
<p class=pp>
In general, if we can identify a class of functions f whose
members are guaranteed to have the same complexity,
then we can save just one representative of the class as
long as we recreate the entire class in the loop body.
What used to be:
</p>
<pre class="indent alg">
for each function f
	for each function g
		visit f AND g
		visit f OR g
</pre>
<p class=lp>
can be rewritten as
</p>
<pre class="indent alg">
for each canonical function f
	for each canonical function g
		for each ff equivalent to f
			for each gg equivalent to g
				visit ff AND gg
				visit ff OR gg
</pre>
<p class=lp>
That doesn't look like an improvement: it's doing all
the same work. But it can open the door to new optimizations
depending on the equivalences chosen.
For example, the functions “f” and “¬f” are guaranteed
to have the same complexity, by <a href="http://en.wikipedia.org/wiki/De_Morgan's_laws">De Morgan's laws</a>.
If we keep just one of
those two on the lists that “for each function” iterates over,
we can unroll the inner two loops, producing:
</p>
<pre class="indent alg">
for each canonical function f
	for each canonical function g
		visit f OR g
		visit f AND g
		visit ¬f OR g
		visit ¬f AND g
		visit f OR ¬g
		visit f AND ¬g
		visit ¬f OR ¬g
		visit ¬f AND ¬g
</pre>
<p class=lp>
That's still not an improvement, but it's no worse.
Each of the two loops considers half as many functions
but the inner iteration is four times longer.
Now we can notice that half of the tests aren't
worth doing: “f AND g” is the negation of
“¬f OR ¬g,” and so on, so only half
of them are necessary.
</p>
<p class=pp>
Let's suppose that when choosing between “f” and “¬f”
we keep the one that is false when presented with all true inputs.
(This has the nice property that <code>f ^ (int32(f) >> 31)</code>
is the truth table for the canonical form of <code>f</code>.)
Then we can tell which combinations above will produce
canonical functions when f and g are already canonical:
</p>
<pre class="indent alg">
for each canonical function f
	for each canonical function g
		visit f OR g
		visit f AND g
		visit ¬f AND g
		visit f AND ¬g
</pre>
<p class=lp>
That's a factor of two improvement over the original loop.
</p>
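<p class=pp>
In Go, with <code>uint32</code> truth tables for 5-input functions, the canonicalization rule is one line (the function name here is mine):
</p>
<pre class="indent alg">
package main

import "fmt"

// canon returns the canonical representative of the pair {f, ¬f}:
// the one that is false on the all-true input, whose value occupies
// the top bit of the 32-bit truth table. The arithmetic shift of
// int32(f) smears that top bit into 0x00000000 or 0xffffffff.
func canon(f uint32) uint32 {
	return f ^ uint32(int32(f)&gt;&gt;31)
}

func main() {
	const f = 0xAAAAAAAA // truth table of the single-variable function V
	fmt.Printf("%#x %#x\n", canon(f), canon(^uint32(f))) // both print 0x55555555
}
</pre>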
<p class=pp>
Another observation is that permuting
the inputs to a function doesn't change its complexity:
“f(V, W, X, Y, Z)” and “f(Z, Y, X, W, V)” will have the same
minimum size. For complex functions, each of the
5! = 120 permutations will produce a different truth table.
A factor of 120 reduction in storage is good but again
we have the problem of expanding the class in the
iteration. This time, there's a different trick for reducing
the work in the innermost iteration.
Since we only need to produce one member of
the equivalence class, it doesn't make sense to
permute the inputs to both f and g. Instead,
permuting just the inputs to f while fixing g
is guaranteed to hit at least one member
of each class that permuting both f and g would.
So we gain the factor of 120 twice in the loops
and lose it once in the iteration, for a net savings
of 120.
(In some ways, this is the same trick we did with “f” vs “¬f.”)
</p>
<p class=pp>
A final observation is that negating any of the inputs
to the function doesn't change its complexity,
because X and X' have the same complexity.
The same argument we used for permutations applies
here, for another constant factor of 2<sup>5</sup> = 32.
</p>
<p class=pp>
The code stores a single function for each equivalence class
and then recomputes the equivalent functions for f, but not g.
</p>
<pre class="indent alg">
for each canonical function f
	for each function ff equivalent to f
		for each canonical function g
			visit ff OR g
			visit ff AND g
			visit ¬ff AND g
			visit ff AND ¬g
</pre>
<p class=lp>
In all, we just got a savings of 2·120·32 = 7680,
cutting the total number of iterations from 30·2<sup>64</sup> = 5×10<sup>20</sup>
to 7×10<sup>16</sup>. If you figure we can do around
10<sup>9</sup> iterations per second, that's still 800 days of CPU time.
</p>
<p class=pp>
The full algorithm at this point is:
</p>
<pre class="indent alg">
// Algorithm 2
func compute():
	for each function f
		size[f] = ∞
	for each single variable function f = v
		size[f] = 0
	loop
		changed = false
		for each canonical function f
			for each function ff equivalent to f
				for each canonical function g
					d = size[ff] + 1 + size[g]
					changed |= visit(d, ff OR g)
					changed |= visit(d, ff AND g)
					changed |= visit(d, ff AND ¬g)
					changed |= visit(d, ¬ff AND g)
		if not changed
			return

func visit(d, fg):
	if size[fg] != ∞
		return false
	record fg as canonical
	for each function ffgg equivalent to fg
		size[ffgg] = d
	return true
</pre>
<p class=lp>
The helper function “visit” must set the size not only of its argument fg
but also all equivalent functions under permutation or inversion of the inputs,
so that future tests will see that they have been computed.
</p>
<h3>Methodical Exploration</h3>
<p class=pp>
There's one final improvement we can make.
The approach of looping until things stop changing
considers each function pair multiple times
as their sizes go down. Instead, we can consider functions
in order of complexity, so that the main loop
builds first all the functions of minimum complexity 1,
then all the functions of minimum complexity 2,
and so on. If we do that, we'll consider each function pair at most once.
We can stop when all functions are accounted for.
</p>
<p class=pp>
Applying this idea to Algorithm 1 (before canonicalization) yields:
</p>
<pre class="indent alg">
// Algorithm 3
func compute()
	for each function f
		size[f] = ∞
	for each single variable function f = v
		size[f] = 0
	for k = 1 to ∞
		for each function f
			for each function g of size k − size(f) − 1
				if size[f AND g] == ∞
					size[f AND g] = k
					nsize++
				if size[f OR g] == ∞
					size[f OR g] = k
					nsize++
		if nsize == 2<sup>2<sup>n</sup></sup>
			return
</pre>
<p class=lp>
Applying the idea to Algorithm 2 (after canonicalization) yields:
</p>
<pre class="indent alg">
// Algorithm 4
func compute():
	for each function f
		size[f] = ∞
	for each single variable function f = v
		size[f] = 0
	for k = 1 to ∞
		for each canonical function f
			for each function ff equivalent to f
				for each canonical function g of size k − size(f) − 1
					visit(k, ff OR g)
					visit(k, ff AND g)
					visit(k, ff AND ¬g)
					visit(k, ¬ff AND g)
		if nvisited == 2<sup>2<sup>n</sup></sup>
			return

func visit(d, fg):
	if size[fg] != ∞
		return
	record fg as canonical
	for each function ffgg equivalent to fg
		if size[ffgg] == ∞
			size[ffgg] = d
			nvisited += 2 // counts ffgg and ¬ffgg
	return
</pre>
<p class=lp>
The original loop in Algorithms 1 and 2 considered each pair f, g in every
iteration of the loop after they were computed.
The new loop in Algorithms 3 and 4 considers each pair f, g only once,
when k = size(f) + size(g) + 1. This removes the
leading factor of 30 (the number of times we
expected the first loop to run) from our estimation
of the run time.
Now the expected number of iterations is around
2<sup>64</sup>/7680 = 2.4×10<sup>15</sup>. If we can do 10<sup>9</sup> iterations
per second, that's only 28 days of CPU time,
which I can deliver if you can wait a month.
</p>
<p class=pp>
Our estimate does not include the fact that not all function pairs need
to be considered. For example, if the maximum size is 30, then the
functions of size 14 need never be paired against the functions of size 16,
because any result would have size 14+1+16 = 31.
So even 2.4×10<sup>15</sup> is an overestimate, but it's in the right ballpark.
(With hindsight I can report that only 1.7×10<sup>14</sup> pairs need to be considered
but also that our estimate of 10<sup>9</sup> iterations
per second was optimistic. The actual calculation ran for 20 days,
an average of about 10<sup>8</sup> iterations per second.)
</p>
<h3>Endgame: Directed Search</h3>
<p class=pp>
A month is still a long time to wait, and we can do better.
Near the end (after k is bigger than, say, 22), we are exploring
the fairly large space of function pairs in hopes of finding a
fairly small number of remaining functions.
At that point it makes sense to change from the
bottom-up “bang things together and see what we make”
to the top-down “try to make this one of these specific functions.”
That is, the core of the current search is:
</p>
<pre class="indent alg">
for each canonical function f
	for each function ff equivalent to f
		for each canonical function g of size k − size(f) − 1
			visit(k, ff OR g)
			visit(k, ff AND g)
			visit(k, ff AND ¬g)
			visit(k, ¬ff AND g)
</pre>
<p class=lp>
We can change it to:
</p>
<pre class="indent alg">
for each missing function fg
	for each canonical function g
		for all possible f such that one of these holds
			* fg = f OR g
			* fg = f AND g
			* fg = ¬f AND g
			* fg = f AND ¬g
			if size[f] == k − size(g) − 1
				visit(k, fg)
				next fg
</pre>
<p class=lp>
By the time we're at the end, exploring all the possible f to make
the missing functions—a directed search—is much less work than
the brute force of exploring all combinations.
</p>
<p class=pp>
As an example, suppose we are looking for f such that fg = f OR g.
The equation is only possible to satisfy if fg OR g == fg.
That is, if g has any extraneous 1 bits, no f will work, so we can move on.
Otherwise, the remaining condition is that
f AND ¬g == fg AND ¬g. That is, for the bit positions where g is 0, f must match fg.
The other bits of f (the bits where g has 1s)
can take any value.
We can enumerate the possible f values by recursively trying all
possible values for the “don't care” bits.
</p>
<pre class="indent alg">
func find(x, any, xsize):
	if size(x) == xsize
		return x
	while any != 0
		bit = any AND −any // rightmost 1 bit in any
		any = any AND ¬bit
		if f = find(x OR bit, any, xsize) succeeds
			return f
	return failure
</pre>
<p class=lp>
It doesn't matter which 1 bit we choose for the recursion,
but finding the rightmost 1 bit is cheap: it is isolated by the
(admittedly surprising) expression “any AND −any.”
</p>
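<p class=pp>
The recursion translates directly into Go. In this sketch, a caller-supplied predicate <code>ok</code> stands in for the size-table lookup <code>size(x) == xsize</code>; the example inputs are inventions for illustration, not data from the actual run:
</p>
<pre class="indent alg">
package main

import "fmt"

// find tries to complete the base truth table x by turning on some
// subset of the "don't care" bits in any, returning the first
// completion accepted by ok.
func find(x, any uint32, ok func(uint32) bool) (uint32, bool) {
	if ok(x) {
		return x, true
	}
	for any != 0 {
		bit := any &amp; -any // isolate the rightmost 1 bit
		any &amp;^= bit       // clear it; the recursion handles the rest
		if f, found := find(x|bit, any, ok); found {
			return f, true
		}
	}
	return 0, false
}

func main() {
	// Look for f with fg = f OR g, using 4-bit two-variable truth tables.
	const fg, g = 0b1110, 0b0100
	onlyX := func(f uint32) bool { return f == 0b1010 } // stand-in for "f has the right size"
	if f, found := find(fg&amp;^g, g, onlyX); found {
		fmt.Printf("%04b\n", f|g) // 1110: f OR g reassembles fg
	}
}
</pre>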
<p class=pp>
Given <code>find</code>, the loop above can try these four cases:
</p>
<center>
<table id=find>
<tr><th>Formula <th>Condition <th>Base x <th>“Any” bits
<tr><td>fg = f OR g <td>fg OR g == fg <td>fg AND ¬g <td>g
<tr><td>fg = f OR ¬g <td>fg OR ¬g == fg <td>fg AND g <td>¬g
<tr><td>¬fg = f OR g <td>¬fg OR g == ¬fg <td>¬fg AND ¬g <td>g
<tr><td>¬fg = f OR ¬g <td>¬fg OR ¬g == ¬fg <td>¬fg AND g <td>¬g
</table>
</center>
<p class=lp>
Rewriting the Boolean expressions to use only the four OR forms
means that we only need to write the “adding bits” version of find.
</p>
<p class=pp>
The final algorithm is:
</p>
<pre class="indent alg">
// Algorithm 5
func compute():
	for each function f
		size[f] = ∞
	for each single variable function f = v
		size[f] = 0
	// Generate functions.
	for k = 1 to max_generate
		for each canonical function f
			for each function ff equivalent to f
				for each canonical function g of size k − size(f) − 1
					visit(k, ff OR g)
					visit(k, ff AND g)
					visit(k, ff AND ¬g)
					visit(k, ¬ff AND g)
	// Search for functions.
	for k = max_generate+1 to ∞
		for each missing function fg
			for each canonical function g
				fsize = k − size(g) − 1
				if fg OR g == fg
					if f = find(fg AND ¬g, g, fsize) succeeds
						visit(k, fg)
						next fg
				if fg OR ¬g == fg
					if f = find(fg AND g, ¬g, fsize) succeeds
						visit(k, fg)
						next fg
				if ¬fg OR g == ¬fg
					if f = find(¬fg AND ¬g, g, fsize) succeeds
						visit(k, fg)
						next fg
				if ¬fg OR ¬g == ¬fg
					if f = find(¬fg AND g, ¬g, fsize) succeeds
						visit(k, fg)
						next fg
		if nvisited == 2<sup>2<sup>n</sup></sup>
			return

func visit(d, fg):
	if size[fg] != ∞
		return
	record fg as canonical
	for each function ffgg equivalent to fg
		if size[ffgg] == ∞
			size[ffgg] = d
			nvisited += 2 // counts ffgg and ¬ffgg
	return

func find(x, any, xsize):
	if size(x) == xsize
		return x
	while any != 0
		bit = any AND −any // rightmost 1 bit in any
		any = any AND ¬bit
		if f = find(x OR bit, any, xsize) succeeds
			return f
	return failure
</pre>
<p class=lp>
To get a sense of the speedup here, and to check my work,
I ran the program using both algorithms
on a 2.53 GHz Intel Core 2 Duo E7200.
</p>
<center>
<table id=times>
<tr><th> <th colspan=3>————— # of Functions —————<th colspan=2>———— Time ————
<tr><th>Size <th>Canonical <th>All <th>All, Cumulative <th>Generate <th>Search
<tr><td>0 <td>1 <td>10 <td>10
<tr><td>1 <td>2 <td>82 <td>92 <td>< 0.1 seconds <td>3.4 minutes
<tr><td>2 <td>2 <td>640 <td>732 <td>< 0.1 seconds <td>7.2 minutes
<tr><td>3 <td>7 <td>4420 <td>5152 <td>< 0.1 seconds <td>12.3 minutes
<tr><td>4 <td>19 <td>25276 <td>29696 <td>< 0.1 seconds <td>30.1 minutes
<tr><td>5 <td>44 <td>117440 <td>147136 <td>< 0.1 seconds <td>1.3 hours
<tr><td>6 <td>142 <td>515040 <td>662176 <td>< 0.1 seconds <td>3.5 hours
<tr><td>7 <td>436 <td>1999608 <td>2661784 <td>0.2 seconds <td>11.6 hours
<tr><td>8 <td>1209 <td>6598400 <td>9260184 <td>0.6 seconds <td>1.7 days
<tr><td>9 <td>3307 <td>19577332 <td>28837516 <td>1.7 seconds <td>4.9 days
<tr><td>10 <td>7741 <td>50822560 <td>79660076 <td>4.6 seconds <td>[ 10 days ? ]
<tr><td>11 <td>17257 <td>114619264 <td>194279340 <td>10.8 seconds <td>[ 20 days ? ]
<tr><td>12 <td>31851 <td>221301008 <td>415580348 <td>21.7 seconds <td>[ 50 days ? ]
<tr><td>13 <td>53901 <td>374704776 <td>790285124 <td>38.5 seconds <td>[ 80 days ? ]
<tr><td>14 <td>75248 <td>533594528 <td>1323879652 <td>58.7 seconds <td>[ 100 days ? ]
<tr><td>15 <td>94572 <td>667653642 <td>1991533294 <td>1.5 minutes <td>[ 120 days ? ]
<tr><td>16 <td>98237 <td>697228760 <td>2688762054 <td>2.1 minutes <td>[ 120 days ? ]
<tr><td>17 <td>89342 <td>628589440 <td>3317351494 <td>4.1 minutes <td>[ 90 days ? ]
<tr><td>18 <td>66951 <td>468552896 <td>3785904390 <td>9.1 minutes <td>[ 50 days ? ]
<tr><td>19 <td>41664 <td>287647616 <td>4073552006 <td>23.4 minutes <td>[ 30 days ? ]
<tr><td>20 <td>21481 <td>144079832 <td>4217631838 <td>57.0 minutes <td>[ 10 days ? ]
<tr><td>21 <td>8680 <td>55538224 <td>4273170062 <td>2.4 hours <td>2.5 days
<tr><td>22 <td>2730 <td>16099568 <td>4289269630 <td>5.2 hours <td>11.7 hours
<tr><td>23 <td>937 <td>4428800 <td>4293698430 <td>11.2 hours <td>2.2 hours
<tr><td>24 <td>228 <td>959328 <td>4294657758 <td>22.0 hours <td>33.2 minutes
<tr><td>25 <td>103 <td>283200 <td>4294940958 <td>1.7 days <td>4.0 minutes
<tr><td>26 <td>21 <td>22224 <td>4294963182 <td>2.9 days <td>42 seconds
<tr><td>27 <td>10 <td>3602 <td>4294966784 <td>4.7 days <td>2.4 seconds
<tr><td>28 <td>3 <td>512 <td>4294967296 <td>[ 7 days ? ] <td>0.1 seconds
</table>
</center>
<p class=pp>
The bracketed times are estimates based on the work involved: I did not
wait that long for the intermediate search steps.
The search algorithm is quite a bit worse than generate until there are
very few functions left to find.
However, it comes in handy just when it is most useful: when the
generate algorithm has slowed to a crawl.
If we run generate through formulas of size 22 and then switch
to search for 23 onward, we can run the whole computation in
just over half a day of CPU time.
</p>
<p class=pp>
The computation of a(5) identified the sizes of all 616,126
canonical Boolean functions of 5 inputs.
In contrast, there are <a href="http://oeis.org/A000370">just over 200 trillion canonical Boolean functions of 6 inputs</a>.
Determining a(6) is unlikely to happen by brute force computation, no matter what clever tricks we use.
</p>
<h3>Adding XOR</h3>
<p class=pp>We've assumed the use of just AND and OR as our
basis for the Boolean formulas. If we also allow XOR, functions
can be written using many fewer operators.
In particular, a hardest function for the 1-, 2-, 3-, and 4-input
cases—parity—is now trivial.
Knuth examines the complexity of 5-input Boolean functions
using AND, OR, and XOR in detail in <a href="http://www-cs-faculty.stanford.edu/~uno/taocp.html">The Art of Computer Programming, Volume 4A</a>.
Section 7.1.2's Algorithm L is the same as our Algorithm 3 above,
given for computing 4-input functions.
Knuth mentions that to adapt it for 5-input functions one must
treat only canonical functions and gives results for 5-input functions
with XOR allowed.
So another way to check our work is to add XOR to our Algorithm 4
and check that our results match Knuth's.
</p>
<p class=pp>
Because the minimum formula sizes are smaller (at most 12), the
computation of sizes with XOR is much faster than before:
</p>
<center>
<table>
<tr><th> <th><th colspan=5>————— # of Functions —————<th>
<tr><th>Size <th width=10><th>Canonical <th width=10><th>All <th width=10><th>All, Cumulative <th width=10><th>Time
<tr><td align=right>0 <td><td align=right>1 <td><td align=right>10 <td><td align=right>10 <td><td>
<tr><td align=right>1 <td><td align=right>3 <td><td align=right>102 <td><td align=right>112 <td><td align=right>< 0.1 seconds
<tr><td align=right>2 <td><td align=right>5 <td><td align=right>1140 <td><td align=right>1252 <td><td align=right>< 0.1 seconds
<tr><td align=right>3 <td><td align=right>20 <td><td align=right>11570 <td><td align=right>12822 <td><td align=right>< 0.1 seconds
<tr><td align=right>4 <td><td align=right>93 <td><td align=right>109826 <td><td align=right>122648 <td><td align=right>< 0.1 seconds
<tr><td align=right>5 <td><td align=right>366 <td><td align=right>936440 <td><td align=right>1059088 <td><td align=right>0.1 seconds
<tr><td align=right>6 <td><td align=right>1730 <td><td align=right>7236880 <td><td align=right>8295968 <td><td align=right>0.7 seconds
<tr><td align=right>7 <td><td align=right>8782 <td><td align=right>47739088 <td><td align=right>56035056 <td><td align=right>4.5 seconds
<tr><td align=right>8 <td><td align=right>40297 <td><td align=right>250674320 <td><td align=right>306709376 <td><td align=right>24.0 seconds
<tr><td align=right>9 <td><td align=right>141422 <td><td align=right>955812256 <td><td align=right>1262521632 <td><td align=right>95.5 seconds
<tr><td align=right>10 <td><td align=right>273277 <td><td align=right>1945383936 <td><td align=right>3207905568 <td><td align=right>200.7 seconds
<tr><td align=right>11 <td><td align=right>145707 <td><td align=right>1055912608 <td><td align=right>4263818176 <td><td align=right>121.2 seconds
<tr><td align=right>12 <td><td align=right>4423 <td><td align=right>31149120 <td><td align=right>4294967296 <td><td align=right>65.0 seconds
</table>
</center>
<p class=pp>
Knuth does not discuss anything like Algorithm 5,
because the search for specific functions does not apply to
the AND, OR, and XOR basis. XOR is a non-monotone
function (it can both turn bits on and turn bits off), so
there is no test like our “<code>if fg OR g == fg</code>”
and no small set of “don't care” bits to trim the search for f.
The search for an appropriate f in the XOR case would have
to try all f of the right size, which is exactly what Algorithm 4 already does.
</p>
<p class=pp>
Volume 4A also considers the problem of building minimal circuits,
which are like formulas but can use common subexpressions additional times for free,
and the problem of building the shallowest possible circuits.
See Section 7.1.2 for all the details.
</p>
<h3>Code and Web Site</h3>
<p class=pp>
The web site <a href="http://boolean-oracle.swtch.com">boolean-oracle.swtch.com</a>
lets you type in a Boolean expression and gives back the minimal formula for it.
It uses tables generated while running Algorithm 5; those tables and the
programs described in this post are also <a href="http://boolean-oracle.swtch.com/about">available on the site</a>.
</p>
<h3>Postscript: Generating All Permutations and Inversions</h3>
<p class=pp>
The algorithms given above depend crucially on the step
“<code>for each function ff equivalent to f</code>,”
which generates all the ff obtained by permuting or inverting inputs to f,
but I did not explain how to do that.
We already saw that we can manipulate the binary truth table representation
directly to turn <code>f</code> into <code>¬f</code> and to compute
combinations of functions.
We can also manipulate the binary representation directly to
invert a specific input or swap a pair of adjacent inputs.
Using those operations we can cycle through all the equivalent functions.
</p>
<p class=pp>
To invert a specific input,
let's consider the structure of the truth table.
The index of a bit in the truth table encodes the inputs for that entry.
For example, the low bit of the index gives the value of the first input.
So the even-numbered bits—at indices 0, 2, 4, 6, ...—correspond to
the first input being false, while the odd-numbered bits—at indices 1, 3, 5, 7, ...—correspond
to the first input being true.
Changing just that bit in the index corresponds to changing the
single variable, so indices 0, 1 differ only in the value of the first input,
as do 2, 3, and 4, 5, and 6, 7, and so on.
Given the truth table for f(V, W, X, Y, Z) we can compute
the truth table for f(¬V, W, X, Y, Z) by swapping adjacent bit pairs
in the original truth table.
Even better, we can do all the swaps in parallel using a bitwise
operation.
To invert a different input, we swap larger runs of bits.
</p>
<center>
<table>
<tr><th>Function <th width=10> <th>Truth Table (<span style="font-weight: normal;"><code>f</code> = f(V, W, X, Y, Z)</span>)
<tr><td>f(¬V, W, X, Y, Z) <td><td><code>(f&amp;0x55555555)&lt;&lt; 1 | (f&gt;&gt; 1)&amp;0x55555555</code>
<tr><td>f(V, ¬W, X, Y, Z) <td><td><code>(f&amp;0x33333333)&lt;&lt; 2 | (f&gt;&gt; 2)&amp;0x33333333</code>
<tr><td>f(V, W, ¬X, Y, Z) <td><td><code>(f&amp;0x0f0f0f0f)&lt;&lt; 4 | (f&gt;&gt; 4)&amp;0x0f0f0f0f</code>
<tr><td>f(V, W, X, ¬Y, Z) <td><td><code>(f&amp;0x00ff00ff)&lt;&lt; 8 | (f&gt;&gt; 8)&amp;0x00ff00ff</code>
<tr><td>f(V, W, X, Y, ¬Z) <td><td><code>(f&amp;0x0000ffff)&lt;&lt;16 | (f&gt;&gt;16)&amp;0x0000ffff</code>
</table>
</center>
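<p class=pp>
Here are the first two inversion operations as Go functions (the names are mine), checked against the single-variable function f = V, whose 32-bit truth table is 0xAAAAAAAA:
</p>
<pre class="indent alg">
package main

import "fmt"

// invertV computes the truth table of f(¬V, W, X, Y, Z) from the
// table of f(V, W, X, Y, Z) by swapping adjacent bit pairs.
func invertV(f uint32) uint32 {
	return (f&amp;0x55555555)&lt;&lt;1 | (f&gt;&gt;1)&amp;0x55555555
}

// invertW does the same for the second input, swapping runs of 2 bits.
func invertW(f uint32) uint32 {
	return (f&amp;0x33333333)&lt;&lt;2 | (f&gt;&gt;2)&amp;0x33333333
}

func main() {
	const v = 0xAAAAAAAA // truth table of f(V, W, X, Y, Z) = V
	fmt.Printf("%#x\n", invertV(v))                         // 0x55555555, the table of ¬V
	fmt.Println(invertV(invertV(0xdeadbeef)) == 0xdeadbeef) // true: double inversion is the identity
}
</pre>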
<p class=lp>
Being able to invert a specific input lets us consider all possible
inversions by building them up one at a time.
The <a href="http://oeis.org/A003188">Gray code</a> lets us
enumerate all possible 5-bit input codes while changing only 1 bit at
a time as we move from one input to the next:
</p>
<center>
0, 1, 3, 2, 6, 7, 5, 4, <br>
12, 13, 15, 14, 10, 11, 9, 8, <br>
24, 25, 27, 26, 30, 31, 29, 28, <br>
20, 21, 23, 22, 18, 19, 17, 16
</center>
<p class=lp>
This minimizes
the number of inversions we need: to consider all 32 cases, we only
need 31 inversion operations.
In contrast, visiting the 5-bit input codes in the usual binary order 0, 1, 2, 3, 4, ...
would often need to change multiple bits, like when changing from 3 to 4.
</p>
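<p class=pp>
The listing above is the binary-reflected Gray code, which has the closed form g(i) = i XOR (i&gt;&gt;1). A quick Go check that consecutive codes differ in exactly one bit:
</p>
<pre class="indent alg">
package main

import (
	"fmt"
	"math/bits"
)

func main() {
	var prev uint32
	for i := uint32(0); i &lt; 32; i++ {
		code := i ^ (i &gt;&gt; 1) // binary-reflected Gray code
		fmt.Printf("%d ", code)
		if i &gt; 0 &amp;&amp; bits.OnesCount32(prev^code) != 1 {
			panic("adjacent Gray codes must differ in exactly one bit")
		}
		prev = code
	}
	fmt.Println()
}
</pre>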
<p class=pp>
To swap a pair of adjacent inputs, we can again take advantage of the truth table.
For a pair of inputs, there are four cases: 00, 01, 10, and 11. We can leave the
00 and 11 cases alone, because they are invariant under swapping,
and concentrate on swapping the 01 and 10 bits.
The first two inputs change most often in the truth table: each run of 4 bits
corresponds to those four cases.
In each run, we want to leave the first and fourth alone and swap the second and third.
For later inputs, the four cases consist of sections of bits instead of single bits.
</p>
<center>
<table>
<tr><th>Function <th width=10> <th>Truth Table (<span style="font-weight: normal;"><code>f</code> = f(V, W, X, Y, Z)</span>)
<tr><td>f(<b>W, V</b>, X, Y, Z) <td><td><code>f&amp;0x99999999 | (f&amp;0x22222222)&lt;&lt;1 | (f&gt;&gt;1)&amp;0x22222222</code>
<tr><td>f(V, <b>X, W</b>, Y, Z) <td><td><code>f&amp;0xc3c3c3c3 | (f&amp;0x0c0c0c0c)&lt;&lt;2 | (f&gt;&gt;2)&amp;0x0c0c0c0c</code>
<tr><td>f(V, W, <b>Y, X</b>, Z) <td><td><code>f&amp;0xf00ff00f | (f&amp;0x00f000f0)&lt;&lt;4 | (f&gt;&gt;4)&amp;0x00f000f0</code>
<tr><td>f(V, W, X, <b>Z, Y</b>) <td><td><code>f&amp;0xff0000ff | (f&amp;0x0000ff00)&lt;&lt;8 | (f&gt;&gt;8)&amp;0x0000ff00</code>
</table>
</center>
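<p class=pp>
Both the swap table and the earlier inversion table are easy to check mechanically. The sketch below assumes the bit convention implied by the masks: bit <i>i</i> of the 32-bit truth-table word holds f(V, W, X, Y, Z) with V taken from the low bit of <i>i</i> and Z from the high bit.
</p>

```python
import random

def table(f):
    """Truth table of f(V, W, X, Y, Z) as a 32-bit word:
    bit i holds f with V = bit 0 of i, ..., Z = bit 4 of i."""
    t = 0
    for i in range(32):
        v, w, x, y, z = (i >> k & 1 for k in range(5))
        t |= f(v, w, x, y, z) << i
    return t

# A random Boolean function of five inputs.
random.seed(1)
bits = [random.randint(0, 1) for _ in range(32)]
f = lambda v, w, x, y, z: bits[v | w << 1 | x << 2 | y << 3 | z << 4]
t = table(f)

# Swap the first two inputs using the bit trick from the table above.
swapped = t & 0x99999999 | (t & 0x22222222) << 1 | (t >> 1) & 0x22222222
assert swapped == table(lambda v, w, x, y, z: f(w, v, x, y, z))

# Invert the last input using the trick from the inversion table.
inverted = (t & 0x0000FFFF) << 16 | (t >> 16) & 0x0000FFFF
assert inverted == table(lambda v, w, x, y, z: f(v, w, x, y, z ^ 1))
```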
<p class=lp>
Being able to swap a pair of adjacent inputs lets us consider all
possible permutations by building them up one at a time.
Again it is convenient to have a way to visit all permutations by
applying only one swap at a time.
Here Volume 4A comes to the rescue.
Section 7.2.1.2 is titled “Generating All Permutations,” and Knuth delivers
many algorithms to do just that.
The most convenient for our purposes is Algorithm P, which
generates a sequence that considers all permutations exactly once
with only a single swap of adjacent inputs between steps.
Knuth calls it Algorithm P because it corresponds to the
“Plain changes” algorithm used by <a href="http://en.wikipedia.org/wiki/Change_ringing">bell ringers in 17th century England</a>
to ring a set of bells in all possible permutations.
The algorithm is described in a manuscript written around 1653!
</p>
<p class=pp>
We can examine all possible permutations and inversions by
nesting a loop over all permutations inside a loop over all inversions,
and in fact that's what my program does.
Knuth does one better, though: his Exercise 7.2.1.2-20
suggests that it is possible to build up all the possibilities
using only adjacent swaps and inversion of the first input.
Negating arbitrary inputs is not hard, though, and still does
minimal work, so the code sticks with Gray codes and Plain changes.
</p></p>
Zip Files All The Way Down
tag:research.swtch.com,2012:research.swtch.com/zip
2010-03-18T00:00:00-04:00
2010-03-18T00:00:00-04:00
Did you think it was turtles?
<p><p class=lp>
Stephen Hawking begins <i><a href="http://www.amazon.com/-/dp/0553380168">A Brief History of Time</a></i> with this story:
</p>
<blockquote>
<p class=pp>
A well-known scientist (some say it was Bertrand Russell) once gave a public lecture on astronomy. He described how the earth orbits around the sun and how the sun, in turn, orbits around the center of a vast collection of stars called our galaxy. At the end of the lecture, a little old lady at the back of the room got up and said: “What you have told us is rubbish. The world is really a flat plate supported on the back of a giant tortoise.” The scientist gave a superior smile before replying, “What is the tortoise standing on?” “You're very clever, young man, very clever,” said the old lady. “But it's turtles all the way down!”
</p>
</blockquote>
<p class=lp>
Scientists today are pretty sure that the universe is not actually turtles all the way down,
but we can create that kind of situation in other contexts.
For example, here we have <a href="http://www.youtube.com/watch?v=Y-gqMTt3IUg">video monitors all the way down</a>
and <a href="http://www.amazon.com/gp/customer-media/product-gallery/0387900926/ref=cm_ciu_pdp_images_all">set theory books all the way down</a>,
and <a href="http://blog.makezine.com/archive/2009/01/thousands_of_shopping_carts_stake_o.html">shopping carts all the way down</a>.
</p>
<p class=pp>
And here's a computer storage equivalent:
look inside <a href="http://swtch.com/r.zip"><code>r.zip</code></a>.
It's zip files all the way down:
each one contains another zip file under the name <code>r/r.zip</code>.
(For the die-hard Unix fans, <a href="http://swtch.com/r.tar.gz"><code>r.tar.gz</code></a> is
gzipped tar files all the way down.)
Like the line of shopping carts, it never ends,
because it loops back onto itself: the zip file contains itself!
And it's probably less work to put together a self-reproducing zip file
than to put together all those shopping carts,
at least if you're the kind of person who would read this blog.
This post explains how.
</p>
<p class=pp>
Before we get to self-reproducing zip files, though,
we need to take a brief detour into self-reproducing programs.
</p>
<h3>Self-reproducing programs</h3>
<p class=pp>
The idea of self-reproducing programs dates back to the 1960s.
My favorite statement of the problem is the one Ken Thompson gave in his 1983 Turing Award address:
</p>
<blockquote>
<p class=pp>
In college, before video games, we would amuse ourselves by posing programming exercises. One of the favorites was to write the shortest self-reproducing program. Since this is an exercise divorced from reality, the usual vehicle was FORTRAN. Actually, FORTRAN was the language of choice for the same reason that three-legged races are popular.
</p>
<p class=pp>
More precisely stated, the problem is to write a source program that, when compiled and executed, will produce as output an exact copy of its source. If you have never done this, I urge you to try it on your own. The discovery of how to do it is a revelation that far surpasses any benefit obtained by being told how to do it. The part about “shortest” was just an incentive to demonstrate skill and determine a winner.
</p>
</blockquote>
<p class=lp>
<b>Spoiler alert!</b>
I agree: if you have never done this, I urge you to try it on your own.
The internet makes it so easy to look things up that it's refreshing
to discover something yourself once in a while.
Go ahead and spend a few days figuring it out. This blog will still be here
when you get back.
(If you don't mind the spoilers, the entire <a href="http://cm.bell-labs.com/who/ken/trust.html">Turing award address</a> is worth reading.)
</p>
<center>
<br><br>
<i>(Spoiler blocker.)</i>
<br>
<a href="http://www.robertwechsler.com/projects.html"><img src="http://research.swtch.com/applied_geometry.jpg"></a>
<br>
<i><a href="http://www.robertwechsler.com/projects.html">http://www.robertwechsler.com/projects.html</a></i>
<br><br>
</center>
<p class=pp>
Let's try to write a Python program that prints itself.
It will probably be a <code>print</code> statement, so here's a first attempt,
run at the interpreter prompt:
</p>
<pre class=indent>
>>> print '<span style="color: #005500">hello</span>'
hello
</pre>
<p class=lp>
That didn't quite work. But now we know what the program is, so let's print it:
</p>
<pre class=indent>
>>> print "<span style="color: #005500">print 'hello'</span>"
print 'hello'
</pre>
<p class=lp>
That didn't quite work either. The problem is that when you execute
a simple print statement, it only prints part of itself: the argument to the print.
We need a way to print the rest of the program too.
</p>
<p class=pp>
The trick is to use recursion: you write a string that is the whole program,
but with itself missing, and then you plug it into itself before passing it to print.
</p>
<pre class=indent>
>>> s = '<span style="color: #005500">print %s</span>'; print s % repr(s)
print 'print %s'
</pre>
<p class=lp>
Not quite, but closer: the problem is that the string <code>s</code> isn't actually
the program. But now we know the general form of the program:
<code>s = '<span style="color: #005500">%s</span>'; print s % repr(s)</code>.
That's the string to use.
</p>
<pre class=indent>
>>> s = '<span style="color: #005500">s = %s; print s %% repr(s)</span>'; print s % repr(s)
s = 's = %s; print s %% repr(s)'; print s % repr(s)
</pre>
<p class=lp>
Recursion for the win.
</p>
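<p class=pp>
(In modern Python 3, where <code>print</code> is a function, the same recursion works, with <code>%r</code> doing the quoting. This one-liner is a sketch of the same idea, not from the original session:)
</p>

```python
s = 's = %r; print(s %% s)'; print(s % s)
```

Running it prints the line itself, quoting included.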
<p class=pp>
This form of self-reproducing program is often called a <a href="http://en.wikipedia.org/wiki/Quine_(computing)">quine</a>,
in honor of the philosopher and logician W. V. O. Quine,
who discovered the paradoxical sentence:
</p>
<blockquote>
“Yields falsehood when preceded by its quotation”<br>yields falsehood when preceded by its quotation.
</blockquote>
<p class=lp>
The simplest English form of a self-reproducing quine is a command like:
</p>
<blockquote>
Print this, followed by its quotation:<br>“Print this, followed by its quotation:”
</blockquote>
<p class=lp>
There's nothing particularly special about Python that makes quining possible.
The most elegant quine I know is a Scheme program that is a direct, if somewhat inscrutable, translation of that
sentiment:
</p>
<pre class=indent>
((lambda (x) `<span style="color: #005500">(</span>,x <span style="color: #005500">'</span>,x<span style="color: #005500">)</span>)
'<span style="color: #005500">(lambda (x) `(,x ',x))</span>)
</pre>
<p class=lp>
I think the Go version is a clearer translation, at least as far as the quoting is concerned:
</p>
<pre class=indent>
/* Go quine */
package main
import "<span style="color: #005500">fmt</span>"
func main() {
fmt.Printf("<span style="color: #005500">%s%c%s%c\n</span>", q, 0x60, q, 0x60)
}
var q = `<span style="color: #005500">/* Go quine */
package main
import "fmt"
func main() {
fmt.Printf("%s%c%s%c\n", q, 0x60, q, 0x60)
}
var q = </span>`
</pre>
<p class=lp>(I've colored the data literals green throughout to make it clear what is program and what is data.)</p>
<p class=pp>The Go program has the interesting property that, ignoring the pesky newline
at the end, the entire program is the same thing twice (<code>/* Go quine */ ... q = `</code>).
That got me thinking: maybe it's possible to write a self-reproducing program
using only a repetition operator.
And you know what programming language has essentially only a repetition operator?
The language used to encode Lempel-Ziv compressed files
like the ones used by <code>gzip</code> and <code>zip</code>.
</p>
<h3>Self-reproducing Lempel-Ziv programs</h3>
<p class=pp>
Lempel-Ziv compressed data is a stream of instructions with two basic
opcodes: <code>literal(</code><i>n</i><code>)</code> followed by
<i>n</i> bytes of data means write those <i>n</i> bytes into the
decompressed output,
and <code>repeat(</code><i>d</i><code>,</code> <i>n</i><code>)</code>
means look backward <i>d</i> bytes from the current location
in the decompressed output and copy the <i>n</i> bytes you find there
into the output stream.
</p>
<p class=pp>
The programming exercise, then, is this: write a Lempel-Ziv program
using just those two opcodes that prints itself when run.
In other words, write a compressed data stream that decompresses to itself.
Feel free to assume any reasonable encoding for the <code>literal</code>
and <code>repeat</code> opcodes.
For the grand prize, find a program that decompresses to
itself surrounded by an arbitrary prefix and suffix,
so that the sequence could be embedded in an actual <code>gzip</code>
or <code>zip</code> file, which has a fixed-format header and trailer.
</p>
<p class=pp>
<b>Spoiler alert!</b>
I urge you to try this on your own before continuing to read.
It's a great way to spend a lazy afternoon, and you have
one critical advantage that I didn't: you know there is a solution.
</p>
<center>
<br><br>
<i>(Spoiler blocker.)</i>
<br>
<a href=""><img src="http://research.swtch.com/the_best_circular_bike(sbcc_sbma_students_roof).jpg"></a>
<br>
<i><a href="http://www.robertwechsler.com/thebest.html">http://www.robertwechsler.com/thebest.html</a></i>
<br><br>
</center>
<p class=lp>By the way, here's <a href="http://swtch.com/r.gz"><code>r.gz</code></a>, gzip files all the way down.</p>
<pre class=indent>
$ gunzip < r.gz > r
$ cmp r r.gz
$
</pre>
<p class=lp>The nice thing about <code>r.gz</code> is that even broken web browsers
that ordinarily decompress downloaded gzip data before storing it to disk
will handle this file correctly!
</p>
<p class=pp>Enough stalling to hide the spoilers.
Let's use this shorthand to describe Lempel-Ziv instructions:
<code>L</code><i>n</i> and <code>R</code><i>n</i> are
shorthand for <code>literal(</code><i>n</i><code>)</code> and
<code>repeat(</code><i>n</i><code>,</code> <i>n</i><code>)</code>,
and the program assumes that each code is one byte.
<code>L0</code> is therefore the Lempel-Ziv no-op;
<code>L5</code> <code>hello</code> prints <code>hello</code>;
and so does <code>L3</code> <code>hel</code> <code>R1</code> <code>L1</code> <code>o</code>.
</p>
<p class=pp>
Here's a Lempel-Ziv program that prints itself.
(Each line is one instruction.)
</p>
<br>
<center>
<table border=0>
<tr><th></th><th width=30></th><th>Code</th><th width=30></th><th>Output</th></tr>
<tr><td align=right><i><span style="font-size: 0.8em;">no-op</span></i></td><td></td><td><code>L0</code></td><td></td><td></td></tr>
<tr><td align=right><i><span style="font-size: 0.8em;">no-op</span></i></td><td></td><td><code>L0</code></td><td></td><td></td></tr>
<tr><td align=right><i><span style="font-size: 0.8em;">no-op</span></i></td><td></td><td><code>L0</code></td><td></td><td></td></tr>
<tr><td align=right><i><span style="font-size: 0.8em;">print 4 bytes</span></i></td><td></td><td><code>L4 <span style="color: #005500">L0 L0 L0 L4</span></code></td><td></td><td><code>L0 L0 L0 L4</code></td></tr>
<tr><td align=right><i><span style="font-size: 0.8em;">repeat last 4 printed bytes</span></i></td><td></td><td><code>R4</code></td><td></td><td><code>L0 L0 L0 L4</code></td></tr>
<tr><td align=right><i><span style="font-size: 0.8em;">print 4 bytes</span></i></td><td></td><td><code>L4 <span style="color: #005500">R4 L4 R4 L4</span></code></td><td></td><td><code>R4 L4 R4 L4</code></td></tr>
<tr><td align=right><i><span style="font-size: 0.8em;">repeat last 4 printed bytes</span></i></td><td></td><td><code>R4</code></td><td></td><td><code>R4 L4 R4 L4</code></td></tr>
<tr><td align=right><i><span style="font-size: 0.8em;">print 4 bytes</span></i></td><td></td><td><code>L4 <span style="color: #005500">L0 L0 L0 L0</span></code></td><td></td><td><code>L0 L0 L0 L0</code></td></tr>
</table>
</center>
<br>
<p class=lp>
(The two columns Code and Output contain the same byte sequence.)
</p>
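<p class=pp>
A few lines of code can stand in for the decompressor and check the table. This sketch (mine, not part of the original puzzle) treats each opcode and each literal byte as one symbol:
</p>

```python
def decode(prog):
    """Decode the simplified Lempel-Ziv opcodes: 'Ln' copies the
    next n symbols to the output; 'Rn' repeats the last n output
    symbols. Each symbol stands for one byte."""
    out, i = [], 0
    while i < len(prog):
        op, n = prog[i][0], int(prog[i][1:])
        i += 1
        if op == "L":              # literal(n)
            out.extend(prog[i:i+n])
            i += n
        else:                      # repeat(n, n)
            out.extend(out[-n:])
    return out

# The warm-up example: L3 hel R1 L1 o prints hello.
assert decode("L3 h e l R1 L1 o".split()) == list("hello")

# The program from the table decompresses to exactly itself.
prog = ("L0 L0 L0 "
        "L4 L0 L0 L0 L4 R4 "
        "L4 R4 L4 R4 L4 R4 "
        "L4 L0 L0 L0 L0").split()
assert decode(prog) == prog
```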
<p class=pp>
The interesting core of this program is the 6-byte sequence
<code>L4 R4 L4 R4 L4 R4</code>, which prints the 8-byte sequence <code>R4 L4 R4 L4 R4 L4 R4 L4</code>.
That is, it prints itself with an extra byte before and after.
</p>
<p class=pp>
When we were trying to write the self-reproducing Python program,
the basic problem was that the print statement was always longer
than what it printed. We solved that problem with recursion,
computing the string to print by plugging it into itself.
Here we took a different approach.
The Lempel-Ziv program is
particularly repetitive, so that a repeated substring ends up
containing the entire fragment. The recursion is in the
representation of the program rather than its execution.
Either way, that fragment is the crucial point.
Before the final <code>R4</code>, the output lags behind the input.
Once it executes, the output is one code ahead.
</p>
<p class=pp>
The <code>L0</code> no-ops are plugged into
a more general variant of the program, which can reproduce itself
with the addition of an arbitrary three-byte prefix and suffix:
</p>
<br>
<center>
<table border=0>
<tr><th></th><th width=30></th><th>Code</th><th width=30></th><th>Output</th></tr>
<tr><td align=right><i><span style="font-size: 0.8em;">print 4 bytes</span></i></td><td></td><td><code>L4 <span style="color: #005500"><i>aa bb cc</i> L4</span></code></td><td></td><td><code><i>aa bb cc</i> L4</code></td></tr>
<tr><td align=right><i><span style="font-size: 0.8em;">repeat last 4 printed bytes</span></i></td><td></td><td><code>R4</code></td><td></td><td><code><i>aa bb cc</i> L4</code></td></tr>
<tr><td align=right><i><span style="font-size: 0.8em;">print 4 bytes</span></i></td><td></td><td><code>L4 <span style="color: #005500">R4 L4 R4 L4</span></code></td><td></td><td><code>R4 L4 R4 L4</code></td></tr>
<tr><td align=right><i><span style="font-size: 0.8em;">repeat last 4 printed bytes</span></i></td><td></td><td><code>R4</code></td><td></td><td><code>R4 L4 R4 L4</code></td></tr>
<tr><td align=right><i><span style="font-size: 0.8em;">print 4 bytes</span></i></td><td></td><td><code>L4 <span style="color: #005500">R4 <i>xx yy zz</i></span></code></td><td></td><td><code>R4 <i>xx yy zz</i></code></td></tr>
<tr><td align=right><i><span style="font-size: 0.8em;">repeat last 4 printed bytes</span></i></td><td></td><td><code>R4</code></td><td></td><td><code>R4 <i>xx yy zz</i></code></td></tr>
</table>
</center>
<br>
<p class=lp>
(The byte sequence in the Output column is <code><i>aa bb cc</i></code>, then
the byte sequence from the Code column, then <code><i>xx yy zz</i></code>.)
</p>
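<p class=pp>
The same toy decoder confirms this variant: the output is the three-byte prefix, then the program itself, then the three-byte suffix. (Again a sketch of mine, one symbol per byte, with placeholder bytes <code>aa bb cc</code> and <code>xx yy zz</code>.)
</p>

```python
def decode(prog):
    # 'Ln' copies the next n symbols; 'Rn' repeats the last n output symbols.
    out, i = [], 0
    while i < len(prog):
        op, n = prog[i][0], int(prog[i][1:])
        i += 1
        if op == "L":
            out.extend(prog[i:i+n])
            i += n
        else:
            out.extend(out[-n:])
    return out

# The generalized program from the table.
prog = ("L4 aa bb cc L4 R4 "
        "L4 R4 L4 R4 L4 R4 "
        "L4 R4 xx yy zz R4").split()
assert decode(prog) == "aa bb cc".split() + prog + "xx yy zz".split()
```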
<p class=pp>
It took me the better part of a quiet Sunday to get this far,
but by the time I got here I knew the game was over
and that I'd won.
From all that experimenting, I knew it was easy to create
a program fragment that printed itself minus a few instructions
or even one that printed an arbitrary prefix
and then itself, minus a few instructions.
The extra <code>aa bb cc</code> in the output
provides a place to attach such a program fragment.
Similarly, it's easy to create a fragment to attach
to the <code>xx yy zz</code> that prints itself,
minus the first three instructions, plus an arbitrary suffix.
We can use that generality to attach an appropriate
header and trailer.
</p>
<p class=pp>
Here is the final program, which prints itself surrounded by an
arbitrary prefix and suffix.
<code>[P]</code> denotes the <i>p</i>-byte compressed form of the prefix <code>P</code>;
similarly, <code>[S]</code> denotes the <i>s</i>-byte compressed form of the suffix <code>S</code>.
</p>
<br>
<center>
<table border=0>
<tr><th></th><th width=30></th><th>Code</th><th width=30></th><th>Output</th></tr>
<tr>
<td align=right><i><span style="font-size: 0.8em;">print prefix</span></i></td>
<td></td>
<td><code>[P]</code></td>
<td></td>
<td><code>P</code></td>
</tr>
<tr>
<td align=right><span style="font-size: 0.8em;"><i>print </i>p<i>+1 bytes</i></span></td>
<td></td>
<td><code>L</code><span style="font-size: 0.8em;"><i>p</i>+1</span><code> <span style="color: #005500">[P] L</span></code><span style="color: #005500"><span style="font-size: 0.8em;"><i>p</i>+1</span></span><code></code></td>
<td></td>
<td><code>[P] L</code><span style="font-size: 0.8em;"><i>p</i>+1</span><code></code></td>
</tr>
<tr>
<td align=right><span style="font-size: 0.8em;"><i>repeat last </i>p<i>+1 printed bytes</i></span></td>
<td></td>
<td><code>R</code><span style="font-size: 0.8em;"><i>p</i>+1</span><code></code></td>
<td></td>
<td><code>[P] L</code><span style="font-size: 0.8em;"><i>p</i>+1</span><code></code></td>
</tr>
<tr>
<td align=right><span style="font-size: 0.8em;"><i>print 1 byte</i></span></td>
<td></td>
<td><code>L1 <span style="color: #005500">R</span></code><span style="color: #005500"><span style="font-size: 0.8em;"><i>p</i>+1</span></span><code></code></td>
<td></td>
<td><code>R</code><span style="font-size: 0.8em;"><i>p</i>+1</span><code></code></td>
</tr>
<tr>
<td align=right><span style="font-size: 0.8em;"><i>print 1 byte</i></span></td>
<td></td>
<td><code>L1 <span style="color: #005500">L1</span></code></td>
<td></td>
<td><code>L1</code></td>
</tr>
<tr>
<td align=right><i><span style="font-size: 0.8em;">print 4 bytes</span></i></td>
<td></td>
<td><code>L4 <span style="color: #005500">R</span></code><span style="color: #005500"><span style="font-size: 0.8em;"><i>p</i>+1</span></span><code><span style="color: #005500"> L1 L1 L4</span></code></td>
<td></td>
<td><code>R</code><span style="font-size: 0.8em;"><i>p</i>+1</span><code> L1 L1 L4</code></td>
</tr>
<tr>
<td align=right><i><span style="font-size: 0.8em;">repeat last 4 printed bytes</span></i></td>
<td></td>
<td><code>R4</code></td>
<td></td>
<td><code>R</code><span style="font-size: 0.8em;"><i>p</i>+1</span><code> L1 L1 L4</code></td>
</tr>
<tr>
<td align=right><i><span style="font-size: 0.8em;">print 4 bytes</span></i></td>
<td></td>
<td><code>L4 <span style="color: #005500">R4 L4 R4 L4</span></code></td>
<td></td>
<td><code>R4 L4 R4 L4</code></td>
</tr>
<tr>
<td align=right><i><span style="font-size: 0.8em;">repeat last 4 printed bytes</span></i></td>
<td></td>
<td><code>R4</code></td>
<td></td>
<td><code>R4 L4 R4 L4</code></td>
</tr>
<tr>
<td align=right><i><span style="font-size: 0.8em;">print 4 bytes</span></i></td>
<td></td>
<td><code>L4 <span style="color: #005500">R4 L0 L0 L</span></code><span style="color: #005500"><span style="font-size: 0.8em;"><i>s</i>+1</span></span><code><span style="color: #005500"></span></code></td>
<td></td>
<td><code>R4 L0 L0 L</code><span style="font-size: 0.8em;"><i>s</i>+1</span><code></code></td>
</tr>
<tr>
<td align=right><i><span style="font-size: 0.8em;">repeat last 4 printed bytes</span></i></td>
<td></td>
<td><code>R4</code></td>
<td></td>
<td><code>R4 L0 L0 L</code><span style="font-size: 0.8em;"><i>s</i>+1</span><code></code></td>
</tr>
<tr>
<td align=right><i><span style="font-size: 0.8em;">no-op</span></i></td>
<td></td>
<td><code>L0</code></td>
<td></td>
<td></td>
</tr>
<tr>
<td align=right><i><span style="font-size: 0.8em;">no-op</span></i></td>
<td></td>
<td><code>L0</code></td>
<td></td>
<td></td>
</tr>
<tr>
<td align=right><span style="font-size: 0.8em;"><i>print </i>s<i>+1 bytes</i></span></td>
<td></td>
<td><code>L</code><span style="font-size: 0.8em;"><i>s</i>+1</span><code> <span style="color: #005500">R</span></code><span style="color: #005500"><span style="font-size: 0.8em;"><i>s</i>+1</span></span><code><span style="color: #005500"> [S]</span></code></td>
<td></td>
<td><code>R</code><span style="font-size: 0.8em;"><i>s</i>+1</span><code> [S]</code></td>
</tr>
<tr>
<td align=right><span style="font-size: 0.8em;"><i>repeat last </i>s<i>+1 bytes</i></span></td>
<td></td>
<td><code>R</code><span style="font-size: 0.8em;"><i>s</i>+1</span><code></code></td>
<td></td>
<td><code>R</code><span style="font-size: 0.8em;"><i>s</i>+1</span><code> [S]</code></td>
</tr>
<tr>
<td align=right><i><span style="font-size: 0.8em;">print suffix</span></i></td>
<td></td>
<td><code>[S]</code></td>
<td></td>
<td><code>S</code></td>
</tr>
</table>
</center>
<br>
<p class=lp>
(The byte sequence in the Output column is <code><i>P</i></code>, then
the byte sequence from the Code column, then <code><i>S</i></code>.)
</p>
<h3>Self-reproducing zip files</h3>
<p class=pp>
Now the rubber meets the road.
We've solved the main theoretical obstacle to making a self-reproducing
zip file, but there are a couple of practical obstacles
still in our way.
</p>
<p class=pp>
The first obstacle is to translate our self-reproducing Lempel-Ziv program,
written in simplified opcodes, into the real opcode encoding.
<a href="http://www.ietf.org/rfc/rfc1951.txt">RFC 1951</a> describes the DEFLATE format used in both gzip and zip: a sequence of blocks, each of which
is a sequence of opcodes encoded using Huffman codes.
Huffman codes assign different length bit strings
to different opcodes,
breaking our assumption above that opcodes have
fixed length.
But wait!
We can, with some care, find a set of fixed-size encodings
that can express everything we need.
</p>
<p class=pp>
In DEFLATE, there are literal blocks and opcode blocks.
The header at the beginning of a literal block is 5 bytes:
</p>
<center>
<img src="http://research.swtch.com/zip1.png">
</center>
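<p class=pp>
(That 5-byte header — one byte holding the BFINAL and BTYPE bits plus zero padding, then <code>LEN</code> and its one's-complement <code>NLEN</code> as 16-bit little-endian values, per RFC 1951 — is easy to verify against a real inflater. A sketch using Python's <code>zlib</code> rather than the Go of the actual tools:)
</p>

```python
import zlib

def stored_block(data, final=False):
    """A DEFLATE stored (literal) block, per RFC 1951: a byte holding
    BFINAL and BTYPE=00 plus zero padding, then LEN and its one's
    complement NLEN as 16-bit little-endian values -- a 5-byte header,
    matching the translated L opcode described above."""
    n = len(data)
    header = bytes([1 if final else 0])
    header += n.to_bytes(2, "little") + (n ^ 0xFFFF).to_bytes(2, "little")
    return header + data

# Two literal blocks decompress to their concatenated payloads.
stream = stored_block(b"hello ") + stored_block(b"world", final=True)
assert zlib.decompress(stream, -15) == b"hello world"   # -15: raw DEFLATE
```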
<p class=pp>
If the translations of our <code>L</code> opcodes above
are 5 bytes each, the translations of the <code>R</code> opcodes
must also be 5 bytes each, with all the byte counts
above scaled by a factor of 5.
(For example, <code>L4</code> now has a 20-byte argument,
and <code>R4</code> repeats the last 20 bytes of output.)
The opcode block
with a single <code>repeat(20,20)</code> instruction falls well short of
5 bytes:
</p>
<center>
<img src="http://research.swtch.com/zip2.png">
</center>
<p class=lp>Luckily, an opcode block containing two
<code>repeat(20,10)</code> instructions has the same effect and is exactly 5 bytes:
</p>
<center>
<img src="http://research.swtch.com/zip3.png">
</center>
<p class=lp>
Encoding the other sized repeats
(<code>R</code><span style="font-size: 0.8em;"><i>p</i>+1</span> and
<code>R</code><span style="font-size: 0.8em;"><i>s</i>+1</span>)
takes more effort
and some sleazy tricks, but it turns out that
we can design 5-byte codes that repeat any amount
from 9 to 64 bytes.
For example, here are the repeat blocks for 10 bytes and for 40 bytes:
</p>
<center>
<img src="http://research.swtch.com/zip4.png">
<br>
<img src="http://research.swtch.com/zip5.png">
</center>
<p class=lp>
The repeat block for 10 bytes is two bits too short,
but every repeat block is followed by a literal block,
which starts with three zero bits and then padding
to the next byte boundary.
If a repeat block ends two bits short of a byte
but is followed by a literal block, the literal block's
padding will insert the extra two bits.
Similarly, the repeat block for 40 bytes is five bits too long,
but they're all zero bits.
Starting a literal block five bits too late
steals the bits from the padding.
Both of these tricks only work because the last 7 bits of
any repeat block are zero and the bits in the first byte
of any literal block are also zero,
so the boundary isn't directly visible.
If the literal block started with a one bit,
this sleazy trick wouldn't work.
</p>
<p class=pp>The second obstacle is that zip archives (and gzip files)
record a CRC32 checksum of the uncompressed data.
Since the uncompressed data is the zip archive,
the data being checksummed includes the checksum itself.
So we need to find a value <i>x</i> such that writing <i>x</i> into
the checksum field causes the file to checksum to <i>x</i>.
Recursion strikes back.
</p>
<p class=pp>
The CRC32 checksum computation interprets the entire file as a big number and computes
the remainder when you divide that number by a specific constant
using a specific kind of division.
We could go through the effort of setting up the appropriate
equations and solving for <i>x</i>.
But frankly, we've already solved one nasty recursive puzzle
today, and <a href="http://www.youtube.com/watch?v=TQBLTB5f3j0">enough is enough</a>.
There are only four billion possibilities for <i>x</i>:
we can write a program to try each in turn, until it finds one that works.
</p>
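<p class=pp>
(The search is a straightforward loop. A sketch of the idea, in Python rather than the Go of the actual generator; the helper names here are mine:)
</p>

```python
import struct
import zlib

def crc_with(data, off, x):
    """CRC32 of data after writing the trial value x into the
    4-byte little-endian checksum field at offset off."""
    patched = data[:off] + struct.pack("<I", x) + data[off+4:]
    return zlib.crc32(patched) & 0xFFFFFFFF

def find_fixpoint(data, off):
    """Brute force over all four billion candidates for x.
    (Tolerable in a compiled language; quite slow in Python.)"""
    for x in range(1 << 32):
        if crc_with(data, off, x) == x:
            return x
    return None   # no fixpoint exists for this data
```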
<p class=pp>
If you want to recreate these files yourself, there are a
few more minor obstacles, like making sure the tar file is a multiple
of 512 bytes and compressing the rather large zip trailer to
at most 59 bytes so that <code>R</code><span style="font-size: 0.8em;"><i>s</i>+1</span> is
at most <code>R</code><span style="font-size: 0.8em;">64</span>.
But they're just a simple matter of programming.
</p>
<p class=pp>
So there you have it:
<code><a href="http://swtch.com/r.gz">r.gz</a></code> (gzip files all the way down),
<code><a href="http://swtch.com/r.tar.gz">r.tar.gz</a></code> (gzipped tar files all the way down),
and
<code><a href="http://swtch.com/r.zip">r.zip</a></code> (zip files all the way down).
I regret that I have been unable to find any programs
that insist on decompressing these files recursively, ad infinitum.
It would have been fun to watch them squirm, but
it looks like much less sophisticated
<a href="http://en.wikipedia.org/wiki/Zip_bomb">zip bombs</a> have spoiled the fun.
</p>
<p class=pp>
If you're feeling particularly ambitious, here is
<a href="http://swtch.com/rgzip.go">rgzip.go</a>,
the <a href="http://golang.org/">Go</a> program that generated these files.
I wonder if you can create a zip file that contains a gzipped tar file
that contains the original zip file.
Ken Thompson suggested trying to make a zip file that
contains a slightly larger copy of itself, recursively,
so that as you dive down the chain of zip files
each one gets a little bigger.
(If you do manage either of these, please leave a comment.)
</p>
<br>
<p class=lp><font size=-1>P.S. I can't end the post without sharing my favorite self-reproducing program: the one-line shell script <code>#!/bin/cat</code></font>.
</p></p>
</div>
</div>
</div>
UTF-8: Bits, Bytes, and Benefits
tag:research.swtch.com,2012:research.swtch.com/utf8
2010-03-05T00:00:00-05:00
2010-03-05T00:00:00-05:00
The reasons to switch to UTF-8
<p><p class=pp>
UTF-8 is a way to encode Unicode code points—integer values from
0 through 10FFFF—into a byte stream,
and it is far simpler than many people realize.
The easiest way to make it confusing or complicated
is to treat it as a black box, never looking inside.
So let's start by looking inside. Here it is:
</p>
<center>
<table cellspacing=5 cellpadding=0 border=0>
<tr height=10><th colspan=4></th></tr>
<tr><th align=center colspan=2>Unicode code points</th><th width=10><th align=center>UTF-8 encoding (binary)</th></tr>
<tr height=10><td colspan=4></td></tr>
<tr><td align=right>00-7F</td><td>(7 bits)</td><td></td><td align=right>0<i>tuvwxyz</i></td></tr>
<tr><td align=right>0080-07FF</td><td>(11 bits)</td><td></td><td align=right>110<i>pqrst</i> 10<i>uvwxyz</i></td></tr>
<tr><td align=right>0800-FFFF</td><td>(16 bits)</td><td></td><td align=right>1110<i>jklm</i> 10<i>npqrst</i> 10<i>uvwxyz</i></td></tr>
<tr><td align=right valign=top>010000-10FFFF</td><td>(21 bits)</td><td></td><td align=right valign=top>11110<i>efg</i> 10<i>hijklm</i> 10<i>npqrst</i> 10<i>uvwxyz</i></td>
<tr height=10><td colspan=4></td></tr>
</table>
</center>
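<p class=pp>
The table transcribes directly into code. This sketch encodes one code point by the rules above and checks the result against Python's built-in encoder:
</p>

```python
def utf8_encode(cp):
    """Encode a single code point exactly as in the table above."""
    if cp < 0x80:          # 7 bits: 0tuvwxyz
        return bytes([cp])
    if cp < 0x800:         # 11 bits: 110pqrst 10uvwxyz
        return bytes([0xC0 | cp >> 6,
                      0x80 | cp & 0x3F])
    if cp < 0x10000:       # 16 bits: 1110jklm 10npqrst 10uvwxyz
        return bytes([0xE0 | cp >> 12,
                      0x80 | cp >> 6 & 0x3F,
                      0x80 | cp & 0x3F])
    # 21 bits: 11110efg 10hijklm 10npqrst 10uvwxyz
    return bytes([0xF0 | cp >> 18,
                  0x80 | cp >> 12 & 0x3F,
                  0x80 | cp >> 6 & 0x3F,
                  0x80 | cp & 0x3F])

for cp in (0x7A, 0x3BA, 0x4E16, 0x10FFFF):   # z, kappa, a CJK char, max
    assert utf8_encode(cp) == chr(cp).encode("utf-8")
```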
<p class=lp>
The convenient properties of UTF-8 are all consequences of the choice of encoding.
</p>
<ol>
<li><i>All ASCII files are already UTF-8 files.</i><br>
The first 128 Unicode code points are the 7-bit ASCII character set,
and UTF-8 preserves their one-byte encoding.
</li>
<li><i>ASCII bytes always represent themselves in UTF-8 files. They never appear as part of other UTF-8 sequences.</i><br>
All the non-ASCII UTF-8 sequences consist of bytes
with the high bit set, so if you see the byte 0x7A in a UTF-8 file,
you can be sure it represents the character <code>z</code>.
</li>
<li><i>ASCII bytes are always represented as themselves in UTF-8 files. They cannot be hidden inside multibyte UTF-8 sequences.</i><br>
The ASCII <code>z</code> 01111010 cannot be encoded as a two-byte UTF-8 sequence
<code>11000001 10111010</code>. Code points must be encoded using the shortest
possible sequence.
A corollary is that decoders must treat overlong sequences as invalid.
In practice, it is useful for a decoder to use the Unicode replacement
character, code point FFFD, as the decoding of an invalid UTF-8 sequence
rather than stop processing the text.
</li>
<li><i>UTF-8 is self-synchronizing.</i><br>
Let's call a byte of the form 10<i>xxxxxx</i>
a continuation byte.
Every UTF-8 sequence is a byte that is not a continuation byte
followed by zero or more continuation bytes.
If you start processing a UTF-8 file at an arbitrary point,
you might not be at the beginning of a UTF-8 encoding,
but you can easily find one: skip over
continuation bytes until you find a non-continuation byte.
(The same applies to scanning backward.)
</li>
<li><i>Substring search is just byte string search.</i><br>
Properties 2, 3, and 4 imply that given a string
of correctly encoded UTF-8, the only way those bytes
can appear in a larger UTF-8 text is when they represent the
same code points. So you can use any 8-bit-safe, byte-at-a-time
search function, like <code>strchr</code> or <code>strstr</code>, to run the search.
</li>
<li><i>Most programs that handle 8-bit files safely can handle UTF-8 safely.</i><br>
This also follows from Properties 2, 3, and 4.
I say “most” programs, because programs that
take apart a byte sequence expecting one character per byte
will not behave correctly, but very few programs do that.
It is far more common to split input at newline characters,
or split whitespace-separated fields, or do other similar parsing
around specific ASCII characters.
For example, Unix tools like cat, cmp, cp, diff, echo, head, tail, and tee
can process UTF-8 files as if they were plain ASCII files.
Most operating system kernels should also be able to handle
UTF-8 file names without any special arrangement, since the
only operations done on file names are comparisons
and splitting at <code>/</code>.
In contrast, tools like grep, sed, and wc, which inspect arbitrary
individual characters, do need modification.
</li>
<li><i>UTF-8 sequences sort in code point order.</i><br>
You can verify this by inspecting the encodings in the table above.
This means that Unix tools like join, ls, and sort (without options) don't need to handle
UTF-8 specially.
</li>
<li><i>UTF-8 has no “byte order.”</i><br>
UTF-8 is a byte encoding. It is not little endian or big endian.
Unicode defines a byte order mark (BOM) code point, U+FEFF,
which is used to determine the byte order of a stream of
raw 16-bit values, like UCS-2 or UTF-16.
It has no place in a UTF-8 file.
Some programs like to write a UTF-8-encoded BOM
at the beginning of UTF-8 files, but this is unnecessary
(and annoying to programs that don't expect it).
</li>
</ol>
<p class=lp>
UTF-8 does give up the ability to do random
access using code point indices.
Programs that need to jump to the <i>n</i>th
Unicode code point in a file or on a line—text editors are the canonical example—will
typically convert incoming UTF-8 to an internal representation
like an array of code points and then convert back to UTF-8
for output,
but most programs are simpler when written to manipulate UTF-8 directly.
</p>
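<p class=pp>
In Go, for example, that round trip is a pair of built-in conversions: <code>[]rune(s)</code> decodes UTF-8 into code points, and <code>string(r)</code> re-encodes. (A sketch of my own, not from the original text.)
</p>

```go
package main

import "fmt"

func main() {
	s := "こんにちは"
	r := []rune(s) // decode UTF-8 into a slice of code points

	// 15 bytes of UTF-8, but only 5 code points.
	fmt.Println(len(s), len(r))

	// Index by code point, then re-encode for output.
	fmt.Println(string(r[2:]))
}
```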
<p class=pp>
Programs that make UTF-8 more complicated than it needs to be
are typically trying to be too general,
not wanting to make assumptions that might not be true of
other encodings.
But there are good tools to convert other encodings to UTF-8,
and it is slowly becoming the standard encoding:
even the fraction of web pages
written in UTF-8 is
<a href="http://googleblog.blogspot.com/2010/01/unicode-nearing-50-of-web.html">nearing 50%</a>.
UTF-8 was explicitly designed
to have these nice properties. Take advantage of them.
</p>
<p class=pp>
For more on UTF-8, see “<a href="http://plan9.bell-labs.com/sys/doc/utf.html">Hello World
or
Καλημέρα κόσμε
or
こんにちは 世界</a>,” by Rob Pike
and Ken Thompson, and also this <a href="http://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt">history</a>.
</p>
<br>
<font size=-1>
<p class=lp>
Notes: Property 6 assumes the tools do not strip the high bit from each byte.
Such mangling was common years ago but is very uncommon now.
Property 7 assumes the comparison is done treating
the bytes as unsigned, but such behavior is mandated
by the ANSI C standard for <code>memcmp</code>,
<code>strcmp</code>, and <code>strncmp</code>.
</p>
</font></p>
Computing History at Bell Labs
tag:research.swtch.com,2012:research.swtch.com/bell-labs
2008-04-09T00:00:00-04:00
2008-04-09T00:00:00-04:00
Doug McIlroy’s remembrances
<p><p class=pp>
In 1997, on his retirement from Bell Labs, <a href="http://www.cs.dartmouth.edu/~doug/">Doug McIlroy</a> gave a
fascinating talk about the “<a href="https://web.archive.org/web/20081022192943/http://cm.bell-labs.com/cm/cs/doug97.html"><b>History of Computing at Bell Labs</b></a>.”
Almost ten years ago I transcribed the audio but never did anything with it.
The transcript is below.
</p>
<p class=pp>
My favorite parts of the talk are the description of the bi-quinary decimal relay calculator
and the description of a team that spent over a year tracking down a race condition bug in
a missile detector (reliability was king: today you’d just stamp
“cannot reproduce” and send the report back).
But the whole thing contains many fantastic stories.
It’s well worth the read or listen.
I also like his recollection of programming using cards: “It’s the kind of thing you can be nostalgic about, but it wasn’t actually fun.”
</p>
<p class=pp>
For more information, Bernard D. Holbrook and W. Stanley Brown’s 1982
technical report
“<a href="cstr99.pdf">A History of Computing Research at Bell Laboratories (1937-1975)</a>”
covers the earlier history in more detail.
</p>
<p><i>Corrections added August 19, 2009. Links updated May 16, 2018.</i></p>
<p><i>Update, December 19, 2020.</i> The original audio files disappeared along with the rest of the Bell Labs site some time ago, but I discovered a saved copy on one of my computers: [<a href="mcilroy97history.mp3">MP3</a> | <a href="mcilroy97history.rm">original RealAudio</a>].
I also added a few corrections and notes from Doug McIlroy, dated 2015 [sic].</p>
<br>
<br>
<p class=lp><b>Transcript</b></p>
<p class=pp>
Computing at Bell Labs is certainly an outgrowth of the
<a href="https://web.archive.org/web/20080622172015/http://cm.bell-labs.com/cm/ms/history/history.html">mathematics department</a>, which grew from that first hiring
in 1897, G A Campbell. When Bell Labs was formally founded
in 1925, what it had been was the engineering department
of Western Electric.
When it was formally founded in 1925,
almost from the beginning there was a math department with Thornton Fry as the department head, and if you look at some of Fry’s work, it turns out that
he was fussing around in 1929 with trying to discover
information theory. It didn’t actually gel until twenty years later with Shannon.</p>
<p class=pp><span style="font-size: 0.7em;">1:10</span>
Of course, most of the mathematics at that time was continuous.
One was interested in analyzing circuits and propagation. And indeed, this is what led to the growth of computing in Bell Laboratories. The computations could not all be done symbolically. There were not closed form solutions. There was lots of numerical computation done.
The math department had a fair stable of computers,
which in those days meant people. [laughter]</p>
<p class=pp><span style="font-size: 0.7em;">2:00</span>
And in the late ’30s, <a href="http://en.wikipedia.org/wiki/George_Stibitz">George Stibitz</a> had an idea that some of
the work that they were doing on hand calculators might be
automated by using some of the equipment that the Bell System
was installing in central offices, namely relay circuits.
He went home, and on his kitchen table, he built out of relays
a binary arithmetic circuit. He decided that binary was really
the right way to compute.
However, when he finally came to build some equipment,
he determined that binary to decimal conversion and
decimal to binary conversion was a drag, and he didn’t
want to put it in the equipment, and so he finally built
in 1939, a relay calculator that worked in decimal,
and it worked in complex arithmetic.
Do you have a hand calculator now that does complex arithmetic?
Ten-digit, I believe, complex computations: add, subtract,
multiply, and divide.
The I/O equipment was teletypes, so essentially all the stuff to make such
machines out of was there.
Since the I/O was teletypes, it could be remotely accessed,
and there were in fact four stations in the West Street Laboratories
of Bell Labs. West Street is down on the left side of Manhattan.
I had the good fortune to work there one summer, right next to a
district where you’re likely to get bowled over by rolling beeves hanging from racks or tumbling cabbages. The building is still there. It’s called <a href="http://query.nytimes.com/gst/fullpage.html?res=950DE3DB1F38F931A35751C0A96F948260">Westbeth Apartments</a>. It’s now an artist’s colony.</p>
<p class=pp><span style="font-size: 0.7em;">4:29</span>
Anyway, in West Street, there were four separate remote stations from which the complex calculator could be accessed. It was not time sharing. You actually reserved your time on the machine, and only one of the four terminals worked at a time.
In 1940, this machine was shown off to the world at the AMS annual convention, which happened to be held in Hanover at Dartmouth that year, and mathematicians could wonder at remote computing, doing computation on an electromechanical calculator at 300 miles away.</p>
<p class=pp><span style="font-size: 0.7em;">5:22</span>
Stibitz went on from there to make a whole series of relay machines. Many of them were made for the government during the war. They were named, imaginatively, Mark I through Mark VI.
I have read some of his patents. They’re kind of fun. One is a patent on conditional transfer. [laughter] And how do you do a conditional transfer?
Well these gadgets were, the relay calculator was run from your fingers, I mean the complex calculator.
The later calculators, of course, if your fingers were a teletype, you could perfectly well feed a paper tape in,
because that was standard practice. And these later machines were intended really to be run more from paper tape.
And the conditional transfer was this: you had two teletypes, and there’s a code that says "time to read from the other teletype". Loops were of course easy to do. You take paper and [laughter; presumably Doug curled a piece of paper to form a physical loop].
These machines never got to the point of having stored programs.
But they got quite big. I saw, one of them was here in 1954, and I did see it, behind glass, and if you’ve ever seen these machines in the, there’s one in the Franklin Institute in Philadelphia, and there’s one in the Science Museum in San Jose, you know these machines that drop balls that go wandering sliding around and turning battle wheels and ringing bells and who knows what. It kind of looked like that.
It was a very quiet room, with just a little clicking of relays, which is what a central office used to be like. It was the one air-conditioned room in Murray Hill, I think. This machine ran, the Mark VI, well I think that was the Mark V, the Mark VI actually went to Aberdeen.
This machine ran for a good number of years, probably six, eight.
And it is said that it never made an undetected error. [laughter]</p>
<p class=pp><span style="font-size: 0.7em;">8:30</span>
What that means is that it never made an error that it did not diagnose itself and stop.
Relay technology was very very defensive. The telephone switching system had to work. It was full of self-checking,
and so were the calculators, so were the calculators that Stibitz made.</p>
<p class=pp><span style="font-size: 0.7em;">9:04</span>
Arithmetic was done in bi-quinary, a two out of five representation for decimal integers, and if there weren’t exactly two out of five relays activated it would stop.
This machine ran unattended over the weekends. People would
bring their tapes in, and the operator would paste everybody’s tapes together.
There was a beginning of job code on the tape and there was also a time indicator.
If the machine ran out of time, it automatically stopped and went to the next job. If the machine caught itself in an error, it backed up to the current job and tried it again.
They would load this machine on Friday night, and on Monday morning, all the tapes, all the entries would be available on output tapes.</p>
<p class=pp>Question: I take it they were using a different representation for loops
and conditionals by then.</p>
<p class=pp>Doug: Loops were done actually by they would run back and forth across the tape now, on this machine.</p>
<p class=pp><span style="font-size: 0.7em;">10:40</span>
Then came the transistor in ’48.
At Whippany, they actually had a transistorized computer, which was a respectable minicomputer, a box about this big, running in 1954, it ran from 1954 to 1956 solidly as a test run.
The notion was that this computer might fly in an airplane.
And during that two-year test run, one diode failed.
In 1957, this machine called <a href="http://www.cedmagic.com/history/tradic-transistorized.html">TRADIC</a>, did in fact fly in an airplane, but to the best of my knowledge, that machine was a demonstration machine. It didn’t turn into a production machine.
About that time, we started buying commercial machines.
It’s wonderful to think about the set of different architectures that existed in that time. The first machine we got was called a <a href="http://www.columbia.edu/acis/history/cpc.html">CPC from IBM</a>. And all it was was a big accounting machine with a very special plugboard on the side that provided an interpreter for doing ten-digit decimal arithmetic, including
opcodes for the trig functions and square root.</p>
<p class=pp><span style="font-size: 0.7em;">12:30</span>
It was also not a computer as we know it today,
because it wasn’t stored program, it had twenty-four memory locations as I recall, and it took its program instead of from tapes, from cards. This was not a total advantage. A tape didn’t get into trouble if you dropped it on the floor. [laughter].
CPC, the operator would stand in front of it, and there, you
would go through loops by taking cards out, it took human intervention, to take the cards out of the output of the card reader and put them in the ?top?. I actually ran some programs on the CPC ?...?. It’s the kind of thing you can be nostalgic about, but it wasn’t actually fun.
[laughter]</p>
<p class=pp><span style="font-size: 0.7em;">13:30</span>
The next machine was an <a href="http://www.columbia.edu/acis/history/650.html">IBM 650</a>, and here, this was a stored program, with the memory being on drum. There was no operating system for it. It came with a manual: this is what the machine does. And Michael Wolontis made an interpreter called the <a href="http://hopl.info/showlanguage2.prx?exp=6497">L1 interpreter</a> for this machine, so you could actually program in, the manual told you how to program in binary, and L1 allowed you to give something like 10 for add and 9 for subtract, and program in decimal instead. And of course that machine required interesting optimization, because it was a nice thing if the next program step were stored somewhere -- each program step had the address of the following step in it, and you would try to locate them around the drum so to minimize latency. So there were all kinds of optimizers around, but I don’t think Bell Labs made ?...? based on this called "soap" from Carnegie Mellon. That machine didn’t last very long. Fortunately, a machine with core memory came out from IBM in about ’56, the 704. Bell Labs was a little slow in getting one, in ’58. Again, the machine came without an operating system.
In fact, but it did have Fortran, which really changed the world.
It suddenly made it easy to write programs. But the way Fortran came from IBM, it came with a thing called the Fortran Stop Book.
This was a list of what happened, a diagnostic would execute the halt instruction, the operator would go read the panel lights and discover where the machine had stopped, you would then go look up in the stop book what that meant.
Bell Labs, with George Mealy and Gwen Hanson, made an operating system, and one of the things they did was to bring the stop book to heel. They took the compiler, replaced all the stop instructions with jumps to somewhere, and allowed the program instead of stopping to go on to the next trial.
By the time I arrived at Bell Labs in 1958, this thing was running nicely.</p>
<p class=pp>[<i>McIlroy comments, 2015</i>: I’m pretty sure I was wrong in saying Mealy and Hanson brought
the stop book to heel. They built the OS, but I believe Dolores
Leagus tamed Fortran. (Dolores was the most accurate programmer I
ever knew. She’d write 2000 lines of code before testing a single
line--and it would work.)]</p>
<p class=pp><span style="font-size: 0.7em;">16:36</span>
Bell Labs continued to be a major player in operating systems.
This was called BESYS. BE was the SHARE abbreviation for Bell Labs. Each company that belonged to SHARE, which was the IBM users group, had a two-letter abbreviation. It’s hard to imagine taking all the computer users now and giving them a two-letter abbreviation. BESYS went through many generations, up to BESYS 5, I believe. Each one with innovations. IBM delivered a machine, the 7090, in 1960. This machine had interrupts in it, but IBM didn’t use them. But BESYS did. And that sent IBM back to the drawing board to make it work. [Laughter]</p>
<p class=pp><span style="font-size: 0.7em;">17:48</span>
Rob Pike: It also didn’t have memory protection.</p>
<p class=pp>Doug: It didn’t have memory protection either, and a lot of people actually got IBM to put memory protection in the 7090, so that one could leave the operating system resident in the presence of a wild program, an idea that the PC didn’t discover until, last year or something like that. [laughter]</p>
<p class=pp>Big players then, <a href="http://en.wikipedia.org/wiki/Richard_Hamming">Dick Hamming</a>, a name that I’m sure everybody knows,
was sort of the numerical analysis guru, and a seer.
He liked to make outrageous predictions. He predicted in 1960, that half of Bell Labs was going to be busy doing something with computers eventually.
?...? exaggerating some ?...? abstract in his thought.
He was wrong.
Half was a gross underestimate. Dick Hamming retired twenty years ago, and just this June he completed his full twenty years term in the Navy, which entitles him again to retire from the Naval Postgraduate Institute in Monterey. Stibitz, incidentally died, I think within the last year.
He was doing medical instrumentation at Dartmouth essentially, near the end.</p>
<p class=pp>[<i>McIlroy comments, 2015</i>: I’m not sure what exact unintelligible words I uttered about Dick
Hamming. When he predicted that half the Bell Labs budget would
be related to computing in a decade, people scoffed in terms like
“that’s just Dick being himself, exaggerating for effect”.]</p>
<p class=pp><span style="font-size: 0.7em;">20:00</span>
Various problems intrigued, besides the numerical problems, which in fact were stock in trade, and were the real justification for buying machines, until at least the ’70s I would say. But some non-numerical problems had begun to tickle the palate of the math department. Even G A Campbell got interested in graph theory, the reason being he wanted to think of all the possible ways you could take the three wires and the various parts of the telephone and connect them together, and try permutations to see what you could do about reducing sidetone by putting things into the various parts of the circuit, and devised every possible way of connecting the telephone up. And that was sort of the beginning of combinatorics at Bell Labs. John Riordan, a mathematician, parlayed this into a major subject. Two problems which are now deemed computing problems have intrigued the math department for a very long time, and those are the minimum spanning tree problem, and the wonderfully ?comment about Joe Kruskal, laughter?</p>
<p class=pp><span style="font-size: 0.7em;">21:50</span>
And in the 50s Bob Prim and Kruskal, who I don’t think worked at the Labs at that point, invented algorithms for the minimum spanning tree. Somehow or other, computer scientists usually learn these algorithms, one of the two at least, as Dijkstra’s algorithm, but he was a latecomer.</p>
<p class=pp>[<i>McIlroy comments, 2015</i>:
I erred in attributing Dijkstra’s algorithm to Prim and Kruskal. That
honor belongs to yet a third member of the math department: Ed
Moore. (Dijkstra’s algorithm is for shortest path, not spanning
tree.)]</p>
<p class=pp>Another pet was the traveling salesman. There’s been a long list of people at Bell Labs who played with that: Shen Lin and Ron Graham and David Johnson and dozens more, oh and ?...?. And then another problem is the Steiner minimum spanning tree, where you’re allowed to add points to the graph. Every one of these problems grew, actually had a justification in telephone billing. One jurisdiction or another would specify that the way you bill for a private line network was in one jurisdiction by the minimum spanning tree. In another jurisdiction, by the traveling salesman route. NP-completeness wasn’t a word in the vocabulary of lawmakers [laughter]. And the <a href="http://en.wikipedia.org/wiki/Steiner_tree">Steiner problem</a> came up because customers discovered they could beat the system by inventing offices in the middle of Tennessee that had nothing to do with their business, but they could put the office at a Steiner point and reduce their phone bill by adding to what the service that the Bell System had to give them. So all of these problems actually had some justification in billing besides the fun.</p>
<p class=pp><span style="font-size: 0.7em;">24:15</span>
Come the 60s, we actually started to hire people for computing per se. I was perhaps the third person who was hired with a Ph.D. to help take care of the computers and I’m told that the then director and head of the math department, Hendrik Bode, had said to his people, "yeah, you can hire this guy, instead of a real mathematician, but what’s he gonna be doing in five years?" [laughter]</p>
<p class=pp><span style="font-size: 0.7em;">25:02</span>
Nevertheless, we started hiring for real in about ’67. Computer science got split off from the math department. I had the good fortune to move into the office that I’ve been in ever since then. Computing began to make, get a personality of its own. One of the interesting people that came to Bell Labs for a while was Hao Wang. Is his name well known? [Pause] One nod. Hao Wang was a philosopher and logician, and we got a letter from him in England out of the blue saying "hey you know, can I come and use your computers? I have an idea about theorem proving." There was theorem proving in the air in the late 50s, and it was mostly pretty thin stuff. Obvious that the methods being proposed wouldn’t possibly do anything more difficult than solve tic-tac-toe problems by enumeration. Wang had a notion that he could mechanically prove theorems in the style of Whitehead and Russell’s great treatise Principia Mathematica in the early part of the century. He came here, learned how to program in machine language, and took all of Volume I of Principia Mathematica --
if you’ve ever hefted Principia, well that’s about all it’s good for, it’s a real good door stop. It’s really big. But it’s theorem after theorem after theorem in propositional calculus. Of course, there’s a decision procedure for propositional calculus, but he was proving them more in the style of Whitehead and Russell. And when he finally got them all coded and put them into the computer, he proved the entire contents of this immense book in eight minutes.
This was actually a neat accomplishment. Also that was the beginning of all the language theory. We hired people like <a href="http://www1.cs.columbia.edu/~aho/">Al Aho</a> and <a href="http://infolab.stanford.edu/~ullman/">Jeff Ullman</a>, who probed around every possible model of grammars, syntax, and all of the things that are now in the standard undergraduate curriculum, were pretty well nailed down here, on syntax and finite state machines and so on were pretty well nailed down in the 60s. Speaking of finite state machines, in the 50s, both Mealy and Moore, who have two of the well-known models of finite state machines, were here.</p>
<p class=pp><span style="font-size: 0.7em;">28:40</span>
During the 60s, we undertook an enormous development project in the guise of research, which was <a href="http://www.multicians.org/">MULTICS</a>, and it was the notion of MULTICS was computing was the public utility of the future. Machines were very expensive, and ?indeed? like you don’t own your own electric generator, you rely on the power company to do generation for you, and it was seen that this was a good way to do computing -- time sharing -- and it was also recognized that shared data was a very good thing. MIT pioneered this and Bell Labs joined in on the MULTICS project, and this occupied five years of system programming effort, until Bell Labs pulled out, because it turned out that MULTICS was too ambitious for the hardware at the time, and also with 80 people on it was not exactly a research project. But, that led to various people who were on the project, in particular <a href="http://en.wikipedia.org/wiki/Ken_Thompson">Ken Thompson</a> -- right there -- to think about how to -- <a href="http://en.wikipedia.org/wiki/Dennis_Ritchie">Dennis Ritchie</a> and Rudd Canaday were in on this too -- to think about how you might make a pleasant operating system with a little less resources.</p>
<p class=pp><span style="font-size: 0.7em;">30:30</span>
And Ken found -- this is a story that’s often been told, so I won’t go into very much of unix -- Ken found an old machine cast off in the corner, the <a href="http://en.wikipedia.org/wiki/PDP-7">PDP-7</a>, and put up this little operating system on it, and we had immense <a href="http://en.wikipedia.org/wiki/GE-600_series">GE635</a> available at the comp center at the time, and I remember as the department head, muscling in to use this little computer to be, to get to be Unix’s first user, customer, because it was so much pleasanter to use this tiny machine than it was to use the big and capable machine in the comp center. And of course the rest of the story is known to everybody and has affected all college campuses in the country.</p>
<p class=pp><span style="font-size: 0.7em;">31:33</span>
Along with the operating system work, there was a fair amount of language work done at Bell Labs. Often curious off-beat languages. One of my favorites was called <a href="http://hopl.murdoch.edu.au/showlanguage.prx?exp=6937&language=BLODI-B">Blodi</a>, B L O D I, a block diagram compiler by Kelly and Vyssotsky. Perhaps the most interesting early uses of computers in the sense of being unexpected, were those that came from the acoustics research department, and what the Blodi compiler was invented in the acoustic research department for doing digital simulations of sample data system. DSPs are classic sample data systems,
where instead of passing analog signals around, you pass around streams of numerical values. And Blodi allowed you to say here’s a delay unit, here’s an amplifier, here’s an adder, the standard piece parts for a sample data system, and each one was described on a card, and with description of what it’s wired to. It was then compiled into one enormous single straight line loop for one time step. Of course, you had to rearrange the code because some one part of the sample data system would feed another and produce really very efficient 7090 code for simulating sample data systems.
By and large, from that time forth, the acoustic department stopped making hardware. It was much easier to do signal processing digitally than previous ways that had been analog. Blodi had an interesting property. It was the only programming language I know where -- this is not my original observation, Vyssotsky said -- where you could take the deck of cards, throw it up the stairs, and pick them up at the bottom of the stairs, feed them into the computer again, and get the same program out. Blodi had two, aside from syntax diagnostics, it did have one diagnostic when it would fail to compile, and that was "somewhere in your system is a loop that consists of all delays or has no delays" and you can imagine how they handled that.</p>
<p class=pp><span style="font-size: 0.7em;">35:09</span>
Another interesting programming language of the 60s was <a href="http://www.knowltonmosaics.com/">Ken Knowlton</a>’s <a href="http://beflix.com/beflix.php">Beflix</a>. This was for making movies on something with resolution kind of comparable to 640x480, really coarse, and the
programming notion in here was bugs. You put on your grid a bunch of bugs, and each bug carried along some data as baggage,
and then you would do things like cellular automata operations. You could program it or you could kind of let it go by itself. If a red bug is next to a blue bug then it turns into a green bug on the following step and so on. <span style="font-size: 0.7em;">36:28</span> He and Lillian Schwartz made some interesting abstract movies at the time. It also did some interesting picture processing. One wonderful picture of a reclining nude, something about the size of that blackboard over there, all made of pixels about a half inch high each with a different little picture in it, picked out for their density, and so if you looked at it close up it consisted of pickaxes and candles and dogs, and if you looked at it far enough away, it was a <a href="http://blog.the-eg.com/2007/12/03/ken-knowlton-mosaics/">reclining nude</a>. That picture got a lot of play all around the country.</p>
<p class=pp>Lorinda Cherry: That was with Leon, wasn’t it? That was with <a href="https://en.wikipedia.org/wiki/Leon_Harmon">Leon Harmon</a>.</p>
<p class=pp>Doug: Was that Harmon?</p>
<p class=pp>Lorinda: ?...?</p>
<p class=pp>Doug: Harmon was also an interesting character. He did more things than pictures. I’m glad you reminded me of him. I had him written down here. Harmon was a guy who among other things did a block diagram compiler for writing a handwriting recognition program. I never did understand how his scheme worked, and in fact I guess it didn’t work too well. [laughter]
It didn’t do any production ?things? but it was an absolutely
immense sample data circuit for doing handwriting recognition.
Harmon’s most famous work was trying to estimate the information content in a face. And every one of these pictures which are a cliche now, that show a face digitized very coarsely, go back to Harmon’s <a href="https://web.archive.org/web/20080807162812/http://www.doubletakeimages.com/history.htm">first psychological experiments</a>, when he tried to find out how many bits of picture he needed to try to make a face recognizable. He went around and digitized about 256 faces from Bell Labs and did real psychological experiments asking which faces could be distinguished from other ones. I had the good fortune to have one of the most distinguishable faces, and consequently you’ll find me in freshman psychology texts through no fault of my own.</p>
<p class=pp><span style="font-size: 0.7em;">39:15</span>
Another thing going on the 60s was the halting beginning here of interactive computing. And again the credit has to go to the acoustics research department, for good and sufficient reason. They wanted to be able to feed signals into the machine, and look at them, and get them back out. They bought yet another weird architecture machine called the <a href="http://www.piercefuller.com/library/pb250.html">Packard Bell 250</a>, where the memory elements were <a href="http://en.wikipedia.org/wiki/Delay_line_memory">mercury delay lines</a>.</p>
<p class=pp>Question: Packard Bell?</p>
<p class=pp>Doug: Packard Bell, same one that makes PCs today.</p>
<p class=pp><span style="font-size: 0.7em;">40:10</span>
They hung this off of the comp center 7090 and put in a scheme for quickly shipping jobs into the job stream on the 7090. The Packard Bell was the real-time terminal that you could play with and repair stuff, ?...? off the 7090, get it back, and then you could play it. From that grew some graphics machines also, built by ?...? et al. And it was one of the old graphics machines
in fact that Ken picked up to build Unix on.</p>
<p class=pp><span style="font-size: 0.7em;">40:55</span>
Another thing that went on in the acoustics department was synthetic speech and music. <a href="http://csounds.com/mathews/index.html">Max Mathews</a>, who was the director of the department, has long been interested in computer music. In fact since retirement he spent a lot of time with Pierre Boulez in Paris at a wonderful institute with lots of money simply for making synthetic music. He had a language called Music 5. Synthetic speech or, well first of all simply speech processing was pioneered particularly by <a href="http://en.wikipedia.org/wiki/John_Larry_Kelly,_Jr">John Kelly</a>. I remember my first contact with speech processing. It was customary for computer operators, for the benefit of computer operators, to put a loudspeaker on the low bit of some register on the machine, and normally the operator would just hear kind of white noise. But if you got into a loop, suddenly the machine would scream, and this signal could be used to tell the operator "oh, the machine’s in a loop. Go stop it and go on to the next job." I remember feeding them an Ackermann’s function routine once. [laughter] They were right. It was a silly loop. But anyway. One day, the operators were ?...?. The machine started singing. Out of the blue. “Help! I’m caught in a loop.” [laughter] And in a broad Texas accent, which was the recorded voice of John Kelly.</p>
<p class=pp><span style="font-size: 0.7em;">43:14</span>
However. From there Kelly went on to do some speech synthesis. Of course there’s been a lot more speech synthesis work done since, by <span style="font-size: 0.7em;">43:31</span> folks like Cecil Coker, Joe Olive. But they produced a record, which unfortunately I can’t play because records are not modern anymore. And everybody got one: the Bell Labs Record, which is a magazine, once contained a record from the acoustics department, with both speech and music, and one very famous combination where the computer played and sang "A Bicycle Built For Two".</p>
<p class=pp>?...?</p>
<p class=pp><span style="font-size: 0.7em;">44:32</span>
At the same time as all this stuff is going on here, needless
to say computing is going on in the rest of the Labs. It was about early 1960 when the math department lost its monopoly on computing machines and other people started buying them too, but for switching. The first experimental switching computers were operational around 1960. They were planned for several years prior to that; essentially as soon as the transistor was invented, the making of electronic rather than electromechanical switching machines was anticipated. Part of the saga of the switching machines is cheap memory. These machines had enormous memories -- thousands of words. [laughter] And it was said that the present worth of each word of memory that programmers saved across the Bell System was something like eleven dollars, as I recall. And it was worthwhile to struggle to save some memory. Also, programs were permanent. You were going to load up the switching machine with its switching program and that was going to run. You didn’t change it every minute or two. And it would be cheaper to put it in read-only memory than in core memory. And there was a whole series of wild read-only memories, both tried and built.
The first experimental Essex System had a thing called the flying spot store
which was large photographic plates with bits on them and CRTs projecting on the plates and you would detect underneath on the photodetector whether the bit was set or not. That was the program store of Essex. The program store of the first ESS systems consisted of twistors, which I actually am not sure I understand to this day, but they consist of iron wire with a copper wire wrapped around them and vice versa. There were also experiments with an IC type memory called the waffle iron. Then there was a period when magnetic bubbles were all the rage. As far as I know, although microelectronics made a lot of memory, most of the memory work at Bell Labs has not had much effect on ?...?. Nice tries though.</p>
<p class=pp><span style="font-size: 0.7em;">48:28</span>
Another thing that folks began to work on, and of course right from the start, was the application of computers to data processing. When you own equipment scattered through every street in the country, and you have a hundred million customers, and you have bills for a hundred million transactions a day, there’s really some big data processing going on. And indeed in the early 60s, AT&T was thinking of making its own data processing computers solely for billing. Somehow they pulled out of that, and gave all the technology to IBM, and one piece of that technology went into use in high-end equipment called tractor tapes: inch-wide magnetic tapes that were used for a while.</p>
<p class=pp><span style="font-size: 0.7em;">49:50</span>
By and large, although Bell Labs has participated until fairly recently in data processing in quite a big way, AT&T never really quite trusted the Labs to do it right, because here is where the money is. I can recall one occasion when, during a strike, a temporary fill-in employee from the Laboratories lost a day’s billing tape in Chicago. And that was a million dollars. And that’s why, generally speaking, the money people did not until fairly recently trust Bell Labs to take good care of money, even though they trusted the Labs very well to make extremely reliable computing equipment for switches.
The downtime on switches is still spectacular by any industry standards. The design for the first ones was two hours down in 40 years, and the design was met. Great emphasis on reliability, redundancy, and testing.</p>
<p class=pp><span style="font-size: 0.7em;">51:35</span>
Another branch of computing was for the government: the whole Whippany Laboratory [time check], where we took on contracts for the government, particularly in the computing era in anti-missile defense and underwater sound. Missile defense was a very impressive undertaking. It was about in the early ’63 time frame when it was estimated that the amount of computation to do a reasonable job of tracking incoming missiles would be 30 M floating point operations a second. In the day of the Cray that doesn’t sound like a great lot, but it’s more than your high-end PCs can do. And the machines were supposed to be reliable. They designed the machines at Whippany, a twelve-processor multiprocessor, to no specs, enormously rugged, one watt transistors. This thing in real life performed remarkably well. There were sixty-five missile shots, tests across the Pacific Ocean ?...? and Lorinda Cherry here actually sat there waiting for them to come in. [laughter] And only a half dozen of them really failed. As a measure of the interest in reliability, one of them failed apparently due to processor error. Enormous amounts of telemetry and logging information were taken during these tests, which were truly expensive to run, and two people were assigned to look at the dumps. A year later they had not found the trouble. The team was beefed up. They finally decided that there was a race condition in one circuit. They then realized that this particular kind of race condition had not been tested for in all the simulations. They went back and simulated the entire hardware system to see if there was a remote possibility of any similar cases, found twelve of them, and changed the hardware. But to spend over a year looking for a bug is a sign of what reliability meant.</p>
<p class=pp><span style="font-size: 0.7em;">54:56</span>
Since I’m coming up on the end of an hour, one could go on and on and on,</p>
<p class=pp>Crowd: go on, go on. [laughter]</p>
<p class=pp><span style="font-size: 0.7em;">55:10</span>
Doug: I think I’d like to end up by mentioning a few of the programs that have been written at Bell Labs that I think are most surprising. Of course there are lots of grand programs that have been written.</p>
<p class=pp>I already mentioned the block diagram compiler.</p>
<p class=pp>Another really remarkable piece of work was <a href="eqn.pdf">eqn</a>, the equation
typesetting language by Lorinda Cherry and Brian Kernighan, which has been imitated since. The notion of taking an auditory syntax, the way people talk about equations (but only talk; this was not borrowed from any written notation before), and getting the auditory form down on paper, that was very successful and surprising.</p>
<p class=pp>Another of my favorites, and again Lorinda Cherry was in this one, with Bob Morris, was typo. This was a program for finding spelling errors. It didn’t know the first thing about spelling. It would read a document, measure its statistics, and print out the words of the document in increasing order of the likelihood, as it judged, of that word having come from the same statistical source as the document. The words that did not come from the statistical source of the document were likely to be typos, and here I mean typos as distinct from spelling errors: typos are where you actually hit the wrong key. Those tend to be off the wall, whereas phonetic spelling errors you’ll never find. And this worked remarkably well. Typing errors would come right up to the top of the list. A really really neat program.</p>
<p class=pp><span style="font-size: 0.7em;">57:50</span>
Another one of my favorites was by Brenda Baker called <a href="http://doi.acm.org/10.1145/800168.811545">struct</a>, which took Fortran programs and converted them into a structured programming language called Ratfor, which was Fortran with C syntax. This seemed like a possible undertaking, like something you do by the seat of the pants and you get something out. In fact, folks at Lockheed had done things like that before. But Brenda managed to find theorems that said there’s really only one canonical form into which you can structure a Fortran program, and she did this. It took your Fortran program, completely mashed it, put it out, almost certainly in a different order than the original Fortran connected by GOTOs, but without any GOTOs, and the really remarkable thing was that authors of the program, who clearly knew the way they wrote it in the first place, preferred it after it had been rearranged by Brenda. I was astonished at the outcome of that project.</p>
<p class=pp><span style="font-size: 0.7em;">59:19</span>
Another first that happened around here was by Fred Grampp, who got interested in computer security. One day he decided he would make a program for sniffing the security arrangements on a computer, as a service: Fred would never do anything crooked. [laughter] This particular program did a remarkable job, and founded a whole minor industry within the company. A department was set up to take this idea and parlay it, and indeed ever since there has been some improvement in the way computer centers are managed, at least until we got Berkeley Unix.</p>
<p class=pp><span style="font-size: 0.7em;">60:24</span>
And the last interesting program that I have time to mention is one by <a href="http://www.cs.jhu.edu/~kchurch/">Ken Church</a>. He was dealing with -- text processing has always been a continuing ?...? of the research, and in some sense it has an application to our business because we’re handling speech, but he got into consulting with the department in North Carolina that has to translate manuals. There are millions of pages of manuals in the Bell System and its successors, and ever since we’ve gone global, these things had to get translated into many languages.</p>
<p class=pp><span style="font-size: 0.7em;">61:28</span>
To help in this, he was making tools which would quickly put up on the screen a piece of text and its translation, because a translator, particularly a technical translator, wants to know: the last time we mentioned this word, how was it translated? You don’t want to be creative in translating technical text. You’d like to be able to go back into the archives and pull up examples of translated text. And the neat thing here is the idea for how you align texts in two languages. You’ve got the original, you’ve got the translated one; how do you bring up on the screen the two sentences that go together? And the following scam worked beautifully. This is on western languages. <span style="font-size: 0.7em;">62:33</span>
Simply look for common four-letter tetragrams, four-letter combinations between the two, and as best as you can, line them up as nearly linearly with the lengths of the two texts as possible. And this <a href="church-tetragram.pdf">very simple idea</a> works like a storm. Something for nothing. I like that.</p>
<p class=pp><span style="font-size: 0.7em;">63:10</span>
The last thing is one slogan that sort of got started with Unix and is just rife within the industry now: software tools. We were making software tools in Unix before we knew we were, just like the Molière character was amazed at discovering he’d been speaking prose all his life. [laughter] But then <a href="http://www.amazon.com/-/dp/020103669X">Kernighan and Plauger</a> came along and christened what was going on: making simple, generally useful, compositional programs that do one thing and do it well and fit together. They called it software tools, wrote a book, and this notion now is abroad in the industry. And it really did begin all up in the little attic room where you [points?] sat for many years writing up here.</p>
<p class=pp> Oh I forgot to. I haven’t used any slides. I’ve brought some, but I don’t like looking at bullets and you wouldn’t either, and I forgot to show you the one exhibit I brought, which I borrowed from Bob Kurshan. When Bell Labs was founded, it had of course some calculating machines, and it had one wonderful computer. This. That was bought in 1918. There’s almost no other computing equipment from any time prior to ten years ago that still exists in Bell Labs. This is an <a href="http://infolab.stanford.edu/pub/voy/museum/pictures/display/2-5-Mechanical.html">integraph</a>. It has two styluses. You trace a curve on a piece of paper with one stylus and the other stylus draws the indefinite integral here. There was somebody in the math department who gave this service to the whole company, with about 24 hours turnaround time, calculating integrals. Our recent vice president Arno Penzias actually did, he calculated integrals differently, with a different background. He had a chemical balance, and he cut the curves out of the paper and weighed them. This was bought in 1918, so it’s eighty years old. It used to be shiny metal, it’s a little bit rusty now. But it still works.</p>
<p class=pp><span style="font-size: 0.7em;">66:30</span>
Well, that’s a once-over-lightly of a whole lot of things that have gone on at Bell Labs. It’s just such a fun place that, as I said, I could just go on and on. If you’re interested, there actually is a history written. This is only one of about six volumes; <a href="http://www.amazon.com/gp/product/0932764061">this</a> is the one that has the mathematical computer sciences, the kind of things that I’ve mostly talked about here. A few people have copies of them. For some reason, the AT&T publishing house thinks that because they’re history they’re obsolete, and they stopped printing them. [laughter]</p>
<p class=pp>Thank you, and that’s all.</p></p>
Using Uninitialized Memory for Fun and Profit
tag:research.swtch.com,2012:research.swtch.com/sparse
2008-03-14T00:00:00-04:00
2008-03-14T00:00:00-04:00
An unusual but very useful data structure
<p><p class=lp>
This is the story of a clever trick that's been around for
at least 35 years, in which array values can be left
uninitialized and then read during normal operations,
yet the code behaves correctly no matter what garbage
is sitting in the array.
Like the best programming tricks, this one is the right tool for the
job in certain situations.
The sleaziness of uninitialized data
access is offset by performance improvements:
some important operations change from linear
to constant time.
</p>
<p class=pp>
Alfred Aho, John Hopcroft, and Jeffrey Ullman's 1974 book
<i>The Design and Analysis of Computer Algorithms</i>
hints at the trick in an exercise (Chapter 2, exercise 2.12):
</p>
<blockquote>
Develop a technique to initialize an entry of a matrix to zero
the first time it is accessed, thereby eliminating the <i>O</i>(||<i>V</i>||<sup>2</sup>) time
to initialize an adjacency matrix.
</blockquote>
<p class=lp>
Jon Bentley's 1986 book <a href="http://www.cs.bell-labs.com/cm/cs/pearls/"><i>Programming Pearls</i></a> expands
on the exercise (Column 1, exercise 8; <a href="http://www.cs.bell-labs.com/cm/cs/pearls/sec016.html">exercise 9</a> in the Second Edition):
</p>
<blockquote>
One problem with trading more space for less time is that
initializing the space can itself take a great deal of time.
Show how to circumvent this problem by designing a technique
to initialize an entry of a vector to zero the first time it is
accessed. Your scheme should use constant time for initialization
and each vector access; you may use extra space proportional
to the size of the vector. Because this method reduces
initialization time by using even more space, it should be
considered only when space is cheap, time is dear, and
the vector is sparse.
</blockquote>
<p class=lp>
Aho, Hopcroft, and Ullman's exercise talks about a matrix and
Bentley's exercise talks about a vector, but for now let's consider
just a simple set of integers.
</p>
<p class=pp>
One popular representation of a set of <i>n</i> integers ranging
from 0 to <i>m</i> is a bit vector, with 1 bits at the
positions corresponding to the integers in the set.
Adding a new integer to the set, removing an integer
from the set, and checking whether a particular integer
is in the set are all very fast constant-time operations
(just a few bit operations each).
Unfortunately, two important operations are slow:
iterating over all the elements in the set
takes time <i>O</i>(<i>m</i>), as does clearing the set.
If the common case is that
<i>m</i> is much larger than <i>n</i>
(that is, the set is only sparsely
populated) and iterating or clearing the set
happens frequently, then it could be better to
use a representation that makes those operations
more efficient. That's where the trick comes in.
</p>
<p class=pp>
Preston Briggs and Linda Torczon's 1993 paper,
“<a href="http://citeseer.ist.psu.edu/briggs93efficient.html"><b>An Efficient Representation for Sparse Sets</b></a>,”
describes the trick in detail.
Their solution represents the sparse set using an integer
array named <code>dense</code> and an integer <code>n</code>
that counts the number of elements in <code>dense</code>.
The <i>dense</i> array is simply a packed list of the elements in the
set, stored in order of insertion.
If the set contains the elements 5, 1, and 4, then <code>n = 3</code> and
<code>dense[0] = 5</code>, <code>dense[1] = 1</code>, <code>dense[2] = 4</code>:
</p>
<center>
<img src="http://research.swtch.com/sparse0.png" />
</center>
<p class=pp>
Together <code>n</code> and <code>dense</code> are
enough information to reconstruct the set, but this representation
is not very fast.
To make it fast, Briggs and Torczon
add a second array named <code>sparse</code>
which maps integers to their indices in <code>dense</code>.
Continuing the example,
<code>sparse[5] = 0</code>, <code>sparse[1] = 1</code>,
<code>sparse[4] = 2</code>.
Essentially, the set is a pair of arrays that point at
each other:
</p>
<center>
<img src="http://research.swtch.com/sparse0b.png" />
</center>
<p class=pp>
Adding a member to the set requires updating both of these arrays:
</p>
<pre class=indent>
add-member(i):
dense[n] = i
sparse[i] = n
n++
</pre>
<p class=lp>
It's not as efficient as flipping a bit in a bit vector, but it's
still very fast and constant time.
</p>
<p class=pp>
To check whether <code>i</code> is in the set, you verify that
the two arrays point at each other for that element:
</p>
<pre class=indent>
is-member(i):
return sparse[i] < n && dense[sparse[i]] == i
</pre>
<p class=lp>
If <code>i</code> is not in the set, then <i>it doesn't matter what <code>sparse[i]</code> is set to</i>:
either <code>sparse[i]</code>
will be at least <code>n</code>, or it will point at a value in
<code>dense</code> that doesn't point back at it.
Either way, we're not fooled. For example, suppose <code>sparse</code>
actually looks like:
</p>
<center>
<img src="http://research.swtch.com/sparse1.png" />
</center>
<p class=lp>
<code>Is-member</code> knows to ignore
entries of <code>sparse</code> that point past <code>n</code> or that
point at cells in <code>dense</code> that don't point back
(the grayed-out entries here):
</p>
<center>
<img src="http://research.swtch.com/sparse2.png" />
</center>
<p class=pp>
Notice what just happened:
<code>sparse</code> can have <i>any arbitrary values</i> in
the positions for integers not in the set,
those values actually get used during membership
tests, and yet the membership test behaves correctly!
(This would drive <a href="http://valgrind.org/">valgrind</a> nuts.)
</p>
<p class=pp>
Clearing the set can be done in constant time:
</p>
<pre class=indent>
clear-set():
n = 0
</pre>
<p class=lp>
Zeroing <code>n</code> effectively clears
<code>dense</code> (the code only ever accesses
entries in <code>dense</code> with indices less than <code>n</code>), and
<code>sparse</code> can be uninitialized, so there's no
need to clear out the old values.
</p>
<p class=pp>
This sparse set representation has one more trick up its sleeve:
the <code>dense</code> array allows an
efficient implementation of set iteration.
</p>
<pre class=indent>
iterate():
for(i=0; i<n; i++)
yield dense[i]
</pre>
<p class=pp>
Let's compare the run times of a bit vector
implementation against the sparse set:
</p>
<center>
<table>
<tr>
<td><i>Operation</i>
<td align=center width=10>
<td align=center><i>Bit Vector</i>
<td align=center width=10>
<td align=center><i>Sparse set</i>
</tr>
<tr>
<td>is-member
<td>
<td align=center><i>O</i>(1)
<td>
<td align=center><i>O</i>(1)
</tr>
<tr>
<td>add-member
<td>
<td align=center><i>O</i>(1)
<td>
<td align=center><i>O</i>(1)
</tr>
<tr>
<td>clear-set
<td><td align=center><i>O</i>(<i>m</i>)
<td><td align=center><i>O</i>(1)
</tr>
<tr>
<td>iterate
<td><td align=center><i>O</i>(<i>m</i>)
<td><td align=center><i>O</i>(<i>n</i>)
</tr>
</table>
</center>
<p class=lp>
The sparse set is as fast or faster than bit vectors for
every operation. The only problem is the space cost:
two words replace each bit.
Still, there are times when the speed differences are enough
to balance the added memory cost.
Briggs and Torczon point out that liveness sets used
during register allocation inside a compiler are usually
small and are cleared very frequently, making sparse sets the
representation of choice.
</p>
<p class=pp>
Another situation where sparse sets are the better choice
is work queue-based graph traversal algorithms.
Iteration over sparse sets visits elements
in the order they were inserted (above, 5, 1, 4),
so that new entries inserted during the iteration
will be visited later in the same iteration.
In contrast, iteration over bit vectors visits elements in
integer order (1, 4, 5), so that new elements inserted
during traversal might be missed, requiring repeated
iterations.
</p>
<p class=pp>
Returning to the original exercises, it is trivial to change
the set into a vector (or matrix) by making <code>dense</code>
an array of index-value pairs instead of just indices.
Alternately, one might add the value to the <code>sparse</code>
array or to a new array.
The relative space overhead isn't as bad if you would have been
storing values anyway.
</p>
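<p class=pp>
As a sketch of that extension (again in Go, with type and method names of our own invention), <code>dense</code> can hold index–value pairs, and an entry never set reads as zero, which is exactly the lazy "initialize on first access" behavior the exercises ask for:
</p>

```go
package main

import "fmt"

// SparseVector extends the sparse-set trick from a set to a vector
// by storing index–value pairs in dense.
type SparseVector struct {
	dense  []struct{ index, value int }
	sparse []int
	n      int
}

func NewSparseVector(m int) *SparseVector {
	return &SparseVector{
		dense:  make([]struct{ index, value int }, m),
		sparse: make([]int, m),
	}
}

func (v *SparseVector) Set(i, x int) {
	if j := v.sparse[i]; j < v.n && v.dense[j].index == i {
		v.dense[j].value = x // already present: overwrite
		return
	}
	v.dense[v.n] = struct{ index, value int }{i, x}
	v.sparse[i] = v.n
	v.n++
}

// Get returns 0 for entries never set, without any up-front
// initialization pass over the arrays.
func (v *SparseVector) Get(i int) int {
	if j := v.sparse[i]; j < v.n && v.dense[j].index == i {
		return v.dense[j].value
	}
	return 0
}

func main() {
	v := NewSparseVector(100)
	v.Set(5, 50)
	v.Set(1, 10)
	fmt.Println(v.Get(5), v.Get(1), v.Get(7)) // 50 10 0
}
```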
<p class=pp>
Briggs and Torczon's paper implements additional set
operations and examines performance speedups from
using sparse sets inside a real compiler.
</p></p>
Play Tic-Tac-Toe with Knuth
tag:research.swtch.com,2012:research.swtch.com/tictactoe
2008-01-25T00:00:00-05:00
2008-01-25T00:00:00-05:00
The only winning move is not to play.
<p><p class=lp>Section 7.1.2 of the <b><a href="http://www-cs-faculty.stanford.edu/~knuth/taocp.html#vol4">Volume 4 pre-fascicle 0A</a></b> of Donald Knuth's <i>The Art of Computer Programming</i> is titled “Boolean Evaluation.” In it, Knuth considers the construction of a set of nine boolean functions telling the correct next move in an optimal game of tic-tac-toe. In a footnote, Knuth tells this story:</p>
<blockquote><p class=lp>This setup is based on an exhibit from the early 1950s at the Museum of Science and Industry in Chicago, where the author was first introduced to the magic of switching circuits. The machine in Chicago, designed by researchers at Bell Telephone Laboratories, allowed me to go first; yet I soon discovered there was no way to defeat it. Therefore I decided to move as stupidly as possible, hoping that the designers had not anticipated such bizarre behavior. In fact I allowed the machine to reach a position where it had two winning moves; and it seized <i>both</i> of them! Moving twice is of course a flagrant violation of the rules, so I had won a moral victory even though the machine had announced that I had lost.</p></blockquote>
<p class=lp>
That story alone is fairly amusing. But turning the page, the reader finds a quotation from Charles Babbage's <i><a href="http://onlinebooks.library.upenn.edu/webbin/book/lookupid?key=olbp36384">Passages from the Life of a Philosopher</a></i>, published in 1864:</p>
<blockquote><p class=lp>I commenced an examination of a game called “tit-tat-to” ... to ascertain what number of combinations were required for all the possible variety of moves and situations. I found this to be comparatively insignificant. ... A difficulty, however, arose of a novel kind. When the automaton had to move, it might occur that there were two different moves, each equally conducive to his winning the game. ... Unless, also, some provision were made, the machine would attempt two contradictory motions.</p></blockquote>
<p class=lp>
The only real winning move is not to play.</p></p>
Crabs, the bitmap terror!
tag:research.swtch.com,2012:research.swtch.com/crabs
2008-01-09T00:00:00-05:00
2008-01-09T00:00:00-05:00
A destructive, pointless violation of the rules
<p><p class=lp>Today, window systems seem as inevitable as hierarchical file systems, a fundamental building block of computer systems. But it wasn't always that way. This paper could only have been written in the beginning, when everything about user interfaces was up for grabs.</p>
<blockquote><p class=lp>A bitmap screen is a graphic universe where windows, cursors and icons live in harmony, cooperating with each other to achieve functionality and esthetics. A lot of effort goes into making this universe consistent, the basic law being that every window is a self contained, protected world. In particular, (1) a window shall not be affected by the internal activities of another window. (2) A window shall not be affected by activities of the window system not concerning it directly, i.e. (2.1) it shall not notice being obscured (partially or totally) by other windows or obscuring (partially or totally) other windows, (2.2) it shall not see the <i>image</i> of the cursor sliding on its surface (it can only ask for its position).</p>
<p class=pp>
Of course it is difficult to resist the temptation to break these rules. Violations can be destructive or non-destructive, useful or pointless. Useful non-destructive violations include programs printing out an image of the screen, or magnifying part of the screen in a <i>lens</i> window. Useful destructive violations are represented by the <i>pen</i> program, which allows one to scribble on the screen. Pointless non-destructive violations include a magnet program, where a moving picture of a magnet attracts the cursor, so that one has to continuously pull away from it to keep working. The first pointless, destructive program we wrote was <i>crabs</i>.</p>
</blockquote>
<p class=lp>As the crabs walk over the screen, they leave gray behind, “erasing” the apps underfoot:</p>
<blockquote><img src="http://research.swtch.com/crabs1.png">
</blockquote>
<p class=lp>
For the rest of the story, see Luca Cardelli's “<a style="font-weight: bold;" href="http://lucacardelli.name/Papers/Crabs.pdf">Crabs: the bitmap terror!</a>” (6.7MB). Additional details in “<a href="http://lucacardelli.name/Papers/Crabs%20%28History%20and%20Screen%20Dumps%29.pdf">Crabs (History and Screen Dumps)</a>” (57.1MB).</p></p>