[ next | prev | up ]

Leaping Years and Drawing Lines

Some of the earliest mathematics was invented to design calendars. Since the Earth rotates around its axis about 365.25 times per revolution around the sun, some years in a solar calendar need to be 365 days long and others 366 days long in order to keep the calendar in sync with the sun. The key challenge in designing a calendar is to find a pattern of short and long years that achieve the desired approximation. The Julian calendar uses the simple pattern of one leap year every four years. The Islamic calendar, which syncs against the moon, needs to have years approximating 354.37 days per year. To do this, it inserts 11 leap years over a 30 year cycle.

Bresenham's algorithm chooses which pixels to use when drawing a bitmap representation of a line. The basic idea, for a mostly horizontal line that angles upward, is to move left to right, one pixel at a time, moving upward when the distance between the actual y and the pixel-approximate y has grown too large.

drawline(dx, dy)    /* mostly horizontal line, upward line */
    x = 0
    y = 0
    error = 0
    while (x < dx || y < dy)
        drawpoint(x, y);
        x++;
        if (x*(dy/dx) - y > 0.5)
            y++;
    drawpoint(x, y);

For example, drawing a line 30 pixels long and 12 pixels high produces

In their 2004 paper “Line Drawing, Leap Years, and Euclid” (PDF), Mitchell Harris and Edward Reingold show that the problem of choosing a leap year pattern is a generalized version of Bresenham's line drawing problem. This is surprising, but makes intuitive sense when you realize that both are concerned with computing an integer approximation to a rational fraction as evenly as possible. Leap year computations are basically drawing a line with slope 1/365.25 (or 1/364.37): each pixel is a day, and its y coordinate determines which year it belongs to.

Harris and Reingold go on to show that Euclid's algorithm can compute leap year patterns! Euclid's algorithm usually computes the greatest common divisor of two numbers:

gcd(u, v)
    while (u != v)
        if (u > v)
            u = u mod v;
        else
            v = v mod u;
    return u;

The connection is that if you're trying to draw a line dx by dy but dx and dy have some common divisor d, then the Bresenham algorithm will simply emit the pattern for a line dx/d by dy/d, d times. The example shown above can be viewed as two 15x6 lines, three 10x4 lines, or six 5x2 lines:

Euclid's algorithm gives the largest possible d, and you can adapt the algorithm to build the associated repeating pattern while it computes d.

In Volume 2, Donald Knuth calls it “the granddaddy of all algorithms, because it is the oldest nontrivial algorithm that has survived to the present day.” It also seems to have some famous children.


Further reading:

Bresenham described his algorithm in his 1965 paper “Algorithm for computer control of a digital plotter

Godfried Toussaint's 2005 paper “The Euclidean algorithm generates traditional musical rhythms” shows yet another pattern connection, this time to rhythms.

The Gregorian calendar is an evolved version of the Julian calendar and can't be viewed as a Bresenham line. However, Jeffrey Shallit's 1994 paper “Pierce expansions and rules for the determination of leap years” shows how to compute leap years using Pierce expansions, and this model can generate the Gregorian calendar too.

Edward Reingold and Nashum Dershowitz are the experts on calendar algorithms. Their pair of papers 1990 Calendrical Calculations and 1993 Calendrical Calculations, II: Three Historical Calendars gives algorithms (and Lisp code) for essentially any date-related computation you might want to do. They expanded upon these papers in their books Calendrical Calculations: The Millennium Edition and Calendrical Tabulations: 1900-2200.

Face the Nation

The Hideous Name” makes mention of a face server, perhaps the first user-level file system. The face server's main purpose was to provide faces for vismon, which showed a face for each message in a user's (electronic) mail box. Vismon is described in Rob Pike and Dave Presotto's 1985 Usenix paper “Face the Nation.”

(The appearance of grey scale is due only to the image being shrunk. Click on the image for the full-size version.) This example shows the time, system load, a graph of cpu time, faces for the five most recent emails, and the names of the two most recent senders.

The AT&T 5620 Retrocomputing page contains a file tree used in the commercially distributed system, including the face set:

(The mailbox with the tail, labeled md, is actually mailer-daemon, but that label didn't fit.)

The faces include Brian Kernighan (bwk), Dennis Ritchie (dmr), Ken Thompson (ken), and Rob Pike (rob). Curiously, Peter Weinberger's face is missing. It was featured prominently in the paper and in many other contexts, including water towers. It is also used as the default “unknown” icon in vismon. Here it is in its two vismon instantiations:

The spirit of vismon lives on in Plan 9's faces (née seemail), Unix's faces, and OS X's MailGlance, among others.

Vismon also inspired in 1987 the creation of the Usenix FaceSaver project. Digging around in the archives you can find pictures of luminaries such as Kirk McKusick, Dennis Ritchie, Henry Spencer, and Larry Wall.

The Picons Archive has even more pictures.

Permissive Access Links

In 2004, Steve Bellovin gave a talk at Usenix Security speculating about permissive access links (PALs), the (supposedly impossible to bypass) locks that protect nuclear weapons. He repeated the talk in 2006 at the general Usenix. In Bellovin's talk, “Nuclear Weapons, Permissive Action Links, and the History of Public Key Cryptography” (MP3; also PDF and HTML), he says that “Bypassing a PAL should be, as one weapons designer graphically put it, about as complex as performing a tonsillectomy while entering the patient from the wrong end.” But how do they work? Are there lessons that apply to building other kinds of secure systems? He touches on these questions, but in the end, it's mostly speculation. Even so, it's a fascinating talk.

He does tease out a few interesting historical details. In particular, National Security Action Memorandum 160, signed by President Kennedy, has been claimed by former NSA insiders to be the impetus for the NSA's invention of public key cryptography. There is no evidence that public key cryptography ended up being used in PALs, but it's possible that digital signatures were invented in direct response to the requirement that, after a weapon was launched, it be possible to determine who authorized the launch. It's also possible that public key cryptography was invented and used to transmit the PAL codes securely.

Other interesting facts. The U.S. offered PALs to the Soviets (presumably to keep weapons from falling into other hands), but they turned them down. For years after the initial U.S. PAL deployments, the launch codes were all set to 00000000. The bandwidth of the extra-long frequency extremely low frequency (ELF) communication link to submerged submarines is 1 bit/minute.

Elegance and Power

These are the most elegant programs I've ever seen.

A power series is an infinite-degree polynomial. For example,


e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots

Computation and manipulation of power series has a long and distinguished history as a test of so-called stream processing systems, which manipulate (arbitrarily long finite prefixes of) infinite streams. In the 1970s, Gilles Kahn and David MacQueen used power series as an unpublished test of their coroutine-based stream-processing system. Hal Abelson and Gerald Sussman devote Section 3.5 of their well-known 1985 textbook Structure and Interpretation of Computer Programs to stream processing, although only a single exercise mentions power series. In the late 1980s, Doug McIlroy explored power series processing in the context of Rob Pike's concurrent programming language Newsqueak. He published the work in his 1990 paper “Squinting at Power Series” (the Newsqueak interpreter is named squint).

Stream processing requires lazy evaluation to keep computations finite. All of these implementations built frameworks for lazy evaluation atop other building blocks: concurrent processes (Kahn and MacQueen, McIlroy) or first-class functions (Abelson and Sussman). It's only natural, then, to consider what the implementations would look like in a language with explicit support and encouragement for lazy evaluation, a language like Haskell. McIlroy revisited the topic in the context of Haskell in his 1998 Functional Pearl, “Power Series, Power Serious” (gzipped PS).

Symbolic manipulations can represent a power series as just a list of integer coefficients. Mathematically, this is equivalent to writing the polynormial in Horner form, which repeatedly factors x out of the remainder of the polynomial: F = f0 + xF1, or in Haskell, (f:fs) (= f + x fs).



In this form, ex is [1, 1, 1%2, 1%6, 1%24, 1%120, ...].

As a simple example, McIlroy's addition and multiplication implementations mimic the usual rules:


(f0 + xF1) + (g0 + xG1) = (f0 + g0) + x(F1 + G1)


  (f:fs) + (g:gs) = f+g : fs+gs

(f0 + xF1) (g0 + xG1) = f0g0 + x(f0G1 + g0F1) + x2(F1G1)


{- (f:fs) * (g:gs) = (f*g : 0) + (0 : f.*gs + g.*fs) + (0 : 0 : fs*gs) -}
   (f:fs) * (g:gs) = f*g : (f.*gs + g.*fs + (0 : fs*gs))
       c .* (f:fs) = c.*f : c.*fs

The commented-out definition is a more literal, but less efficient, translation of the equation.

Haskell's pattern matching and operator overloading are responsible for the elegance of the presentation, but it is lazy evaluation that is responsible for the elegance of the computation. The underlying lazy computation mechanism (whether provided directly, as in Haskell, or via other primitives, as in Scheme and Newsqueak) takes care of all the bookkeeping necessary to compute only the terms needed.

McIlroy implements derivative and integral operators as well. Both rely on a helper function to keep track of the leading multiplicative constant:


deriv (f:fs) = (deriv1 fs 1) where
    deriv1 (g:gs) n = n*g : (deriv1 gs (n+1))

integral fs = 0 : (int1 fs 1) where
    int1 (g:gs) n = g/n : (int1 gs (n+1))

These definitions enable elegant definitions of the power series for exp, sin, and cos:


expx = 1 + (integral expx)

sinx = integral cosx
cosx = 1 - (integral sinx)

To me, these three equations programs are beautifully elegant, almost magical.

When learning recursion in an introductory programming course (or induction in an introductory math course), it is common to feel like there's no actual work going on: one case just got rewritten in terms of others. The base cases, of course, provide the foundation on which the others are built. Learning to determine which base cases are necessary given the recursive steps is the key to being comfortable with recursion. Only then can you see that the program or proof is structurally sound.

The recursion in the definitions above gives me the same introductary queasiness, because it is hard to see the base cases. Upon closer inspection, the base case is in the expansion of integral, which begins with a constant zero term, making it possible to determine the first term in each of those series without further recursion. Having the first term makes it possible to find the second term, and so on.

For me, understanding why they work only enhances the beauty of these programs.

McIlroy continues the theme in his 2000 ICFP paper “The Music of Streams” (gzipped PS), which contains even more examples but fewer programs.

I've extracted runnable code from the paper. McIlroy maintains equivalent code. McIlroy's version uses these shorter versions of integral and deriv, which will delight Haskell aficionados:


int fs = 0 : zipWith (/) fs [1..]
diff (_:ft) = zipWith (*) ft [1..]

For the numerical computing aficionados, there is a one-line implementation of power series reversion:


revert (0:ft) = rs where rs = 0 : 1/(ft#rs)

(For comparison, Donald Knuth devotes Section 4.7 of Volume 2, about 10 pages, to the manipulation of power series. The equivalent reversion implementation takes about half a page.)

Worms and Distributed Computations

John Shoch and Jon Hupp's 1982 paper “The ‘Worm’ Programs—Early Experience with a Distributed Computation” (PDF, also HTML) discusses a fascinating series of distributed computation experiments carried out on a network of Altos at Xerox PARC. An earlier post discussed the Alto file system.

The Alto had memory to run only one program at a time and lacked virtual memory. It multitasked by swapping one program (and often much of the operating system) completely to disk and then loading a different one. Such a “world swap” took about two seconds. Because of this usage model, users tended to close their programs before going home, leaving the machine idle for other uses at night. Because they came out at night, the programs were sometimes called “vampire programs.”

Shoch and Hupp called their programs “worms” after the electronic tapeworm created in John Brunner's 1975 book The Shockwave Rider*, although the distributed computation ideas probably predate the book. (They lament that the phrase “distributed computing” “has already been co-opted by those who market fairly ordinary terminal systems,” although, at least in my experience, the phrase seems to have been restored to its original meaning.)

The paper discusses a handful of possible distributed computations, ranging from a simple “hello,world” to a multimachine animation computation. Perhaps the most interesting part is Section 5, which gives some of the history of multimachine programs on the Arpanet (the predecessor of the Internet). Of course, there are the network's routers themselves; maintenance of routing tables is almost certainly the longest-running distributed computation in history. The paper gives a few other examples, most notably a multimachine air traffic control simulation called McRoss (multicomputer route oriented simulation system). McRoss partitioned the air space among the computers running the simulations; particular planes were handed off from one computer to another as they flew around the air space.

The summary concludes:

This summary is probably not complete or fully accurate, but it is an impressive collection of distributed computations, produced within or on top of the Arpanet. Much of this work, however, was done in the early 70s; one participant recently commented, “It's hard for me to believe that this all happened seven years ago.” Since that time, we have not witnessed the anticipated blossoming of many distributed applications using the long-haul capabilities of the Arpanet.

In 2008, it's still hard to believe that this all happened 30 years ago. In the past decade, such distributed computations have really taken off (SETI@home started in 1999). Today, BOINC and other projects enlist the help of millions of computers to attack problems ranging from chess to malaria.


* The Shockwave Rider also contains this fantastically succinct introduction to the paradox of orbital mechanics:

You are in circular orbit around a planet. You are being overtaken by another object, also in circular orbit, moving several km./sec. faster. You accelerate to try and catch up.

See you later, accelerator.

Much later.

The idea makes repeated appearances. Other wordings include “Go faster in order to drop back to a lower orbit? Doesn't work. Drop back to a lower orbit; you go faster!” and “The way to go faster is to slow down.”

The Hideous Name

In 1985, email addresses looked a lot different than they do today. They were full of now-foreign symbols like : ! and %, because each mailer had its own syntax and there was no translation between syntaxes at borders, resulting in addresses like

research!ucbvax!@cmu-cs-pt.arpa:@CMU-ITC-LINUS:dave%CMU-ITC-LINUS@CMU-CS-PT.

Rob Pike and Peter Weinberger used that address as the opening example in their paper “The Hideous Name,” which examines good and bad examples of naming in both file systems and mail addresses. Name spaces, they argue, are most powerful when new semantics can be added without adding new syntax.

The mail examples in “The Hideous Name” are not particularly relevant anymore. Everyone has standardized on the two-level ARPANET @ syntax, and the other syntaxes are used only by people looking for open mail relays. But the file system examples are more relevant today than they were 20 years ago.

Unix file names are the canonical example: a program that parses file names like /etc/passwd will have no problem parsing names of ‘special’ files like /dev/stdin. The latter is special only in semantics, not in syntax: if you run cp /dev/stdin /tmp, cp will have no trouble deciding the name of the target file (/tmp/stdin). Contrast this with the convention of using to mean standard input: what target should cp – /tmp use? In fact (on Linux, at least) it looks for a file named rather than using standard input, illustrating another problem. Special syntax is often implemented by individual programs rather than the operating system, resulting in non-uniform handling of the names. It's hard to justify why (again, on Linux) cat – >/tmp/a means something different from cp – /tmp/a.

Providing uniform interpretation of names means having some central arbiter, tradtionally the kernel. Then all programs use the same name space, and one can apply old tools to new resources. For example, if other machines' file systems were mounted on /n/system/,

sed 's!:.*!!' /n/*/etc/passwd | sort -u 

produces a list of all the user of all the machines. And if backups are mounted on /dump,

ls -l /dump/am/2007/02*/etc/passwd

shows all the password files from last February. Special semantics, but no special syntax.

Better than changing the kernel for each file system is having a general mechanism by which the kernel can defer interpretation of particular file trees to other programs. Plan 9 was the first system to make extensive use of such user-level file servers. FUSE now makes it possible to do the same on modern Unixes. For example, Amit Singh has built some interesting new file systems for OS X using MacFUSE.

Pike and Weinberger end their paper with an observation from Doug McIlroy:

bad notations can stifle progress. Roman numerals hobbled mathematics for a millennium but were propagated by custom and by natural deference to authority. Today we no longer meekly accept individual authority. Instead, we have “standards,” impersonal imprimaturs on convention. Some standards are sound and indispensable; some simply celebrate bureaucratic littleness of mind. A harvest of gimmicks to save appearances within the standard has grown up, then gimmicks to save the appearances within the appearances. You know how each one got there: an overnight hack to paste another tumor onto a wild cancerous growth. The concern was with method, regardless of results. The result is extravagantly worse than Roman numerals: you can't read the notation right to left or left to right. As an amalgam of languages, it can't be deciphered by a native speaker of any one of them, much as if we were to switch at random places in a number between Roman and Arabic signs and between big-endian and little-endian order. But now that it all “works” — at least for the strong of stomach — the tumors themselves are being standardized.

What bad notations are stifling your progress?

Bourne Shell Macros

Steve Bourne* worked on an Algol 68 compiler at the Cambridge University Computer Lab before moving to Bell Labs, where he worked on Seventh Edition Unix. He is best known for writing V7's /bin/sh, now known as the Bourne shell.

Bourne's preference for Algol syntax is evident in shell scripts, which use Algol constructs like if ... then ... elif ... else ... fi and case ... esac instead of C equivalents that might be more familiar to Unix users. (The C shell and rc were both inspired, in part, by a desire for C-like syntax.)

Bourne's preference for Algol syntax is, surprisingly, also evident in the shell source. More comfortable with Algol but stuck with C, Bourne found a way to cope:

/usr/src/cmd/sh/mac.h:
#define IF      if(
#define THEN    ){
#define ELSE    } else {
#define ELIF    } else if (
#define FI      ;}

#define BEGIN   {
#define END     }
#define SWITCH  switch(
#define IN      ){
#define ENDSW   }
#define FOR     for(
#define WHILE   while(
#define DO      ){
#define OD      ;}
#define REP     do{
#define PER     }while(
#define DONE    );
#define LOOP    for(;;){
#define POOL    }

Using these macros, one can write programs that would certainly feel familiar to an Algol programmer:

/usr/src/cmd/sh/service.c:297:
LOCAL VOID      gsort(from,to)
        STRING          from[], to[];
{
        INT             k, m, n;
        REG INT         i, j;

        IF (n=to-from)<=1 THEN return FI

        FOR j=1; j<=n; j*=2 DONE

        FOR m=2*j-1; m/=2;
        DO  k=n-m;
            FOR j=0; j=0; i-=m
                DO  REG STRING *fromi; fromi = &from[i];
                    IF cf(fromi[m],fromi[0])>0
                    THEN break;
                    ELSE STRING s; s=fromi[m]; fromi[m]=fromi[0]; fromi[0]=s;
                    FI
                OD
            OD
        OD
}

(It's not clear why Bourne used braces around function bodies instead of BEGIN and END.)

The Bourne shell code, along with the 4.2BSD finger implementation, was the impetus for the International Obfuscated C Code Contest.

* Really, Stephen R. Bourne, but even then not quite specific enough. Years ago, I heard a story that at one point there were two Stephen R. Bournes working at SGI. Callers would ask the operator for Steve Bourne and would be asked, ‘which one?’ ‘Steven R. Bourne,’ they'd say. ‘Sure, which one?’ The ambiguity was apparently resolved by referring to them as ‘hardware Steve’ and ‘software Steve.’ I can find no justification for this story nor any trace of ‘hardware Steve’ other than the mention of two Steven R. Bournes in the sendmail README, which claims that they were both at Bell Labs, not SGI. If you can confirm or deny this story, please post a comment or mail me (rsc@swtch.com).

Traveling Passenger Problems

(Today's late post is brought to you by Continental Airlines and the letters E, W, and R.)

ITA Software's flagship product, QPX, is an airline ticket search engine. You tell it where you want to fly and when, and it comes back with possible trips at many different prices. QPX powers Orbitz, CheapTickets, Hotwire, and a half dozen airline company sites.

The airlines want to charge as much as possible for each ticket they sell, but still fill each plane. But the airlines can't just ask possible customers how much money they have to spend, and they also can't explicitly set different prices based on who is buying the ticket. So they attach restrictions to less expensive fares, hoping that business travelers won't accept the restrictions and will pay for the more expensive tickets. The most well-known example is that trips that include a Saturday night are cheaper. There are also changes in ticket prices based on the rest of the trip: a one-way ticket from Boston to Newark might cost $349, but a one-way ticket from Boston to Raleigh via the very same Newark flight only $145*. Of course, if you make a habit of buying multi-leg tickets and skipping the connecting flights, the airlines might blacklist you (or so a travel agent once told me).

These restrictions make QPX's job harder than you might think. In fact, the general case is undecidable!

ITA's Chief Scientist, Carl de Marcken, gives a talk titled “Computational Complexity of Airline Travel&rdquo (HTML, also PDF), in which he proves that:

1. Even if the airlines publish only a single fare (with every ticket a single one way PU), and all the airports in a passenger's itinerary are fixed, so that the only remaining choice is what flights to use between the consecutive airports, deciding whether there is a valid ticket is NP-hard.

2. Fixing the flights and priceable units, but leaving open the choice of fares to pay for each flight, deciding whether there is a valid ticket is NP-hard.

3. Removing bounds on the size of solutions, the general question of whether there is a ticket for a query is EXPSPACE-hard. That is, air travel planning is at least as hard (it might be harder) as deciding whether a computer program that can use space exponentially bigger than the input halts. EXPSPACE-hard problems are (thought to be) much harder than NP-complete problems like the traveling salesman problem, and even much harder than PSPACE-complete problems like playing games optimally. There is no practical hope that computers will ever be able to solve EXPSPACE-hard problems perfectly, even if quantum computing becomes a reality.

4. The final result shows that just finding out whether there is a valid solution for a query is actually harder than EXPSPACE-complete: it is unsolvable (undecidable). The question of whether a valid ticket exists can not be solved for all databases and all queries no matter how long a computer thinks. However the full proof of this result is considerably more complex than the EXPSPACE-hard proof without offering any greater understanding of the problem.

The NP-hardness proofs reduce 3-color, the EXPSPACE-completeness proof encodes a Turing machine in ticket restrictions, and the undecidability proof encodes Diophantine equations. Here's a 3x3-bit binary multiplier. (Be sure to read the early slides too, which give necessary details about ticket pricing.)


* As of February 13, Continental flights on March 12, 2008 had those prices. Actual fares may no longer match.

Absolute File System Design

Butler Lampson and Robert Sproull's 1979 SOSP paper “An Open Operating System for a Single-User Machine” (PDF; also HTML) describes the operating system on the Alto, a groundbreaking personal networked workstation built at Xerox PARC in the late 1970s and early 1980s. Many Alto features became everyday aspects of modern computing environments, but one that didn't catch on was the Alto's robust file system.

The Alto file system was designed to withstand disk or programming errors. If some set of disk blocks were lost, the Alto file system would only lose the data contained in those sectors. If you lose a data block from a file, you lose just that data block. If you lose a block in a directory, you lose the name for that file but not the file itself.

Section 3 of the paper gives the details. The basic technique is that each disk block contains a small header giving the id number and version of the associated file along with the block's offset in the file and the amount of data in the block (the last block in a file might contain less than a block worth of data). Lampson and Sproull refer to this information as absolute: it is the single point of truth for the block in question. A disk scavenger (like Unix's fsck) can scan the entire file system reading block labels to reconstruct the contents of any file.

Scanning the entire file system to find block contents is of course very slow. The block label also contains pointers to the previous and next block in a file, so that reading a file can just follow the block pointers (the file system layout algorithm arranged that much of the time files would be laid out sequentially so that following these pointers would not be too expensive in terms of disk seeks). These pointers are called hints, because they exist only for performance, not for correctness. If the hints are incorrect, that fact will become clear when they are used (the label of the pointed at block will not agree with expectations), and the scavenger can be invoked to fix the file system.

Contrast this scheme to inodes in a modern Unix file system (say, BSD FFS or Linux Ext2/3). If you lose the 512-byte disk block containing a 1GB file's inode, the entire file is inaccessible. No amount of scavenging can help, because the list of blocks making up the file was only contained in the inode, and the blocks themselves are not tagged with which file they belong to. This is much less robust than the Alto file system. Of course, the situation is even worse if you lose the 512-byte disk block containing the file system superblock: then you've lost the entire file system. Because the superblock is so crucial, file systems usually maintain backup copies of the superblock scattered across the disk. Inodes are almost as important but not so carefully guarded.

Why don't modern file systems don't use labels like the Alto did?

Disk geometry might play a role: it is hard to reconcile a (say) 32 byte label with (say) a 4096 byte data block. You put the label in the data block, cutting the actual data size to 4064 bytes. You could put the label in the sector before the block, taking up 512 bytes for a 32 byte label. Or you could write a label sector containing 16 labels every 16 data blocks. The third is probably the best option, but all of them are somewhat cumbersome. (But you'd lose less data!)

Another reason is historical: the original Unix file system didn't have any such mechanism, and FFS and Ext2/3 differ little from the original Unix file system.

Another reason might be philosophical: you're supposed to handle disk failures with backups, not clever file system tricks. This is the least useful argument: who wouldn't want a file system that needed to resort to backups less often?

Another reason might be pragmatic: maybe disks fail in different ways now than they did in 1979. (Of course, that doesn't explain the original Unix file system or FFS.) Since there's so little data on disk failures, it's hard to say anything for certain.

Whatever the arguments against labels, it is undeniable that they produced a very robust system. In fact, the operating system on the Alto could be almost completely swapped out an replaced with user programs, each of which could supply its own file system implementation. Over time, there came to be three file system implementations sharing the disk, one in BCPL, one in Mesa, and one in Smalltalk. In a retrospective paper, Lampson wrote:

If a disk address is wrong, the label check will detect the error, and various recovery mechanisms can be invoked. A Scavenger program, written by Jim Morris, takes about a minute to check or restore the consistency of an Alto file system; it can be routinely run by nonprofessional users. As a result of this conservative design, the file system has proved to be very reliably; loss of any information other than bits physically damaged on the disk is essentially unheard of. This reliability is achieved in spite of the fact that many programs besides the standard operating system have access to the disk.

Who wouldn't want a robust system like that? And why aren't more of today's file systems that robust?

B Languages

I enjoy reading about programming languages of the past. (I would probably enjoy reading about programming languages of the future too, but those are harder to find.) Today's links are a glimpse back into the programming world before types.

The predecessor of the C programming language was, of course, Ken Thompson's B, which was in turn influenced by Martin Richards's BCPL.*

Thompson's 1972 technical memorandum “Users' Reference to B” (PDF, 1MB; also HTML) describes B in detail. The most striking part about B is how similar it is to C. The fundamental difference is the lack of types, or rather the single type: the machine word. As Thompson puts it, “There is no check to insure that there are no type mismatches. Similarly, there are no wanted, or unwanted, type conversions.” To me, the lack of types gives B a flavor distinctly different from C. Having grown up programming with types, being able to dereference any value at all seems like not wearing a seat belt.

The BCPL manual is also online, courtesy of Dennis Ritchie. It is easy to see B as BCPL filtered through Thompson's minimalist sensibilities. In a 1989 interview, Thompson recounted, “[B] was very small, very clean. It was the same language as BCPL, it looked completely different, syntactically it was, you know, a redo. The semantics was exactly the same as BCPL. And in fact the syntax of it was, if you looked at, you didn’t look too close, you would say it was C. Because in fact it was C, without types.”

The B memo also contains other interesting trivia. Thompson defines the newfangled while loop in terms of goto, an indicator of just how uncommon ‘structured’ programming was at the time. The memo also contains what is probably the first published implementation of printf.

In a language whose only type is the machine word, character processing is particularly cumbersome. B included primitives char and lchar to fetch and store characters from character strings (vectors). On his BCPL history page linked above, Ritchie says that “C was invented in part to provide a plausible way of dealing with characters when one begins with a word-oriented language.”

B had the full suite of the =op operators (what became op= in modern C), not just for arithmetical operators like =+ and =- but also for comparison operators: ===, =!=, =<, =<=, =>, and =>=. Unfortunately, the end of the manual notes that none of these comparison operators were actually implemented, nor were =/ and =&.

Brian Kernighan's 1973 B tutorial (PDF; also HTML) contains what is probably the very first “hello, world” program. (See page 4 of the PDF.)


* There is an old story that at Bell Labs there were arguments over whether the successor to C should be called D (following the alphabet) or P (following the letters in BCPL) and that Bjarne Stroustroup ended the argument by choosing the name C++ instead. In Larry Wall's 1999 talk “Perl, the first postmodern computer language,” he said that Perl was the true successor to C, having taken both of the remaining letters from BCPL as its conventional script extension.

Traffic Lights and Buridan's Ass

Buridan's ass is a donkey in a thought experiment proposed by French philosopher Jean Buridan. Placed exactly halfway between two bales of hay, the purely rational, deterministic donkey starves to death trying to decide which to eat: there is no reason to prefer one bale over the other.

Arbiters, electronic circuits that decide whether a particular voltage is a binary 0 or 1, have the same problem: voltages in the middle can force arbitrarily long decision times. This is known as the arbiter problem or glitch phenomenon.

Leslie Lamport observed that the arbiter problem also explains human indecisiveness at critical moments. His paper, “Buridan's Principle,” attempts to make the phenomenon known outside of computer science. Lamport explains the paper's fate:

I have observed that the arbiter problem occurs in daily life. Perhaps the most common example is when I find myself unable to decide for a fraction of a second whether to stop for a traffic light that just turned yellow or to go through. I suspect that it is actually a cause of serious accidents, and that people do drive into telephone poles because they can't decide in time whether to go to the left or the right.

A little research revealed that psychologists are totally unaware of the phenomenon. I found one paper in the psychology literature on the time taken by subjects to choose between two alternatives based on how nearly equal they were. The author's theoretical calculation yielded a formula with a singularity at zero, as there should be. He compared the experimental data with this theoretical curve, and the fit was perfect. He then drew, as the curve fitting the data, a bounded continuous graph. The singularity at zero was never mentioned in the paper.

I feel that the arbiter problem is important and should be made known to scientists outside the field of computing. So I wrote this paper, which describes the problem in its classical formulation as the problem of Buridan's ass—an ass that starves to death because it is placed equidistant between two bales of hay and has no reason to prefer one to the other. Philosophers have discussed Buridan's ass for centuries, but it apparently never occurred to any of them that the planet is not littered with dead asses only because the probability of the ass being in just the right spot is infinitesimal.

So, I wrote this paper for the general scientific community. I probably could have published it in some computer journal, but that wasn't the point. I submitted it first to Science. The four reviews ranged from “This well-written paper is of major philosophical importance” to “This may be an elaborate joke.” One of the other reviews was more mildly positive, and the fourth said simply “My feeling is that it is rather superficial.” The paper was rejected.

Some time later, I submitted the paper to Nature. I don't like the idea of sending the same paper to different journals hoping that someone will publish it, and I rarely resubmit a rejected paper elsewhere. So, I said in my submission letter that it had been rejected by Science. The editor read the paper and sent me some objections. I answered his objections, which were based on reasonable misunderstandings of the paper. In fact, they made me realize that I should explain things differently for a more general audience. He then replied with further objections of a similar nature. Throughout this exchange, I wasn't sure if he was taking the matter seriously or if he thought I was some sort of crank. So, after answering his next round of objections, I wrote that I would be happy to revise the paper in light of this discussion if he would then send it out for review, but that I didn't want to continue this private correspondence. The next letter I received was from another Nature editor saying that the first editor had been reassigned and that he was taking over my paper. He then raised some objections to the paper that were essentially the same as the ones raised initially by the first editor. At that point, I gave up in disgust.

I still think that this paper is worth publishing for a general scientific audience. Among other things, it has a nice analysis of a quantum-mechanical arbiter. However, I have no idea where to publish it.

My problems in trying to publish this paper are part of a long tradition. According to one story I've heard (but haven't verified), someone at G. E. discovered the phenomenon in computer circuits in the early 60s, but was unable to convince his managers that there was a problem. He published a short note about it, for which he was fired. Charles Molnar, one of the pioneers in the study of the problem, reported the following in a lecture given on February 11, 1992, at HP Corporate Engineering in Palo Alto, California:

One reviewer made a marvelous comment in rejecting one of the early papers, saying that if this problem really existed it would be so important that everybody knowledgeable in the field would have to know about it, and “I'm an expert and I don't know about it, so therefore it must not exist.”

Lamport's publications page is full of interesting papers.

The Discovery of Debugging

Debugging was a surprise. When the early computer pioneers built the first programmable computers, they assumed that writing programs would be straightforward: think hard, write the program, done.

Maurice Wilkes, creator of the EDSAC, the first stored-program computer, wrote what might be the first programming textbook in 1951, with David Wheeler (inventor of the subroutine call!) and Stanley Gill. It warns: “Experience has shown that such mistakes are much more difficult to avoid than might be expected. It is, in fact, rare for a program to work correctly the first time it is tried, and often several attempts must be made before all errors are eliminated.”

In his memoir, Wilkes recalled the exact moment he realized the importance of debugging: “By June 1949, people had begun to realize that it was not so easy to get a program right as had at one time appeared. It was on one of my journeys between the EDSAC room and the punching equipment that the realization came over me with full force that a good part of the remainder of my life was going to be spent in finding errors in my own programs.”

Brian Hayes's excellent 1993 essay “The Discovery of Debugging” (skip to page 32) describes the history in detail.

Martin Campbell-Kelly's 1992 essay “The Airy Tape: An Early Chapter in the History of Debugging” (subscription required) gives more details on one of the stories in Hayes's article: the successful debugging, after 41 years, of the first hard bug. The bug? A single-precision floating-point value used in a double-precision context.

Hayes's essay itself contains a bug worth mentioning. It says that the first three programs run on the EDSAC ran correctly the first time, an understandable but incorrect inference from Campbell-Kelly's article. In fact the second program ever run, which merely printed primes, was buggy. The EDSAC log for May 7, 1949 reads “Table of primes attempted — programme incorrect.” A correct primes program didn't run until two days later.

Unix Viruses

Disk-based computer viruses came of age with MS-DOS; network-based viruses came of age with Microsoft Windows. (The most stunning recent example is the widespread Storm worm which is at this very minute probably sending you pump-and-dump and greeting card spam.)

Most Linux and OS X users have a frustratingly smug attitude of “we're immune to viruses, because we use Unix.” This is obviously false: the original Internet worm ran on Unix. The simple security mechanisms in Unix-based systems do raise the virus-writing bar slightly, but a more likely explanation of the lack of viruses on Unix-based systems is their lack of current market share: Windows users are simply a much bigger and therefore more attractive target.

Especially for these smug Unix users, it should be sobering to read Tom Duff's 1987 technical report “Viral Attacks on UNIX® System Security,” which describes a simple, benign disk-based virus targeted at Unix executables. Even working under the constraints of the Unix permission bits, Duff managed to infect 466 files across 46 systems over a period of a few months, including an experimental “secure” version of Unix (to its credit, the kernel on that system did detect the virus). Duff's paper also describes a simple virus written in shell script. He cautions:

However sorely you are tempted, do not run this code. It got loose on my machine while being debugged for inclusion in this paper, and within an hour had infected about 140 files, at which time several copies were energetically seeking other files to infect, running the machine's load average, normally between .05 and 1.25, up to about 17. I had to stop the machine in the middle of a work day and spend three hours scouring the disks, earning the ire of ten or so co-workers. I feel extremely fortunate that it did not escape onto the Datakit network.

Doug McIlroy's 1989 paper “Virology 101” (gzipped PostScript) carries the shell script virus farther, developing progressively more virulent strains. McIlroy summarizes his paper:

There is nothing mysterious about computer viruses. A working, but easily observable, virus can be written in a few lines of code. Although particular virus attacks may be guarded against, no general defense within one domain of reference is possible; viruses are a natural consequence of stored-program computation. Like other hazards of technology, their thread may be mitigated by cautious behavior and community sanctions.

(Finally, of course, there is Thompson's “Reflections on Trusting Trust,” but that's a discussion for another day.)