Showing posts with label history. Show all posts
Showing posts with label history. Show all posts

[ next | prev | up ]

Mel was Real

Things have been pretty crazy around here, so no new posts recently, but this caught my attention.

James Grimmelmann reports a few interesting facts about the Librascope/Royal McBee LGP-30. Specifically, Mel was real, and the hexadecimal digits were 0 1 2 3 4 5 6 7 8 9 F G J K Q W!

I can't resist telling a story. James is now a professor at the New York Law School, but he and I were fellow computer science students in college. He was a TA for the undergraduate introductory theory course when I took it, and his duties included, among other more enjoyable tasks, grading the first problem set. The “challenge” problem asked for a proof that in any group of six people, there is either a group of three that all know each other (a clique) or a group of three that all don't know each other (an anti-clique). I didn't see how to prove it*, so I just wrote a simple C program to iterate over all 215 possibilities and verify that the condition holds. I wrote something about using brute force and attached the C program to the problem set. This problem set was just a warm-up in proof techniques, but the course is all about what can and can't be computed by various systems. The pinnacle result, of course, is Turing's proof that some problems are uncomputable and Cook's definition of NP-completeness. In that light, my brute force proof ran against everything the course was trying to teach.

James was the grader for that question. Next to the comment about brute force, he wrote “Wait until the NP-completeness unit, Mr. Brute Force” At the end, he added, “Very, very clever. Such tricks will not avail you much where this course is headed.” I didn't appreciate it then, but in hindsight I find it very funny.

* Truth be told, I still don't see how to prove it. You're supposed to use the pigeonhole principle, and the details have been explained to me in the past, but I can never reconstruct them on demand. The exhaustive search still feels easier, though perhaps less enlightening.

Bi-quinary and other bases

Doug McIlroy's talk on History of Computing at Bell Labs mentioned that George Stibitz's relay-based decimal calculators that stored digits in what he called bi-quinary: “Arithmetic was done in bi-quinary, a two out of five representation for decimal integers, and if there weren't exactly two out of five relays activated it would stop.” This allowed a sequence of computing jobs to run unattended over the weekend: if the computer detected an error (not exactly two relays active in some digit), then it would stop the current job and start over.

McIlroy's description of the number system as being 5 bits differs from any of the representations given by the Wikipedia entries for bi-quinary coded decimal and biquinary, which describe a system that is basically unary: 5 bits to count 0 through 4 (only 1 bit on at a time) and two top bits: 01 = 0, 10 = 5. Other historical summaries you can find online by searching for Stibitz and bi-quinary seem to back up Wikipedia. In this version, bi-quinary was 1-of-2 plus 1-of-5, not 2-of-5. In the 7-bit version, the bit positions are worth 5, 0, 4, 3, 2, 1, and 0, which isn't very interesting. It's far more interesting to think about the 2-out-of-5 system.

The valid codes in a 2-of-5 system like McIlroy described would be 00011, 00101, 00110, 01001, 01010, 01100, 10001, 10010, 10100, 11000. If you assign these the values 0 through 9, you can set up a system of linear equations and solve them to find the value of each of the five bit positions b4, b3, b2, b1, and b0. The fact that 00011 = 0 means that b1 + b0 = 0. The fact that 00110 - 00101 = 2 - 1 = 1 means that b1 - b0 = 1. These two equalities imply b1 = ½ and b0 = -½. Given that you can pick out the other bits fairly easily: from right to left the "places" are -½, ½, 1½, 3½, and 6½. This works for 0 through 8 but then, sadly, fails for 9: 11000 = 6½ + 3½ = 10. No other code assignment has a solution either (I wrote a small program to try them all.)

I guess that's why the 1-of-2 plus 1-of-5 form was more popular than 2-of-5: it was probably easier to design the arithmetic circuits if each bit could be assigned a particular value. It's too bad, because 2-of-5 would have mathematical elegance on its side if it could be made to work.

If you dig around, you can find other esoteric number representations: negabinary has base -2 (negative two), so the positional values are 1, -2, 4, -8, and so on. Thus 5 is 101 but 3 is 111 = 1-2+4. Negabinary appears to have been first invented by Vittorio Grunwald in 1885. It was used in a few experimental computers in the 1950s, but it didn't catch on. Even weirder than negabinary numbers are the quater-imaginary numbers, invented by a high school student named Donald Knuth in 1955. Those use base 2i (where i is the imaginary square root of -1). (Knuth is also responsible for the negabinary history.)

Odder still are balanced ternary (ternary but digits are -1, 0, and 1) and Bubble Babble, but no discussion of interesting counting systems or high school math nerds would be complete without mentioning counting on your fingers in binary.

Computing History at Bell Labs

In 1997, on his retirement from Bell Labs, Doug McIlroy gave a fascinating talk about the “History of Computing at Bell Labs.” That page contains audio for the talk in Real Audio format (it was 1997). Almost ten years ago I transcribed the audio but never did anything with it. The transcript is below.

My favorite parts of the talk are the description of the bi-quinary decimal relay calculator and the description of a team that spent over a year tracking down a race condition bug in a missile detector (reliability was king: today you'd just stamp “cannot reproduce” and send the report back). But the whole thing contains many fantastic stories. It's well worth the read or listen. I also like his recollection of programming using cards: “It's the kind of thing you can be nostalgic about, but it wasn't actually fun.”

For more information, Bernard D. Holbrook and W. Stanley Brown's 1982 technical report “A History of Computing Research at Bell Laboratories (1937-1975)” covers the earlier history in more detail.



Transcript of “History of Computing at Bell Labs:

Computing at Bell Labs is certainly an outgrowth of the mathematics department, which grew from that first hiring in 1897, G A Campbell. When Bell Labs was formally founded in 1925, what it had been was the engineering department of Western Electric. When it was formally founded in 1925, almost from the beginning there was a math department with Thornton Fry as the department head, and if you look at some of Fry's work, it turns out that he was fussing around in 1929 with trying to discover information theory. It didn't actually gel until twenty years later with Shannon.

1:10 Of course, most of the mathematics at that time was continuous. One was interested in analyzing circuits and propagation. And indeed, this is what led to the growth of computing in Bell Laboratories. The computations could not all be done symbolically. There were not closed form solutions. There was lots of numerical computation done. The math department had a fair stable of computers, which in those days meant people. [laughter]

2:00 And in the late '30s, George Stibitz had an idea that some of the work that they were doing on hand calculators might be automated by using some of the equipment that the Bell System was installing in central offices, namely relay circuits. He went home, and on his kitchen table, he built out of relays a binary arithmetic circuit. He decided that binary was really the right way to compute. However, when he finally came to build some equipment, he determined that binary to decimal conversion and decimal to binary conversion was a drag, and he didn't want to put it in the equipment, and so he finally built in 1939, a relay calculator that worked in decimal, and it worked in complex arithmetic. Do you have a hand calculator now that does complex arithmetic? Ten-digit, I believe, complex computations: add, subtract, multiply, and divide. The I/O equipment was teletypes, so essentially all the stuff to make such machines out of was there. Since the I/O was teletypes, it could be remotely accessed, and there were in fact four stations in the West Street Laboratories of Bell Labs. West Street is down on the left side of Manhattan. I had the good fortune to work there one summer, right next to a district where you're likely to get bowled over by rolling ?beads? hanging from racks or tumbling ?cabbages?. The building is still there. It's called Westbeth Apartments. It's now an artist's colony.

4:29 Anyway, in West Street, there were four separate remote stations from which the complex calculator could be accessed. It was not time sharing. You actually reserved your time on the machine, and only one of the four terminals worked at a time. In 1940, this machine was shown off to the world at the AMS annual convention, which happened to be held in Hanover at Dartmouth that year, and mathematicians could wonder at remote computing, doing computation on an electromechanical calculator at 300 miles away.

5:22 Stibitz went on from there to make a whole series of relay machines. Many of them were made for the government during the war. They were named, imaginatively, Mark I through Mark VI. I have read some of his patents. They're kind of fun. One is a patent on conditional transfer. [laughter] And how do you do a conditional transfer? Well these gadgets were, the relay calculator was run from your fingers, I mean the complex calculator. The later calculators, of course, if your fingers were a teletype, you could perfectly well feed a paper tape in, because that was standard practice. And these later machines were intended really to be run more from paper tape. And the conditional transfer was this: you had two teletypes, and there's a code that says "time to read from the other teletype". Loops were of course easy to do. You take paper and [laughter; presumably Doug curled a piece of paper to form a physical loop]. These machines never got to the point of having stored programs. But they got quite big. I saw, one of them was here in 1954, and I did see it, behind glass, and if you've ever seen these machines in the, there's one in the Franklin Institute in Philadelphia, and there's one in the Science Museum in San Jose, you know these machines that drop balls that go wandering sliding around and turning battle wheels and ringing bells and who knows what. It kind of looked like that. It was a very quiet room, with just a little clicking of relays, which is what a central office used to be like. It was the one air-conditioned room in Murray Hill, I think. This machine ran, the Mark VI, well I think that was the Mark V, the Mark VI actually went to Aberdeen. This machine ran for a good number of years, probably six, eight. And it is said that it never made an undetected error. [laughter]

8:30 What that means is that it never made an error that it did not diagnose itself and stop. Relay technology was very very defensive. The telephone switching system had to work. It was full of self-checking, and so were the calculators, so were the calculators that Stibitz made.

9:04 Arithmetic was done in bi-quinary, a two out of five representation for decimal integers, and if there weren't exactly two out of five relays activated it would stop. This machine ran unattended over the weekends. People would bring their tapes in, and the operator would paste everybody's tapes together. There was a beginning of job code on the tape and there was also a time indicator. If the machine ran out of time, it automatically stopped and went to the next job. If the machine caught itself in an error, it backed up to the current job and tried it again. They would load this machine on Friday night, and on Monday morning, all the tapes, all the entries would be available on output tapes.

Question: I take it they were using a different representation for loops and conditionals by then.

Doug: Loops were done actually by they would run back and forth across the tape now, on this machine.

10:40 Then came the transistor in '48. At Whippany, they actually had a transistorized computer, which was a respectable minicomputer, a box about this big, running in 1954, it ran from 1954 to 1956 solidly as a test run. The notion was that this computer might fly in an airplane. And during that two-year test run, one diode failed. In 1957, this machine called TRADIC, did in fact fly in an airplane, but to the best of my knowledge, that machine was a demonstration machine. It didn't turn into a production machine. About that time, we started buying commercial machines. It's wonderful to think about the set of different architectures that existed in that time. The first machine we got was called a CPC from IBM. And all it was was a big accounting machine with a very special plugboard on the side that provided an interpreter for doing ten-digit decimal arithmetic, including opcodes for the trig functions and square root.

12:30 It was also not a computer as we know it today, because it wasn't stored program, it had twenty-four memory locations as I recall, and it took its program instead of from tapes, from cards. This was not a total advantage. A tape didn't get into trouble if you dropped it on the floor. [laughter]. CPC, the operator would stand in front of it, and there, you would go through loops by taking cards out, it took human intervention, to take the cards out of the output of the card reader and put them in the ?top?. I actually ran some programs on the CPC ?...?. It's the kind of thing you can be nostalgic about, but it wasn't actually fun. [laughter]

13:30 The next machine was an IBM 650, and here, this was a stored program, with the memory being on drum. There was no operating system for it. It came with a manual: this is what the machine does. And Michael Wolontis made an interpreter called the L1 interpreter for this machine, so you could actually program in, the manual told you how to program in binary, and L1 allowed you to give something like 10 for add and 9 for subtract, and program in decimal instead. And of course that machine required interesting optimization, because it was a nice thing if the next program step were stored somewhere -- each program step had the address of the following step in it, and you would try to locate them around the drum so to minimize latency. So there were all kinds of optimizers around, but I don't think Bell Labs made ?...? based on this called "soap" from Carnegie Mellon. That machine didn't last very long. Fortunately, a machine with core memory came out from IBM in about '56, the 704. Bell Labs was a little slow in getting one, in '58. Again, the machine came without an operating system. In fact, but it did have Fortran, which really changed the world. It suddenly made it easy to write programs. But the way Fortran came from IBM, it came with a thing called the Fortran Stop Book. This was a list of what happened, a diagnostic would execute the halt instruction, the operator would go read the panel lights and discover where the machine had stopped, you would then go look up in the stop book what that meant. Bell Labs, with George Mealy and Glenn Hanson, made an operating system, and one of the things they did was to bring the stop book to heel. They took the compiler, replaced all the stop instructions with jumps to somewhere, and allowed the program instead of stopping to go on to the next trial. By the time I arrived at Bell Labs in 1958, this thing was running nicely.

16:36 Bell Labs continued to be a major player in operating systems. This was called BESYS. BE was the share abbreviation for Bell Labs. Each company that belonged to Share, which was the IBM users group, ahd a two letter abbreviation. It's hard to imagine taking all the computer users now and giving them a two-letter abbreviation. BESYS went through many generations, up to BESYS 5, I believe. Each one with innovations. IBM delivered a machine, the 7090, in 1960. This machine had interrupts in it, but IBM didn't use them. But BESYS did. And that sent IBM back to the drawing board to make it work. [Laughter]

17:48 Rob Pike: It also didn't have memory protection.

Doug: It didn't have memory protection either, and a lot of people actually got IBM to put memory protection in the 7090, so that one could leave the operating system resident in the presence of a wild program, an idea that the PC didn't discover until, last year or something like that. [laughter]

Big players then, Dick Hamming, a name that I'm sure everybody knows, was sort of the numerical analysis guru, and a seer. He liked to make outrageous predictions. He predicted in 1960, that half of Bell Labs was going to be busy doing something with computers eventually. ?...? exaggerating some ?...? abstract in his thought. He was wrong. Half was a gross underestimate. Dick Hamming retired twenty years ago, and just this June he completed his full twenty years term in the Navy, which entitles him again to retire from the Naval Postgraduate Institute in Monterey. Stibitz, incidentally died, I think within the last year. He was doing medical instrumentation at Dartmouth essentially, near the end.

20:00 Various problems intrigued, besides the numerical problems, which in fact were stock in trade, and were the real justification for buying machines, until at least the '70s I would say. But some non-numerical problems had begun to tickle the palette of the math department. Even G A Campbell got interested in graph theory, the reason being he wanted to think of all the possible ways you could take the three wires and the various parts of the telephone and connect them together, and try permutations to see what you could do about reducing side ?...? by putting things into the various parts of the circuit, and devised every possibly way of connecting the telephone up. And that was sort of the beginning of combinatorics at Bell Labs. John Reardon, a mathematician parlayed this into a major subject. Two problems which are now deemed as computing problems, have intrigued the math department for a very long time, and those are the minimum spanning tree problem, and the wonderfully ?comment about Joe Kruskal, laughter?

21:50 And in the 50s Bob Prim and Kruskal, who I don't think worked on the Labs at that point, invented algorithms for the minimum spanning tree. Somehow or other, computer scientists usually learn these algorithms, one of the two at least, as Dijkstra's algorithm, but he was a latecomer.

Another pet was the traveling salesman. There's been a long list of people at Bell Labs who played with that: Shen Lin and Ron Graham and David Johnson and dozens more, oh and ?...?. And then another problem is the Steiner minimum spanning tree, where you're allowed to add points to the graph. Every one of these problems grew, actually had a justification in telephone billing. One jurisdiction or another would specify that the way you bill for a private line network was in one jurisdiction by the minimum spanning tree. In another jurisdiction, by the traveling salesman route. NP-completeness wasn't a word in the vocabulary of ?...? [laughter]. And the Steiner problem came up because customers discovered they could beat the system by inventing offices in the middle of Tennessee that had nothing to do with their business, but they could put the office at a Steiner point and reduce their phone bill by adding to what the service that the Bell System had to give them. So all of these problems actually had some justification in billing besides the fun.

24:15 Come the 60s, we actually started to hire people for computing per se. I was perhaps the third person who was hired with a Ph.D. to help take care of the computers and I'm told that the then director and head of the math department, Henry Boden, had said to his people, "yeah, you can hire this guy, instead of a real mathematician, but what's he gonna be doing in five years?" [laughter]

25:02 Nevertheless, we started hiring for real in about '67. Computer science got split off from the math department. I had the good fortune to move into the office that I've been in ever since then. Computing began to make, get a personality of its own. One of the interesting people that came to Bell Labs for a while was Hal Long. Is his name well known? [Pause] One nod. Hal Long was a philosopher and logician, and we got a letter from him in England out of the blue saying "hey you know, can I come and use your computers? I have an idea about theorem proving." There was theorem proving in the air in the late 50s, and it was mostly pretty thin stuff. Obvious that the methods being proposed wouldn't possibly do anything more difficult than solve tic-tac-toe problems by enumeration. Long had a notion that he could mechanically prove theorems in the style of Whitehead and Russell's great treatise Principia Mathematica in the early patr of the century. He came here, learned how to program in machine language, and took all of Volume I of Principia Mathematica -- if you've ever hefted Principia, well that's about all it's good for, it's a real good door stop. It's really big. But it's theorem after theorem after theorem in propositional calculus. Of course, there's a decision procedure for propositional calculus, but he was proving them more in the style of Whitehead and Russell. And when he finally got them all coded and put them into the computer, he proved the entire contents of this immense book in eight minutes. This was actually a neat accomplishment. Also that was the beginning of all the language theory. We hired people like Al Aho and Jeff Ullman, who probed around every possible model of grammars, syntax, and all of the things that are now in the standard undergraduate curriculum, were pretty well nailed down here, on syntax and finite state machines and so on were pretty well nailed down in the 60s. Speaking of finite state machines, in the 50s, both Mealy and Moore, who have two of the well-known models of finite state machines, were here.

28:40 During the 60s, we undertook an enormous development project in the guise of research, which was MULTICS, and it was the notion of MULTICS was computing was the public utility of the future. Machines were very expensive, and ?indeed? like you don't own your own electric generator, you rely on the power company to do generation for you, and it was seen that this was a good way to do computing -- time sharing -- and it was also recognized that shared data was a very good thing. MIT pioneered this and Bell Labs joined in on the MULTICS project, and this occupied five years of system programming effort, until Bell Labs pulled out, because it turned out that MULTICS was too ambitious for the hardware at the time, and also with 80 people on it was not exactly a research project. But, that led to various people who were on the project, in particular Ken Thompson -- right there -- to think about how to -- Dennis Ritchie and Rudd Canaday were in on this too -- to think about how you might make a pleasant operating system with a little less resources.

30:30 And Ken found -- this is a story that's often been told, so I won't go into very much of unix -- Ken found an old machine cast off in the corner, the PDP-7, and put up this little operating system on it, and we had immense GE635 available at the comp center at the time, and I remember as the department head, muscling in to use this little computer to be, to get to be Unix's first user, customer, because it was so much pleasanter to use this tiny machine than it was to use the big and capable machine in the comp center. And of course the rest of the story is known to everybody and has affected all college campuses in the country.

31:33 Along with the operating system work, there was a fair amount of language work done at Bell Labs. Often curious off-beat languages. One of my favorites was called Blodi, B L O D I, a block diagram compiler by Kelly and Vyssotsky. Perhaps the most interesting early uses of computers in the sense of being unexpected, were those that came from the acoustics research department, and what the Blodi compiler was invented in the acoustic research department for doing digital simulations of sample data system. DSPs are classic sample data systems, where instead of passing analog signals around, you pass around streams of numerical values. And Blodi allowed you to say here's a delay unit, here's an amplifier, here's an adder, the standard piece parts for a sample data system, and each one was described on a card, and with description of what it's wired to. It was then compiled into one enormous single straight line loop for one time step. Of course, you had to rearrange the code because some one part of the sample data system would feed another and produce really very efficient 7090 code for simulating sample data systems. By in large, from that time forth, the acoustic department stopped making hardware. It was much easier to do signal processing digitally than previous ways that had been analog. Blodi had an interesting property. It was the only programming language I know where -- this is not my original observation, Vyssotsky said -- where you could take the deck of cards, throw it up the stairs, and pick them up at the bottom of the stairs, feed them into the computer again, and get the same program out. Blodi had two, aside from syntax diagnostics, it did have one diagnostic when it would fail to compile, and that was "somewhere in your system is a loop that consists of all delays or has no delays" and you can imagine how they handled that.

35:09 Another interesting programming language of the 60s was Ken Knowlten's Beflix. This was for making movies on something with resolution kind of comparable to 640x480, really coarse, and the programming notion in here was bugs. You put on your grid a bunch of bugs, and each bug carried along some data as baggage, and then you would do things like cellular automata operations. You could program it or you could kind of let it go by itself. If a red bug is next to a blue bug then it turns into a green bug on the following step and so on. 36:28 He and Lillian Schwartz made some interesting abstract movies at the time. It also did some interesting picture processing. One wonderful picture of a reclining nude, something about the size of that blackboard over there, all made of pixels about a half inch high each with a different little picture in it, picked out for their density, and so if you looked at it close up it consisted of pickaxes and candles and dogs, and if you looked at it far enough away, it was a reclining nude. That picture got a lot of play all around the country.

Lorinda Cherry: That was with Leon, wasn't it? That was with Leon Harman.

Doug: Was that Harman?

Lorinda: ?...?

Doug: Harman was also an interesting character. He did more things than pictures. I'm glad you reminded me of him. I had him written down here. Harman was a guy who among other things did a block diagram compiler for writing a handwriting recognition program. I never did understand how his scheme worked, and in fact I guess it didn't work too well. [laughter] It didn't do any production ?things? but it was an absolutely immense sample data circuit for doing handwriting recognition. Harman's most famous work was trying to estimate the information content in a face. And every one of these pictures which are a cliche now, that show a face digitized very coarsely, go back to Harman's first psychological experiments, when he tried to find out how many bits of picture he needed to try to make a face recognizable. He went around and digitized about 256 faces from Bell Labs and did real psychological experiments asking which faces could be distinguished from other ones. I had the good fortune to have one of the most distinguishable faces, and consequently you'll find me in freshman psychology texts through no fault of my own.

39:15 Another thing going on the 60s was the halting beginning here of interactive computing. And again the credit has to go to the acoustics research department, for good and sufficient reason. They wanted to be able to feed signals into the machine, and look at them, and get them back out. They bought yet another weird architecture machine called the Packard Bell 250, where the memory elements were mercury delay lines.

Question: Packard Bell?

Doug: Packard Bell, same one that makes PCs today.

40:10 They hung this off of the comp center 7090 and put in a scheme for quickly shipping jobs into the job stream on the 7090. The Packard Bell was the real-time terminal that you could play with and repair stuff, ?...? off the 7090, get it back, and then you could play it. From that grew some graphics machines also, built by ?...? et al. And it was one of the old graphics machines in fact that Ken picked up to build Unix on.

40:55 Another thing that went on in the acoustics department was synthetic speech and music. Max Mathews, who was the the director of the department has long been interested in computer music. In fact since retirement he spent a lot of time with Pierre Boulez in Paris at a wonderful institute with lots of money simply for making synthetic music. He had a language called Music 5. Synthetic speech or, well first of all simply speech processing was pioneered particularly by John Kelly. I remember my first contact with speech processing. It was customary for computer operators, for the benefit of computer operators, to put a loudspeaker on the low bit of some register on the machine, and normally the operator would just hear kind of white noise. But if you got into a loop, suddenly the machine would scream, and this signal could be used to the operator "oh the machines in a loop. Go stop it and go on to the next job." I remember feeding them an Ackermann's function routine once. [laughter] They were right. It was a silly loop. But anyway. One day, the operators were ?...?. The machine started singing. Out of the blue. ?...? from out of the blue. [laughter] And in a broad Texas accent, which was the recorded voice of John Kelly.

43:14 However. From there Kelly went on to do some speech synthesis. Of course there's been a lot more speech synthesis work done since, by 43:31 folks like Cecil Coker, Joe Olive. But they produced a record, which unfortunately I can't play because records are not modern anymore. And everybody got one in the Bell Labs Record, which is a magazine, contained once a record from the acoustics department, with both speech and music and one very famous combination where the computer played and sang "A Bicycle Built For Two".

?...?

44:32 At the same time as all this stuff is going on here, needless to say computing is going on in the rest of the Labs. it was about early 1960 when the math department lost its monopoly on computing machines and other people started buying them too, but for switching. The first experiments with switching computers were operational in around 1960. They were planned for several years prior to that; essentially as soon as the transistor was invented, the making of electronic rather than electromechanical switching machines was anticipated. Part of the saga of the switching machines is cheap memory. These machines had enormous memories -- thousands of words. [laughter] And it was said that the present worth of each word of memory that programmers saved across the Bell System was something like eleven dollars, as I recall. And it was worthwhile to struggle to save some memory. Also, programs were permanent. You were going to load up the switching machine with switching program and that was going to run. You didn't change it every minute or two. And it would be cheaper to put it in read only memory than in core memory. And there was a whole series of wild read-only memories, both tried and built. The first experimental Essex System had a thing called the ?flying squat scoop store? which was large photographic plates with bits on them and CRTs projecting on the plates and you would detect underneath on the photodetector whether the bit was set or not. That was the program store of Essex. The program store of the first ESS systems consisted of twisters, which I actually am not sure I understand to this day, but they consist of iron wire with a copper wire wrapped around them and vice versa. There were also experiments with an IC type memory called the waffle iron. Then there was a period when magnetic bubbles were all the rage. As far as I know, although microelectronics made a lot of memory, most of the memory work at Bell Labs has not had much effect on ?...?. Nice tries though.

48:28 Another thing that folks began to work on was the application of (and of course, right from the start) computers to data processing. When you owned equipment scattered through every street in the country, and you have a hundred million customers, and you have bills for a hundred million transactions a day, there's really some big data processing going on. And indeed in the early 60s, AT&T was thinking of making its own data processing computers solely for billing. Somehow they pulled out of that, and gave all the technology to IBM, and one piece of that technology went into use in high end equipment called tractor tapes. Inch wide magnetic tapes that would be used for a while.

49:50 By in large, although Bell Labs has participated until fairly recently in data processing in quite a big way, AT&T never really quite trusted the Labs to do it right because here is where the money is. I can recall one occasion when during strike of temporary employees, a fill-in employee like from the Laboratories and so on, lost a day's billing tape in Chicago. And that was a million dollars. And that's generally speaking the money people did not until fairly recently trust Bell Labs to take good care of money, even though they trusted the Labs very well to make extremely reliable computing equipment for switches. The downtime on switches is still spectacular by any industry standards. The design for the first ones was two hours down in 40 years, and the design was met. Great emphasis on reliability and redundancy, testing.

51:35 Another branch of computing was for the government. The whole Whippany Laboratories [time check] Whippany, where we took on contracts for the government particularly in the computing era in anti-missile defense, missile defense, and underwater sound. Missile defense was a very impressive undertaking. It was about in the early '63 time frame when it was estimated the amount of computation to do a reasonable job of tracking incoming missiles would be 30 M floating point operations a second. In the day of the Cray that doesn't sound like a great lot, but it's more than your high end PCs can do. And the machines were supposed to be reliable. They designed the machines at Whippany, a twelve-processor multiprocessor, to no specs, enormously rugged, one watt transistors. This thing in real life performed remarkably well. There were sixty-five missile shots, tests across the Pacific Ocean ?...? and Lorinda Cherry here actually sat there waiting for them to come in. [laughter] And only a half dozen of them really failed. As a measure of the interest in reliability, one of them failed apparently due to processor error. Two people were assigned to look at the dumps, enormous amounts of telemetry and logging information were taken during these tests, which are truly expensive to run. Two people were assigned to look at the dumps. A year later they had not found the trouble. The team was beefed up. They finally decided that there was a race condition in one circuit. They then realized that this particular kind of race condition had not been tested for in all the simulations. They went back and simulated the entire hardware system to see if its a remote possibility of any similar cases, found twelve of them, and changed the hardware. But to spend over a year looking for a bug is a sign of what reliability meant.

54:56 Since I'm coming up on the end of an hour, one could go on and on and on,

Crowd: go on, go on. [laughter]

55:10 Doug: I think I'd like to end up by mentioning a few of the programs that have been written at Bell Labs that I think are most surprising. Of course there are lots of grand programs that have been written.

I already mentioned the block diagram compiler.

Another really remarkable piece of work was eqn, the equation typesetting language, which has been imitated since, by Lorinda Cherry and Brian Kernighan. The notion of taking an auditory syntax, the way people talk about equations, but only talk, this was not borrowed from any written notation before, getting the auditory one down on paper, that was very successful and surprising.

Another of my favorites, and again Lorinda Cherry was in this one, with Bob Morris, was typo. This was a program for finding spelling errors. It didn't know the first thing about spelling. It would read a document, measure its statistics, and print out the words of the document in increasing order of what it thought the likelihood of that word having come from the same statistical source as the document. The words that did not come from the statistical source of the document were likely to be typos, and now I mean typos as distinct from spelling errors, where you actually hit the wrong key. Those tend to be off the wall, whereas phonetic spelling errors you'll never find. And this worked remarkably well. Typing errors would come right up to the top of the list. A really really neat program.

57:50 Another one of my favorites was by Brenda Baker called struct, which took Fortran programs and converted them into a structured programming language called Ratfor, which was Fortran with C syntax. This seemed like a possible undertaking, like something you do by the seat of the pants and you get something out. In fact, folks at Lockheed had done things like that before. But Brenda managed to find theorems that said there's really only one form, there's a canonical form into which you can structure a Fortran program, and she did this. It took your Fortran program, completely mashed it, put it out perhaps in almost certainly a different order than it was in Fortran connected by GOTOs, without any GOTOs, and the really remarkable thing was that authors of the program who clearly knew the way they wrote it in the first place, preferred it after it had been rearranged by Brendan. I was astonished at the outcome of that project.

59:19 Another first that happened around here was by Fred Grampp, who got interested in computer security. One day he decided he would make a program for sniffing the security arrangements on a computer, as a service: Fred would never do anything crooked. [laughter] This particular program did a remarkable job, and founded a whole minor industry within the company. A department was set up to take this idea and parlay it, and indeed ever since there has been some improvement in the way computer centers are managed, at least until we got Berkeley Unix.

60:24 And the last interesting program that I have time to mention is one by Ken Church. He was dealing with -- text processing has always been a continuing ?...? of the research, and in some sense it has an application to our business because we're handling speech, but he got into consulting with the department in North Carolina that has to translate manuals. There are millions of pages of manuals in the Bell System and its successors, and ever since we've gone global, these things had to get translated into many languages.

61:28 To help in this, he was making tools which would put up on the screen, graphed on the screen quickly a piece of text and its translation, because a translator, particularly a technical translator, wants to know, the last time we mentioned this word how was it translated. You don't want to be creative in translating technical text. You'd like to be able to go back into the archives and pull up examples of translated text. And the neat thing here is the idea for how do you align texts in two languages. You've got the original, you've got the translated one, how do you bring up on the screen, the two sentences that go together? And the following scam worked beautifully. This is on western languages. 62:33 Simply look for common four letter tetragrams, four letter combinations between the two and as best as you can, line them up as nearly linearly with the lengths of the two types as possible. And this very simple idea works like storm. Something for nothing. I like that.

63:10 The last thing is one slogan that sort of got started with Unix and is just rife within the industry now. Software tools. We were making software tools in Unix before we knew we were, just like the Molière character was amazed at discovering he'd been speaking prose all his life. [laughter] But then Kernighan and Plauger came along and christened what was going on, making simple generally useful and compositional programs to do one thing and do it well and to fit together. They called it software tools, made a book, wrote a book, and this notion now is abroad in the industry. And it really did begin all up in the little attic room where you [points?] sat for many years writing up here.

Oh I forgot to. I haven't used any slides. I've brought some, but I don't like looking at bullets and you wouldn't either, and I forgot to show you the one exhibit I brought, which I borrowed from Bob Kurshan. When Bell Labs was founded, it had of course some calculating machines, and it had one wonderful computer. This. That was bought in 1918. There's almost no other computing equipment from any time prior to ten years ago that still exists in Bell Labs. This is an integraph. It has two styluses. You trace a curve on a piece of paper with one stylus and the other stylus draws the indefinite integral here. There was somebody in the math department who gave this service to the whole company, with about 24 hours turnaround time, calculating integrals. Our recent vice president Arno Penzias actually did, he calculated integrals differently, with a different background. He had a chemical balance, and he cut the curves out of the paper and weighed them. This was bought in 1918, so it's eighty years old. It used to be shiny metal, it's a little bit rusty now. But it still works.

66:30 Well, that's a once over lightly of a whole lot of things that have gone on at Bell Labs. It's just such a fun place that one I said I just could go on and on. If you're interested, there actually is a history written. This is only one of about six volumes, this is the one that has the mathematical computer sciences, the kind of things that I've mostly talked about here. A few people have copies of them. For some reason, the AT&T publishing house thinks that because they're history they're obsolete, and they stopped printing them. [laughter]

Thank you, and that's all.

Alphabetical Order

A reading from the Book of Knuth, Chapter 6:

The use of alphabetic order for entire words seems to be a much later invention; it is something we might think is obvious, yet it has to be taught to children, and at some point in history it was necessary to teach to adults. Several lists from about 300 B.C. have been found on the Aegean Islands, giving the names of people in certain religious cults; these lists have been alphabetized, but only by first letter, thus representing only the first pass of a left-to-right radix sort. Some Greek papyri from the years A.D. 134-135 contain fragments of ledgers that show the names of taxpayers alphabetized by the first two letters. Apollonius Sophista used alphabetical order on the first two letters, and often on subsequent letters, in his lengthy concordance of Homer's poetry (first century A.D.). A few examples of more perfect alphabetization are known, notably Galen's Hippocratic Glosses (c. 200), but they are very rare. Words were arranged by their first letter only in the Etymologiarum of St. Isidorus (c. 630, Book x); and the Corpus Glossary (c. 725) used only the first two letters of each word. The latter two works were perhaps the largest nonnumerical files of data to be compiled during the Middle Ages.

It is not until Giovanni di Genoa's Catholicon (1286) that we find a specific description of true alphabetical order. In his preface, Giovanni explained that

amo precedes bibo
abeo precedes adeo
amatus precedes amor
imprudens precedes impudens
iustica precedes iustus
polisintheton precedes polissenus

(thereby giving examples of situations in which the ordering is determined by the 1st, 2nd, ..., 6th letters), “and so on in like manner.” He remarked that strenuous effort was required to device these rules. “I beg of you, therefore, good reader, do not scorn this great labor of mine and this order as something worthless.”

A detailed study of the development of alphabetic order, up to the time printing was invented, has been made by Lloyd W. Daly [Collection Latomus 90 (1967), 100 pages]. He found some interesting old manuscripts that were evidently used as worksheets while sorting words by their first letters (see pages 89-90 of his monograph).

The first dictionary of English, Robert Cawdrey's Table Alphabeticall (London, 1604), contains the following instructions:

Nowe if the word, which thou art desirous to finde, beginne with (a) then looke in the beginning of this Table, but if with (v) looke toward the end. Again, if they word beginne with (ca) looke in the beginning of the letter (c) but if with (cu) then looke toward the end of that letter. And so on of all the rest. &c.

Cawdrey seems to have been teaching himself how to alphabetize as he prepared his dictionary; numerous misplaced words appear on the first few pages, but the alphabetic order in the last part is not as bad.

Volume 3 / Sorting and Searching, 2nd Ed., p. 421

I've written before about how thinking recursively, something most of us take for granted, is in fact something that must be taught (or discovered). I wonder if alphabetical order took so long to catch on because it is a recursive procedure.

Rotating Hashes

Long ago, before hash tables (aka scatter storage files) were a fundamental data type like int, programmers had to implement them by hand. (Nowadays, manually-implemented hash tables are rarely seen outside of historical reenactments like undergraduate algorithms classes and C programs.) Back then, there was fierce debate about how to handle hash collisions.

One option, called direct chaining, makes each table slot a linked list head, and then all the entries that hash to that slot are kept in the linked list pointed to by the slot. (Bonus points if you reorder the lists using the move to front heuristic.)

Another option, called linear scanning or open addressing, is to try table entry h, and then if that is full, try h+1, h+2, and so on until an empty slot is found. This only works well if the table is sized so that it has some large fraction (say, 1/4) of unused slots.

Direct chaining and linear scanning are still the most common collision resolution methods, but in the 1960s, more were explored. Still another option is to use non-linear scanning, so that each data item produces a sequence of hash values h1, h2, h3, and those table slots are tried in turn. Two objects with the same h1 won't necessarily have the same h2, unlike in linear scanning. But hashing is a relatively slow operation. Is it worth the time and code to compute all those extra hash functions?

This sets the stage for Doug McIlroy's 1963 letter to the Communications of the ACM Pracniques page, describing a trick by Vic Vyssotsky:

A VARIANT METHOD OF FILE SEARCHING

I would like to publicize a trick due to V. A. Vyssotsky for improving the efficiency of scatter storage files.

Vyssotsky's idea, used in a remarkably short and elegant code for the IBM 7090, is to continue random searching after the first probe fails:

(1) From a data item, compute a pseudo-random address in the file.
(2) If the item is in this place, or if the place is empty, the job is done.
(3) If another item is there, probe again with a new address computed by another pseudo-random function, until the stock of pseudo-random functions has been exhausted.
(4) As a last resort, when all pseudo-random functions are exhausted, do a linear search of the whole file for the item or a hole, whichever comes first.

In Vyssotsky's program, the later random functions are trivially obtained from the first one. He computes an n-bit function of which only the first m bits are used as an address. Later, random addresses are obtained by rotating the same n bits. After n probes he gives up and resorts to linear search.

More familiar methods of scatter storage employ chaning or linear search for later probes. Chaining is the most efficient method in terms of average number of probes per item but pays a price in storage space for the chaining information. Linear search is expensive in time. Table I, comparing average number of probes per item for these methods, is based on popular uniformity and independence assumptions:

TABLE I. Average Number of Probes per Item
Load Factor Chaining [1] Linear Search [2] Random Search
p 1 + p/2 1 + ½p/(1-p) -p-1 log(1-p)
.50 1.25 1.5 1.39
.75 1.38 2.5 1.83
.90 1.45 5.5 2.56

Chaining and random search have a clear edge. If storage requirements permit, chaining is superior. Vyssotsky's random search pays an attractively low time penalty to achieve ultimate storage utilization in pressing situations.

A typical 7090 application used a file of two-word items. Chaining would add an extra word or, with extra coding complexity to handle items of nonintegral word length, an extra half word per item. Table II compares lookup times for various file schemes given a certain amount of storage:

TABLE II. Lookup Times for Various File Schemes
Load Factor with
3-word items
Chaining
3-word
Chaining
2½-word
Linear
2-word
Random
2-word
0.6 1.3 1.25 1.33 1.30
0.8 1.4 1.33 1.57 1.49
1.0 1.5 1.42 2.00 1.65
1.2 overflow 1.50 3.00 2.01
1.4 overflow overflow 6.00 2.90

References:

1. Johnson, L. R. Indirect chaining method for addressing on secondary keys. Comm. ACM 4 (1961), 218-223.

2. Schay, G. and Spruth, W. G. Analysis of a file addressing method. Comm. ACM 5 (1962), 459-462.


M. D. McIlroy
Bell Telephone Laboratories, Inc.
Murray Hill, N. J.

(Posting may be intermittent for the rest of March.)

Face the Nation

The Hideous Name” makes mention of a face server, perhaps the first user-level file system. The face server's main purpose was to provide faces for vismon, which showed a face for each message in a user's (electronic) mail box. Vismon is described in Rob Pike and Dave Presotto's 1985 Usenix paper “Face the Nation.”

(The appearance of grey scale is due only to the image being shrunk. Click on the image for the full-size version.) This example shows the time, system load, a graph of cpu time, faces for the five most recent emails, and the names of the two most recent senders.

The AT&T 5620 Retrocomputing page contains a file tree used in the commercially distributed system, including the face set:

(The mailbox with the tail, labeled md, is actually mailer-daemon, but that label didn't fit.)

The faces include Brian Kernighan (bwk), Dennis Ritchie (dmr), Ken Thompson (ken), and Rob Pike (rob). Curiously, Peter Weinberger's face is missing. It was featured prominently in the paper and in many other contexts, including water towers. It is also used as the default “unknown” icon in vismon. Here it is in its two vismon instantiations:

The spirit of vismon lives on in Plan 9's faces (née seemail), Unix's faces, and OS X's MailGlance, among others.

Vismon also inspired in 1987 the creation of the Usenix FaceSaver project. Digging around in the archives you can find pictures of luminaries such as Kirk McKusick, Dennis Ritchie, Henry Spencer, and Larry Wall.

The Picons Archive has even more pictures.

Permissive Access Links

In 2004, Steve Bellovin gave a talk at Usenix Security speculating about permissive access links (PALs), the (supposedly impossible to bypass) locks that protect nuclear weapons. He repeated the talk in 2006 at the general Usenix. In Bellovin's talk, “Nuclear Weapons, Permissive Action Links, and the History of Public Key Cryptography” (MP3; also PDF and HTML), he says that “Bypassing a PAL should be, as one weapons designer graphically put it, about as complex as performing a tonsillectomy while entering the patient from the wrong end.” But how do they work? Are there lessons that apply to building other kinds of secure systems? He touches on these questions, but in the end, it's mostly speculation. Even so, it's a fascinating talk.

He does tease out a few interesting historical details. In particular, National Security Action Memorandum 160, signed by President Kennedy, has been claimed by former NSA insiders to be the impetus for the NSA's invention of public key cryptography. There is no evidence that public key cryptography ended up being used in PALs, but it's possible that digital signatures were invented in direct response to the requirement that, after a weapon was launched, it be possible to determine who authorized the launch. It's also possible that public key cryptography was invented and used to transmit the PAL codes securely.

Other interesting facts. The U.S. offered PALs to the Soviets (presumably to keep weapons from falling into other hands), but they turned them down. For years after the initial U.S. PAL deployments, the launch codes were all set to 00000000. The bandwidth of the extra-long frequency extremely low frequency (ELF) communication link to submerged submarines is 1 bit/minute.

Worms and Distributed Computations

John Shoch and Jon Hupp's 1982 paper “The ‘Worm’ Programs—Early Experience with a Distributed Computation” (PDF, also HTML) discusses a fascinating series of distributed computation experiments carried out on a network of Altos at Xerox PARC. An earlier post discussed the Alto file system.

The Alto had memory to run only one program at a time and lacked virtual memory. It multitasked by swapping one program (and often much of the operating system) completely to disk and then loading a different one. Such a “world swap” took about two seconds. Because of this usage model, users tended to close their programs before going home, leaving the machine idle for other uses at night. Because they came out at night, the programs were sometimes called “vampire programs.”

Shoch and Hupp called their programs “worms” after the electronic tapeworm created in John Brunner's 1975 book The Shockwave Rider*, although the distributed computation ideas probably predate the book. (They lament that the phrase “distributed computing” “has already been co-opted by those who market fairly ordinary terminal systems,” although, at least in my experience, the phrase seems to have been restored to its original meaning.)

The paper discusses a handful of possible distributed computations, ranging from a simple “hello,world” to a multimachine animation computation. Perhaps the most interesting part is Section 5, which gives some of the history of multimachine programs on the Arpanet (the predecessor of the Internet). Of course, there are the network's routers themselves; maintenance of routing tables is almost certainly the longest-running distributed computation in history. The paper gives a few other examples, most notably a multimachine air traffic control simulation called McRoss (multicomputer route oriented simulation system). McRoss partitioned the air space among the computers running the simulations; particular planes were handed off from one computer to another as they flew around the air space.

The summary concludes:

This summary is probably not complete or fully accurate, but it is an impressive collection of distributed computations, produced within or on top of the Arpanet. Much of this work, however, was done in the early 70s; one participant recently commented, “It's hard for me to believe that this all happened seven years ago.” Since that time, we have not witnessed the anticipated blossoming of many distributed applications using the long-haul capabilities of the Arpanet.

In 2008, it's still hard to believe that this all happened 30 years ago. In the past decade, such distributed computations have really taken off (SETI@home started in 1999). Today, BOINC and other projects enlist the help of millions of computers to attack problems ranging from chess to malaria.


* The Shockwave Rider also contains this fantastically succinct introduction to the paradox of orbital mechanics:

You are in circular orbit around a planet. You are being overtaken by another object, also in circular orbit, moving several km./sec. faster. You accelerate to try and catch up.

See you later, accelerator.

Much later.

The idea makes repeated appearances. Other wordings include “Go faster in order to drop back to a lower orbit? Doesn't work. Drop back to a lower orbit; you go faster!” and “The way to go faster is to slow down.”

Bourne Shell Macros

Steve Bourne* worked on an Algol 68 compiler at the Cambridge University Computer Lab before moving to Bell Labs, where he worked on Seventh Edition Unix. He is best known for writing V7's /bin/sh, now known as the Bourne shell.

Bourne's preference for Algol syntax is evident in shell scripts, which use Algol constructs like if ... then ... elif ... else ... fi and case ... esac instead of C equivalents that might be more familiar to Unix users. (The C shell and rc were both inspired, in part, by a desire for C-like syntax.)

Bourne's preference for Algol syntax is, surprisingly, also evident in the shell source. More comfortable with Algol but stuck with C, Bourne found a way to cope:

/usr/src/cmd/sh/mac.h:
#define IF      if(
#define THEN    ){
#define ELSE    } else {
#define ELIF    } else if (
#define FI      ;}

#define BEGIN   {
#define END     }
#define SWITCH  switch(
#define IN      ){
#define ENDSW   }
#define FOR     for(
#define WHILE   while(
#define DO      ){
#define OD      ;}
#define REP     do{
#define PER     }while(
#define DONE    );
#define LOOP    for(;;){
#define POOL    }

Using these macros, one can write programs that would certainly feel familiar to an Algol programmer:

/usr/src/cmd/sh/service.c:297:
LOCAL VOID      gsort(from,to)
        STRING          from[], to[];
{
        INT             k, m, n;
        REG INT         i, j;

        IF (n=to-from)<=1 THEN return FI

        FOR j=1; j<=n; j*=2 DONE

        FOR m=2*j-1; m/=2;
        DO  k=n-m;
            FOR j=0; j=0; i-=m
                DO  REG STRING *fromi; fromi = &from[i];
                    IF cf(fromi[m],fromi[0])>0
                    THEN break;
                    ELSE STRING s; s=fromi[m]; fromi[m]=fromi[0]; fromi[0]=s;
                    FI
                OD
            OD
        OD
}

(It's not clear why Bourne used braces around function bodies instead of BEGIN and END.)

The Bourne shell code, along with the 4.2BSD finger implementation, was the impetus for the International Obfuscated C Code Contest.

* Really, Stephen R. Bourne, but even then not quite specific enough. Years ago, I heard a story that at one point there were two Stephen R. Bournes working at SGI. Callers would ask the operator for Steve Bourne and would be asked, ‘which one?’ ‘Steven R. Bourne,’ they'd say. ‘Sure, which one?’ The ambiguity was apparently resolved by referring to them as ‘hardware Steve’ and ‘software Steve.’ I can find no justification for this story nor any trace of ‘hardware Steve’ other than the mention of two Steven R. Bournes in the sendmail README, which claims that they were both at Bell Labs, not SGI. If you can confirm or deny this story, please post a comment or mail me (rsc@swtch.com).

Absolute File System Design

Butler Lampson and Robert Sproull's 1979 SOSP paper “An Open Operating System for a Single-User Machine” (PDF; also HTML) describes the operating system on the Alto, a groundbreaking personal networked workstation built at Xerox PARC in the late 1970s and early 1980s. Many Alto features became everyday aspects of modern computing environments, but one that didn't catch on was the Alto's robust file system.

The Alto file system was designed to withstand disk or programming errors. If some set of disk blocks were lost, the Alto file system would only lose the data contained in those sectors. If you lose a data block from a file, you lose just that data block. If you lose a block in a directory, you lose the name for that file but not the file itself.

Section 3 of the paper gives the details. The basic technique is that each disk block contains a small header giving the id number and version of the associated file along with the block's offset in the file and the amount of data in the block (the last block in a file might contain less than a block worth of data). Lampson and Sproull refer to this information as absolute: it is the single point of truth for the block in question. A disk scavenger (like Unix's fsck) can scan the entire file system reading block labels to reconstruct the contents of any file.

Scanning the entire file system to find block contents is of course very slow. The block label also contains pointers to the previous and next block in a file, so that reading a file can just follow the block pointers (the file system layout algorithm arranged that much of the time files would be laid out sequentially so that following these pointers would not be too expensive in terms of disk seeks). These pointers are called hints, because they exist only for performance, not for correctness. If the hints are incorrect, that fact will become clear when they are used (the label of the pointed at block will not agree with expectations), and the scavenger can be invoked to fix the file system.

Contrast this scheme to inodes in a modern Unix file system (say, BSD FFS or Linux Ext2/3). If you lose the 512-byte disk block containing a 1GB file's inode, the entire file is inaccessible. No amount of scavenging can help, because the list of blocks making up the file was only contained in the inode, and the blocks themselves are not tagged with which file they belong to. This is much less robust than the Alto file system. Of course, the situation is even worse if you lose the 512-byte disk block containing the file system superblock: then you've lost the entire file system. Because the superblock is so crucial, file systems usually maintain backup copies of the superblock scattered across the disk. Inodes are almost as important but not so carefully guarded.

Why don't modern file systems don't use labels like the Alto did?

Disk geometry might play a role: it is hard to reconcile a (say) 32 byte label with (say) a 4096 byte data block. You put the label in the data block, cutting the actual data size to 4064 bytes. You could put the label in the sector before the block, taking up 512 bytes for a 32 byte label. Or you could write a label sector containing 16 labels every 16 data blocks. The third is probably the best option, but all of them are somewhat cumbersome. (But you'd lose less data!)

Another reason is historical: the original Unix file system didn't have any such mechanism, and FFS and Ext2/3 differ little from the original Unix file system.

Another reason might be philosophical: you're supposed to handle disk failures with backups, not clever file system tricks. This is the least useful argument: who wouldn't want a file system that needed to resort to backups less often?

Another reason might be pragmatic: maybe disks fail in different ways now than they did in 1979. (Of course, that doesn't explain the original Unix file system or FFS.) Since there's so little data on disk failures, it's hard to say anything for certain.

Whatever the arguments against labels, it is undeniable that they produced a very robust system. In fact, the operating system on the Alto could be almost completely swapped out an replaced with user programs, each of which could supply its own file system implementation. Over time, there came to be three file system implementations sharing the disk, one in BCPL, one in Mesa, and one in Smalltalk. In a retrospective paper, Lampson wrote:

If a disk address is wrong, the label check will detect the error, and various recovery mechanisms can be invoked. A Scavenger program, written by Jim Morris, takes about a minute to check or restore the consistency of an Alto file system; it can be routinely run by nonprofessional users. As a result of this conservative design, the file system has proved to be very reliably; loss of any information other than bits physically damaged on the disk is essentially unheard of. This reliability is achieved in spite of the fact that many programs besides the standard operating system have access to the disk.

Who wouldn't want a robust system like that? And why aren't more of today's file systems that robust?

B Languages

I enjoy reading about programming languages of the past. (I would probably enjoy reading about programming languages of the future too, but those are harder to find.) Today's links are a glimpse back into the programming world before types.

The predecessor of the C programming language was, of course, Ken Thompson's B, which was in turn influenced by Martin Richards's BCPL.*

Thompson's 1972 technical memorandum “Users' Reference to B” (PDF, 1MB; also HTML) describes B in detail. The most striking part about B is how similar it is to C. The fundamental difference is the lack of types, or rather the single type: the machine word. As Thompson puts it, “There is no check to insure that there are no type mismatches. Similarly, there are no wanted, or unwanted, type conversions.” To me, the lack of types gives B a flavor distinctly different from C. Having grown up programming with types, being able to dereference any value at all seems like not wearing a seat belt.

The BCPL manual is also online, courtesy of Dennis Ritchie. It is easy to see B as BCPL filtered through Thompson's minimalist sensibilities. In a 1989 interview, Thompson recounted, “[B] was very small, very clean. It was the same language as BCPL, it looked completely different, syntactically it was, you know, a redo. The semantics was exactly the same as BCPL. And in fact the syntax of it was, if you looked at, you didn’t look too close, you would say it was C. Because in fact it was C, without types.”

The B memo also contains other interesting trivia. Thompson defines the newfangled while loop in terms of goto, an indicator of just how uncommon ‘structured’ programming was at the time. The memo also contains what is probably the first published implementation of printf.

In a language whose only type is the machine word, character processing is particularly cumbersome. B included primitives char and lchar to fetch and store characters from character strings (vectors). On his BCPL history page linked above, Ritchie says that “C was invented in part to provide a plausible way of dealing with characters when one begins with a word-oriented language.”

B had the full suite of the =op operators (what became op= in modern C), not just for arithmetical operators like =+ and =- but also for comparison operators: ===, =!=, =<, =<=, =>, and =>=. Unfortunately, the end of the manual notes that none of these comparison operators were actually implemented, nor were =/ and =&.

Brian Kernighan's 1973 B tutorial (PDF; also HTML) contains what is probably the very first “hello, world” program. (See page 4 of the PDF.)


* There is an old story that at Bell Labs there were arguments over whether the successor to C should be called D (following the alphabet) or P (following the letters in BCPL) and that Bjarne Stroustroup ended the argument by choosing the name C++ instead. In Larry Wall's 1999 talk “Perl, the first postmodern computer language,” he said that Perl was the true successor to C, having taken both of the remaining letters from BCPL as its conventional script extension.

The Discovery of Debugging

Debugging was a surprise. When the early computer pioneers built the first programmable computers, they assumed that writing programs would be straightforward: think hard, write the program, done.

Maurice Wilkes, creator of the EDSAC, the first stored-program computer, wrote what might be the first programming textbook in 1951, with David Wheeler (inventor of the subroutine call!) and Stanley Gill. It warns: “Experience has shown that such mistakes are much more difficult to avoid than might be expected. It is, in fact, rare for a program to work correctly the first time it is tried, and often several attempts must be made before all errors are eliminated.”

In his memoir, Wilkes recalled the exact moment he realized the importance of debugging: “By June 1949, people had begun to realize that it was not so easy to get a program right as had at one time appeared. It was on one of my journeys between the EDSAC room and the punching equipment that the realization came over me with full force that a good part of the remainder of my life was going to be spent in finding errors in my own programs.”

Brian Hayes's excellent 1993 essay “The Discovery of Debugging” (skip to page 32) describes the history in detail.

Martin Campbell-Kelly's 1992 essay “The Airy Tape: An Early Chapter in the History of Debugging” (subscription required) gives more details on one of the stories in Hayes's article: the successful debugging, after 41 years, of the first hard bug. The bug? A single-precision floating-point value used in a double-precision context.

Hayes's essay itself contains a bug worth mentioning. It says that the first three programs run on the EDSAC ran correctly the first time, an understandable but incorrect inference from Campbell-Kelly's article. In fact the second program ever run, which merely printed primes, was buggy. The EDSAC log for May 7, 1949 reads “Table of primes attempted — programme incorrect.” A correct primes program didn't run until two days later.