

The Discovery of Debugging

Debugging was a surprise. When the early computer pioneers built the first programmable computers, they assumed that writing programs would be straightforward: think hard, write the program, done.

Maurice Wilkes, creator of the EDSAC, the first practical stored-program computer, wrote what might be the first programming textbook in 1951, with David Wheeler (inventor of the subroutine call!) and Stanley Gill. The book warns: “Experience has shown that such mistakes are much more difficult to avoid than might be expected. It is, in fact, rare for a program to work correctly the first time it is tried, and often several attempts must be made before all errors are eliminated.”

In his memoir, Wilkes recalled the exact moment he realized the importance of debugging: “By June 1949, people had begun to realize that it was not so easy to get a program right as had at one time appeared. It was on one of my journeys between the EDSAC room and the punching equipment that the realization came over me with full force that a good part of the remainder of my life was going to be spent in finding errors in my own programs.”

Brian Hayes's excellent 1993 essay “The Discovery of Debugging” (skip to page 32) describes the history in detail.

Martin Campbell-Kelly's 1992 essay “The Airy Tape: An Early Chapter in the History of Debugging” (subscription required) gives more details on one of the stories in Hayes's article: the successful debugging, after 41 years, of the first hard bug. The bug? A single-precision floating-point value used in a double-precision context.

Hayes's essay itself contains a bug worth mentioning. It says that the first three programs run on the EDSAC ran correctly the first time, an understandable but incorrect inference from Campbell-Kelly's article. In fact the second program ever run, which merely printed primes, was buggy. The EDSAC log for May 7, 1949 reads “Table of primes attempted — programme incorrect.” A correct primes program didn't run until two days later.

Debugging the Universe

Every programmer knows what debugging is. Given a program that isn't behaving as expected, you slowly refine your understanding of both the program and the anomalous behavior until you understand exactly why the two aren't in agreement. Sometimes the bug is in the program, other times in your understanding of the program. Then you fix it.

Physicists debug the universe. When the universe doesn't behave as expected, they debug it, trying to reconcile their understanding of the universe and what they are seeing. The difference is that the bug, by definition, is always in their understanding and never in the universe.

During the development of the Global Positioning System (GPS), the physicists and programmers had to debug general relativity. To tell the story, you need to know a tiny bit about how GPS works.

Each GPS satellite broadcasts a known pseudo-random number sequence. A simple GPS receiver has its own pseudo-random sequence generator that is synced with the satellites. By timing how far “behind” each satellite's sequence appears to be compared to the receiver's own, the receiver can determine how far away that satellite is: the signal travels at the speed of light, so the delay multiplied by the speed of light is the distance. Using the known positions of and distances to three satellites, a GPS receiver can triangulate its position in three-dimensional space (strictly speaking this is trilateration, since it works from distances rather than angles). If the GPS receiver's clock is not exactly in sync with the satellites, it can use readings from four satellites to triangulate its position in four-dimensional space-time, synchronizing its clock in the process.
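To make the four-satellite case concrete, here is a minimal Python sketch of the idea, not a description of what any real receiver does. It assumes each measured distance equals the true distance plus the receiver's clock error times the speed of light, and it solves for the three coordinates and the clock error together using Newton's method, starting from the center of the Earth. The function names (`solve_linear`, `locate`) and the satellite coordinates are made up for illustration.

```python
import math

C = 299_792_458.0  # speed of light, in meters per second


def solve_linear(a, b):
    """Solve the square system a*x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def locate(sats, measured, iterations=10):
    """Return (x, y, z, clock_error_seconds) from four satellite readings.

    sats:     four (x, y, z) satellite positions, in meters
    measured: four apparent distances (signal delay times C), in meters
    """
    x = y = z = 0.0  # initial guess: center of the Earth
    b = 0.0          # receiver clock error, expressed in meters (C * seconds)
    for _ in range(iterations):
        jacobian, residual = [], []
        for (sx, sy, sz), rho in zip(sats, measured):
            d = math.dist((x, y, z), (sx, sy, sz))
            jacobian.append([(x - sx) / d, (y - sy) / d, (z - sz) / d, 1.0])
            residual.append(rho - (d + b))
        dx, dy, dz, db = solve_linear(jacobian, residual)
        x, y, z, b = x + dx, y + dy, z + dz, b + db
    return x, y, z, b / C


# Hypothetical geometry: four satellites roughly 26,600 km from the Earth's
# center, and a receiver near the surface whose clock is off by a millisecond.
SATS = [
    (26_600_000.0, 0.0, 0.0),
    (18_800_000.0, 18_800_000.0, 0.0),
    (18_800_000.0, 0.0, 18_800_000.0),
    (18_800_000.0, -13_300_000.0, -13_300_000.0),
]
receiver = (3_900_000.0, 3_900_000.0, 3_200_000.0)
clock_error = 1e-3  # seconds
readings = [math.dist(receiver, s) + C * clock_error for s in SATS]

print(locate(SATS, readings))
```

The fourth unknown is what lets a cheap receiver clock get away with being wrong: the same error shows up in all four readings, so it can be solved for along with the position.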

All this assumes that the clocks in the satellites are running at the same speed as the clocks on the Earth, but the satellites are literally running circles around the Earth; at those speeds, relativity kicks in and unexpected behaviors emerge. This was actually something the GPS engineers had to consider. Peter Galison tells the story better than I can:

According to relativity, satellites that were orbiting the earth at 12,500 miles per hour ran their clocks slow (relative to the earth) by 7 millionths of a second per day. Even general relativity (Einstein's theory of gravity) had to be programmed into the system. Eleven thousand miles in space, where the satellites orbited, general relativity predicted that the weaker gravitational field would leave the satellite clocks running fast (relative to the earth's surface) by 45 millionths of a second per day. Together, these two corrections add up to a staggering correction of 38 millionths (that is, 38,000 billionths) of a second per day in a GPS system that had to be accurate to within 50 billionths of a second each day. Before the first cesium atomic clock launch in June 1977, some GPS engineers were sufficiently dubious about these enormous relativistic effects to insist that the satellite's atomic clock broadcast its time “raw.” Its relativity-correction mechanism idled onboard. Down came the signal, running fast over the first twenty-four hours almost precisely by the predicted 38,000 billionths of a second. After twenty days of such gains, ground control commanded the frequency synthesizer to activate, correcting the broadcast time signal. Without that relativistic correction, it would have taken less than two minutes for the GPS system to exceed its allowable [daily] error.


(From Peter Galison, Einstein's Clocks, Poincaré's Maps pp. 288-289.)
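The arithmetic in that passage is easy to check. The short Python sketch below uses only the figures Galison quotes; the variable names are illustrative, and it assumes the drift accumulates at a steady rate over the day.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted from Galison, in seconds gained per day (negative = slow).
special_relativity_drift = -7e-6   # orbital speed: satellite clocks run slow
general_relativity_drift = +45e-6  # weaker gravity: satellite clocks run fast
allowed_daily_error = 50e-9        # the system's error budget, in seconds

# The two effects partly cancel, leaving the clocks fast overall.
net_drift_per_day = special_relativity_drift + general_relativity_drift

# How long an uncorrected clock takes to use up the whole error budget,
# assuming the drift accumulates at a constant rate.
seconds_to_exceed = allowed_daily_error / net_drift_per_day * SECONDS_PER_DAY

print(f"net drift: {net_drift_per_day * 1e6:.0f} millionths of a second per day")
print(f"error budget exhausted after about {seconds_to_exceed:.0f} seconds")
```

The result comes out just under two minutes, which agrees with the last sentence of the quotation.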

I heartily recommend Galison's book, a history of the development of the physical concept of time throughout the twentieth century. Galison has doctorates in both physics and the history of science; using his dual expertise he makes the material accessible to dual laymen.

For a more technical account, the book's endnotes cite Neil Ashby, “General Relativity in the Global Positioning System