Showing posts with label undecidability. Show all posts
Showing posts with label undecidability. Show all posts

[ next | prev | up ]

Bigger Programs are Better, not Best

When computer scientists study algorithms, they are interested in how much time or working space an algorithm requires, as a function of the input size: quicksort uses O(n log n) time, but only O(n) space. For the most part, computer scientists don't study how much code a problem requires. In fact, part of what it means to be an algorithm is that it can be implemented as a fixed-size program that works regardless of the input size. Intuitively, though, bigger programs should be able to do more than smaller programs.

It is pretty easy to prove that bigger programs are better, or at least that they can do more. To start, we need to precisely define what we mean by the size a program. These kinds of discussions often use Turing machines, but these days not many people are comfortable with Turing machines, so I'm going to use Scheme. [؟]

Let's define a Scheme program's size to be the number of left parentheses it contains. We need to disallow arbitrarily large parenthesis-free expressions, so that there are only a finite number of Scheme programs of a given size. Let's limit functions to three arguments and integer literals to be numbers less than 100. These requirements might make Scheme a little more verbose, but they don't make it any less powerful. Notice that these programs are just expressions that don't take any input.

If we look at all the Scheme programs of a given size, some of those programs evaluate to integers: (+ 1 (+ 2 3)) has size 2 and evaluates to 6. Let's define B(n) to be the largest integer that can be produced by evaluating a Scheme program of size n. The example demonstrates that B(3) is at least 6.

The formal statement of the “bigger is better” assertion is that B(n+1) > B(n) for any n. For any given n, the definition of B requires that there be some Scheme program e of size n that evaluates to B(n). Then (+ 1 e) has size n+1 and evaluates to 1+B(n). This demonstrates that B(n+1) is at least 1+B(n), so B(n+1) > B(n).

There you have it. Bigger programs are better. But it turns out that no matter how big a program you write, it can't compute B(n). B(n) gets so big so fast that no fixed-size program can keep up. We can prove this too.

Consider any Scheme function f that takes a single integer argument. This is a lambda expression, not a standalone program like we've been considering, but it still has some size s. Define n = 2s+1. We can write a Scheme program n1, also of size s, that computes 2s+2 = n+1. The program looks like (+ 2 (+ 2 ... (+ 2 (+ 2 2)) ...)). Now consider the Scheme program (f n1), wihch has size s+s+1 = n. Since it has size n, it cannot compute a number bigger than B(n). The value of the argument to f is n+1, and B(n+1) > B(n), so f cannot be computing B. Essentially, B grows faster than any function you can implement in Scheme.*

There you have it. The function B cannot be computed. Bigger has bested the computer.

The original presentation of this result is Tibor Rado's 1962 paper “On Non-Computable Functions” (PDF, 320kB), which appeared in the Bell System Technical Journal. Rado used Turing machines, not Scheme programs, and called the function Σ(n), not B(n). He also defined a function S(n) which is the maximum length of any computation by a Turing machine of size n that eventually stops. Rado made a game of trying to make Turing machines of a given size that ran for as long as possible (but stopped), and he called it the “Busy Beaver game.”

The function Σ is sometimes called the Busy Beaver function, leading to a slew of papers by otherwise respectable computer scientists and mathematicians about busy beavers. In particular, a favorite pastime is to attempt to compute Σ(n) and S(n) by hand, for small values of n. This requires analyzing all possible Turing machines of the given size to figure out whether each eventually stops, and if so, how long it runs and what number it prints.

Using a computer to analyze the easy machines and then doing the hard ones by hand, Rado and his student Shen Lin proved that Σ(1) = 1, S(1) = 1, Σ(2) = 4, and S(2) = 6 in their 1965 paper “Computer Studies of Turing Machine Problems” (subscription required). Lin proved in his Ph.D. thesis that Σ(3) = 6 and S(3) = 21. In 1983, Allen Brady proved that Σ(4) = 13 and S(4) = 107.

Rona Machlin and Quentin Stout summarized the situation nicely in their 1990 paper “The Complex Behavior Of Simple Machines:”

Brady predicted that there will never be a proof of the values of Σ(5) and S(5). We are just slightly more optimistic, and are lead to recast a parable due to Erdös (who spoke in the context of determining Ramsey numbers): suppose a vastly superior alien force lands and announces that they will destroy the planet unless we provide a value of the S function, along with a proof of its correctness. If they ask for S(5) we should put all of our mathematicians, computer scientists, and computers to the task, but if they ask for S(6) we should immediately attack because the task is hopeless.

Michiel Wijers has a good bibliography. Allen Brady has recently been exploring ternary Turing machines.


* It doesn't matter that we used Scheme and Rado used Turing machines, because Scheme can simulate a Turing machine and vice versa. In fact, so far no feasible computational model has yet been found that can't be simulated by a Turing machine (or, by extension, by Scheme). The Church-Turing thesis is the name for the hypothesis that anything we would reasonably consider computable can be computed by a Turing machine (or lambda calculus or a general-recursive function, both of which are equivalent). The exact details of what you can do with a given size differs from model to model, but the net result is the same: even a Turing machine can't compute our Scheme-based B(n). The result would work for any modern programming language, but Scheme makes the proofs particularly elegant.

Traveling Passenger Problems

(Today's late post is brought to you by Continental Airlines and the letters E, W, and R.)

ITA Software's flagship product, QPX, is an airline ticket search engine. You tell it where you want to fly and when, and it comes back with possible trips at many different prices. QPX powers Orbitz, CheapTickets, Hotwire, and a half dozen airline company sites.

The airlines want to charge as much as possible for each ticket they sell, but still fill each plane. But the airlines can't just ask possible customers how much money they have to spend, and they also can't explicitly set different prices based on who is buying the ticket. So they attach restrictions to less expensive fares, hoping that business travelers won't accept the restrictions and will pay for the more expensive tickets. The most well-known example is that trips that include a Saturday night are cheaper. There are also changes in ticket prices based on the rest of the trip: a one-way ticket from Boston to Newark might cost $349, but a one-way ticket from Boston to Raleigh via the very same Newark flight only $145*. Of course, if you make a habit of buying multi-leg tickets and skipping the connecting flights, the airlines might blacklist you (or so a travel agent once told me).

These restrictions make QPX's job harder than you might think. In fact, the general case is undecidable!

ITA's Chief Scientist, Carl de Marcken, gives a talk titled “Computational Complexity of Airline Travel&rdquo (HTML, also PDF), in which he proves that:

1. Even if the airlines publish only a single fare (with every ticket a single one way PU), and all the airports in a passenger's itinerary are fixed, so that the only remaining choice is what flights to use between the consecutive airports, deciding whether there is a valid ticket is NP-hard.

2. Fixing the flights and priceable units, but leaving open the choice of fares to pay for each flight, deciding whether there is a valid ticket is NP-hard.

3. Removing bounds on the size of solutions, the general question of whether there is a ticket for a query is EXPSPACE-hard. That is, air travel planning is at least as hard (it might be harder) as deciding whether a computer program that can use space exponentially bigger than the input halts. EXPSPACE-hard problems are (thought to be) much harder than NP-complete problems like the traveling salesman problem, and even much harder than PSPACE-complete problems like playing games optimally. There is no practical hope that computers will ever be able to solve EXPSPACE-hard problems perfectly, even if quantum computing becomes a reality.

4. The final result shows that just finding out whether there is a valid solution for a query is actually harder than EXPSPACE-complete: it is unsolvable (undecidable). The question of whether a valid ticket exists can not be solved for all databases and all queries no matter how long a computer thinks. However the full proof of this result is considerably more complex than the EXPSPACE-hard proof without offering any greater understanding of the problem.

The NP-hardness proofs reduce 3-color, the EXPSPACE-completeness proof encodes a Turing machine in ticket restrictions, and the undecidability proof encodes Diophantine equations. Here's a 3x3-bit binary multiplier. (Be sure to read the early slides too, which give necessary details about ticket pricing.)


* As of February 13, Continental flights on March 12, 2008 had those prices. Actual fares may no longer match.

Traffic Lights and Buridan's Ass

Buridan's ass is a donkey in a thought experiment proposed by French philosopher Jean Buridan. Placed exactly halfway between two bales of hay, the purely rational, deterministic donkey starves to death trying to decide which to eat: there is no reason to prefer one bale over the other.

Arbiters, electronic circuits that decide whether a particular voltage is a binary 0 or 1, have the same problem: voltages in the middle can force arbitrarily long decision times. This is known as the arbiter problem or glitch phenomenon.

Leslie Lamport observed that the arbiter problem also explains human indecisiveness at critical moments. His paper, “Buridan's Principle,” attempts to make the phenomenon known outside of computer science. Lamport explains the paper's fate:

I have observed that the arbiter problem occurs in daily life. Perhaps the most common example is when I find myself unable to decide for a fraction of a second whether to stop for a traffic light that just turned yellow or to go through. I suspect that it is actually a cause of serious accidents, and that people do drive into telephone poles because they can't decide in time whether to go to the left or the right.

A little research revealed that psychologists are totally unaware of the phenomenon. I found one paper in the psychology literature on the time taken by subjects to choose between two alternatives based on how nearly equal they were. The author's theoretical calculation yielded a formula with a singularity at zero, as there should be. He compared the experimental data with this theoretical curve, and the fit was perfect. He then drew, as the curve fitting the data, a bounded continuous graph. The singularity at zero was never mentioned in the paper.

I feel that the arbiter problem is important and should be made known to scientists outside the field of computing. So I wrote this paper, which describes the problem in its classical formulation as the problem of Buridan's ass—an ass that starves to death because it is placed equidistant between two bales of hay and has no reason to prefer one to the other. Philosophers have discussed Buridan's ass for centuries, but it apparently never occurred to any of them that the planet is not littered with dead asses only because the probability of the ass being in just the right spot is infinitesimal.

So, I wrote this paper for the general scientific community. I probably could have published it in some computer journal, but that wasn't the point. I submitted it first to Science. The four reviews ranged from “This well-written paper is of major philosophical importance” to “This may be an elaborate joke.” One of the other reviews was more mildly positive, and the fourth said simply “My feeling is that it is rather superficial.” The paper was rejected.

Some time later, I submitted the paper to Nature. I don't like the idea of sending the same paper to different journals hoping that someone will publish it, and I rarely resubmit a rejected paper elsewhere. So, I said in my submission letter that it had been rejected by Science. The editor read the paper and sent me some objections. I answered his objections, which were based on reasonable misunderstandings of the paper. In fact, they made me realize that I should explain things differently for a more general audience. He then replied with further objections of a similar nature. Throughout this exchange, I wasn't sure if he was taking the matter seriously or if he thought I was some sort of crank. So, after answering his next round of objections, I wrote that I would be happy to revise the paper in light of this discussion if he would then send it out for review, but that I didn't want to continue this private correspondence. The next letter I received was from another Nature editor saying that the first editor had been reassigned and that he was taking over my paper. He then raised some objections to the paper that were essentially the same as the ones raised initially by the first editor. At that point, I gave up in disgust.

I still think that this paper is worth publishing for a general scientific audience. Among other things, it has a nice analysis of a quantum-mechanical arbiter. However, I have no idea where to publish it.

My problems in trying to publish this paper are part of a long tradition. According to one story I've heard (but haven't verified), someone at G. E. discovered the phenomenon in computer circuits in the early 60s, but was unable to convince his managers that there was a problem. He published a short note about it, for which he was fired. Charles Molnar, one of the pioneers in the study of the problem, reported the following in a lecture given on February 11, 1992, at HP Corporate Engineering in Palo Alto, California:

One reviewer made a marvelous comment in rejecting one of the early papers, saying that if this problem really existed it would be so important that everybody knowledgeable in the field would have to know about it, and “I'm an expert and I don't know about it, so therefore it must not exist.”

Lamport's publications page is full of interesting papers.

Scooping the Loop Snooper

an elementary proof of the undecidability of the halting problem
by Geoffrey K. Pullum
Mathematics Magazine, October 2000

No program can say what another will do.
Now, I won't just assert that, I'll prove it to you:
I will prove that although you might work til you drop,
you can't predict whether a program will stop.

Imagine we have a procedure called P
that will snoop in the source code of programs to see
there aren't infinite loops that go round and around;
and P prints the word “Fine!” if no looping is found.

You feed in your code, and the input it needs,
and then P takes them both and it studies and reads
and computes whether things will all end as they should
(as opposed to going loopy the way that they could).

Well, the truth is that P cannot possibly be,
because if you wrote it and gave it to me,
I could use it to set up a logical bind
that would shatter your reason and scramble your mind.

Here's the trick I would use – and it's simple to do.
I'd define a procedure – we'll name the thing Q –
that would take a program we will call P (of course!)
to tell if it looped, by reading the source;

And if so, Q would simply print “Loop!” and then stop;
but if no, Q would go right back to the top,
and start off again, looping endlessly back,
til the universe dies and is frozen and black.

And this program called Q wouldn't stay on the shelf;
I would run it, and (fiendishly) feed it itself.
What behaviour results when I do this with Q?
When it reads its own source, just what will it do?

If P warns of loops, Q will print “Loop!” and quit;
yet P is supposed to speak truly of it.
So if Q's going to quit, then P should say, “Fine!” –
which will make Q go back to its very first line!

No matter what P would have done, Q will scoop it:
Q uses P's output to make P look stupid.
If P gets things right then it lies in its tooth;
and if it speaks falsely, it's telling the truth!

I've created a paradox, neat as can be –
and simply by using your putative P.
When you assumed P you stepped into a snare;
Your assumptions have led you right into my lair.

So, how to escape from this logical mess?
I don't have to tell you; I'm sure you can guess.
By reductio, there cannot possibly be
a procedure that acts like the mythical P.

You can never discover mechanical means
for predicting the acts of computing machines.
It's something that cannot be done. So we users
must find our own bugs; our computers are losers!

If you ever took a theory of computation course and had to write undecidability proofs, at least they didn't have to rhyme.