Decidability and the Halting Problem

Here is a piece of software everyone would pay for: a perfect bug-checker. You feed it any program together with any input, and it tells you — reliably, in finite time — whether that program will eventually finish or spin forever in an infinite loop. No more hung servers, no more frozen apps. Surely, with enough cleverness, someone could write it?

No. And this isn't a "we haven't managed it yet" — it is a theorem. In 1936 Alan Turing proved that no algorithm can decide, for every program-and-input pair, whether it halts. The universal bug-checker is not merely hard to build; it is mathematically impossible. This page is about that boundary — the line between problems a machine can settle and problems no machine ever can — and about the beautiful self-referential trick that proves the line is real.

Decision problems and deciders

A decision problem is a yes/no question posed about an input: "Is n prime?", "Does this graph have a cycle?", "Does this program halt on this input?". An algorithm — think of a Turing machine, or any program in your favourite language — is a decider for that problem if, on every input, it eventually stops and outputs the correct yes or no.

The two words are everything:

A problem is decidable when some algorithm decides it — always halts, always correct. "Is n prime?" is decidable: trial division always finishes.
A problem is undecidable when no such algorithm exists — not now, not ever, no matter how clever the programmer or how fast the machine.

Undecidable is a statement about the problem, not about us. It says the set of correct yes/no answers is so tangled that no finite mechanical procedure can reproduce it.

The Halting Problem

Consider the decision problem HALT: given the source code of a program p and an input x, does p halt when run on x?
There is no algorithm \text{halts}(p, x) that decides HALT for every pair (p, x).
That is, HALT is undecidable.

Notice what is not being claimed. For particular programs you can often see the answer at a glance — a program with no loops obviously halts; a bare while (true) {} obviously does not. The theorem is about a single algorithm that must handle all programs at once. That one universal decider is what cannot exist.

See the diagonal move

Line the programs up as rows and their possible inputs as columns; cell (i, j) records whether program i halts (H) or loops (L) on input j. Turing's D is read straight off the diagonal and then flipped — so its row of behaviour differs from every program in the table, including itself. Step through it:

Turing's diagonal argument, step by step

The proof is a proof by contradiction driven by self-reference. We assume the impossible tool exists and then build a program that jams it.

Assume a decider exists. Suppose there is a program \text{halts}(p, x) that always halts and returns true exactly when program p halts on input x, and false otherwise.
Build a troublemaker. Using it, write a new program D that takes one input: the source code of a program p. Inside, D asks \text{halts}(p, p) — "does p halt when fed its own code?" — and then does the opposite: if the answer is "halts", D loops forever; if the answer is "loops", D halts.
Feed D its own code. Now run D on the source of D itself. Ask the fatal question: does D(D) halt?
- If D(D) halts, then by its own construction \text{halts}(D, D) returned "loops" — so D(D) should not halt. Contradiction.
- If D(D) loops, then \text{halts}(D, D) returned "halts" — so D(D) should have halted. Contradiction.
Collapse. Either way we reach a contradiction. The only assumption we made was that \text{halts} exists — so it cannot. \blacksquare

The self-reference — a program reasoning about a program reasoning about itself — is exactly Cantor's diagonalisation that shows the real numbers are uncountable. There you list candidate reals in a table and build a new one by flipping the diagonal digit so it differs from every listed number. Here you list programs in a table and build D by flipping the diagonal entry so its behaviour differs from every listed program. Same devastating trick, wearing two hats.

The contradiction in code

Here is the troublemaker as illustrative — and deliberately un-runnable — TypeScript. It cannot actually run, precisely because the \text{halts} it leans on cannot exist:

// SUPPOSE this perfect decider existed (it does not): declare function halts(program: string, input: string): boolean; // The troublemaker. It receives a program's source code as its input. function D(p: string): void { if (halts(p, p)) { while (true) {} // if p halts on its own code, D loops forever } else { return; // if p loops on its own code, D halts } } // Now feed D its OWN source code, held in dSource: D(dSource); // halts(dSource, dSource) === true => D runs while(true) => D does NOT halt // halts(dSource, dSource) === false => D returns => D DOES halt // In both cases halts() gave the wrong answer. So halts() cannot exist. QED

The whole proof lives in that one if: whatever halts predicts about D(D), the program is written to do the opposite. A perfect predictor of behaviour cannot survive a program built to spite it.

Decidable, semi-decidable, undecidable

Between "solvable" and "hopeless" sits a crucial middle rung. A problem is semi-decidable (equivalently recursively enumerable, r.e.) when there is an algorithm that halts and says "yes" on every yes-instance, but on a no-instance is allowed to run forever. You get a guaranteed yes; you never get a guaranteed no.

The halting problem itself is semi-decidable: to check whether p halts on x, just run it. If it halts, you find out — "yes". But if it is going to loop forever, your simulation loops forever too, and you never get to say "no". Semi-decidable is strictly weaker than decidable. A problem is decidable exactly when both it and its complement are semi-decidable.

Decidable: an algorithm always halts and outputs the correct yes/no.
Semi-decidable (r.e.): an algorithm halts with "yes" on yes-instances but may run forever on no-instances.
Undecidable: no algorithm decides it. (It may still be semi-decidable — as the halting problem is.)

Back to logic: the Entscheidungsproblem

Turing did not build his machine to check software — he built it to answer a question in logic. In 1928 Hilbert and Ackermann posed the Entscheidungsproblem ("decision problem"): is there an algorithm that, given any sentence of first-order logic, decides whether it is logically valid (true in every model)?

In 1936 Church and Turing answered: no — first-order validity is undecidable. There is no mechanical procedure that always terminates with a correct valid/not-valid verdict. This is where the halting problem and pure logic meet: Turing reduced HALT to validity, so a validity-decider would give a halting-decider, which we just proved cannot exist.

But — and this is the elegant consolation — first-order validity is semi-decidable. By Gödel's completeness theorem, a formula is valid if and only if it has a formal proof, and proofs are finite objects you can enumerate systematically. So a proof search that grinds through every candidate derivation will find a proof of any valid formula and halt saying "valid". Fed an invalid formula, though, it may search forever, never certain none exists. Validity is recognisable but not decidable — exactly the semi-decidable middle rung.

The single most common misreading: hearing "undecidable" as "really difficult" or "nobody has found the algorithm yet". Both are wrong. Undecidable is a proven impossibility — a theorem that no algorithm can ever exist, however much compute, cleverness, or time you throw at it. It is not a challenge awaiting a genius; it is a closed door with a proof nailed to it.

And do not blur semi-decidable with decidable. Being able to confirm every "yes" is not the same as being able to settle every case. The halting problem and first-order validity are both semi-decidable and both undecidable at the same time — you can recognise the yes-instances, yet no procedure resolves every instance with a guaranteed answer.

It doesn't — not perfectly. Real tools sidestep the theorem by being conservative: they answer only "definitely halts", "definitely loops", or a humble "I'm not sure". A linter that flags some infinite loops while missing others, or occasionally cries wolf on safe code, breaks no law — it simply never claims to be the perfect universal decider. Type checkers, termination analysers, and model checkers all live in this pragmatic world: correct when they commit, silent when they can't. The impossibility is only of a tool that is always right on every program. Approximate, one-sided, or "usually right" tools are alive and well — and are why your IDE can still be useful.