NP-Completeness and Reductions

Imagine you are handed a thousand different puzzles — scheduling exams so no student has two at once, colouring a map, packing a delivery truck, finding a route past every warehouse. Each looks unrelated, and for each one your best algorithm slows to a crawl as the input grows. Now imagine someone whispers: they are all secretly the same puzzle in disguise. Crack one efficiently and you crack them all; prove one is hopeless and they all are.

That astonishing claim is the heart of NP-completeness. Sitting one rung up from , this page is about the tool that ties those puzzles together — the polynomial-time reduction — and the class of problems it reveals to be the very hardest in \mathrm{NP}.

A reduction is a translator

A polynomial-time reduction from problem A to problem B — written A \le_p B — is a fast (polynomial-time) procedure that rewrites any instance of A into an instance of B that has the same yes/no answer. The kind we use here is the many-one (Karp) reduction: one call to a solver for B, and its answer is the answer for A.

Read the arrow out loud as a promise about efficiency:

A \le_p B \quad\Longrightarrow\quad \text{``if I can solve } B \text{ efficiently, I can solve } A \text{ efficiently.''}

The recipe: given an instance x of A, run the translator to get f(x), feed f(x) to your B-solver, and return whatever it says. Because f runs in polynomial time and preserves the answer, a fast solver for B becomes a fast solver for A. In this precise sense B is at least as hard as A — hardness flows backward along the arrow.

NP-hard and NP-complete

Reductions let us name the hardest problems in \mathrm{NP} without ever measuring their running time directly — we measure them against each other.

A problem B is NP-hard if every problem A \in \mathrm{NP} reduces to it: A \le_p B for all such A. It is at least as hard as everything in NP — but it need not itself be in NP.
A problem is NP-complete if it is both in NP and NP-hard. These are the hardest problems that still live inside NP — the "hardest of the checkable".
The punchline: if any single NP-complete problem turns out to be in \mathrm{P}, then every problem in NP is in P, and so \mathrm{P} = \mathrm{NP}. One efficient algorithm would topple the whole row of dominoes.

Picture NP-complete problems as a plateau at the very top of NP, all equivalent to one another under reduction. Nudge any one of them down into P and the entire plateau collapses down with it.

Cook–Levin: the very first NP-complete problem

The definition of NP-hard has a chicken-and-egg problem: to prove something NP-hard you must reduce all of NP to it — but there are infinitely many problems in NP. The breakthrough of Stephen Cook and (independently) Leonid Levin around 1971 was to do this once, for one carefully chosen problem, and thereby unlock all the rest.

Boolean satisfiability (SAT) — deciding whether a Boolean formula can be made true by some assignment of \text{true}/\text{false} to its variables — is NP-complete.
It is in NP (a satisfying assignment is a certificate you can check instantly), and it is NP-hard: the computation of any polynomial-time verifier can be encoded as a Boolean formula that is satisfiable exactly when the verifier would accept.
SAT is therefore the first NP-complete problem — the anchor from which every other NP-completeness result is grown.

The idea (no formal proof here) is that a Turing machine's step-by-step run is a rigid, local, mechanical process — and anything that local can be written as a giant AND of little logical constraints: "cell i holds symbol s at time t", "the head moves left here", "the machine ends in an accepting state". The whole formula is satisfiable precisely when the machine has an accepting run. Computation is a satisfiability question in disguise.

Spreading the hardness

Once we have one NP-hard problem, reductions do the rest — because reductions compose. This is transitivity:

A \le_p B \ \text{ and } \ B \le_p C \quad\Longrightarrow\quad A \le_p C.

So to prove a brand-new problem C is NP-hard, you do not reduce all of NP to it again. You just take one problem already known to be NP-hard — say SAT — and reduce it to C. Since everything in NP reduces to SAT, and SAT reduces to C, everything in NP reaches C by transitivity. That single arrow carries the hardness of the entire class.

This is exactly how the family tree grew. SAT reduces to 3-SAT (every clause squeezed to exactly three literals), and 3-SAT — clean and rigid — is the springboard to graph problems like , finding a large clique, or a Hamiltonian path. Reveal the arrows below one at a time:

Every arrow is a polynomial-time reduction pointing from an easier-to-reason-about source to a target we want to prove hard. Follow any arrow forward and you carry hardness with you; the whole diagram is rooted at SAT because Cook–Levin planted it there.

A reduction up close: 3-SAT ≤ₚ 3-Colouring

Here is the flavour of one real reduction, at the "gadget" level. We want to turn any 3-SAT formula into a graph that is 3-colourable exactly when the formula is satisfiable. We build the graph out of little reusable widgets called gadgets:

A palette triangle. Three special nodes T, F, B ("true", "false", "base") joined in a triangle, so in any 3-colouring they must take three different colours — this fixes what "true" and "false" mean as colours.
Variable gadgets. For each variable x, a pair of nodes x and \lnot x joined to each other and to B. They are forced to take the true and false colours in some order — encoding a truth value.
Clause gadgets. A small OR-gadget wired to a clause's three literals that can be 3-coloured if and only if at least one of those literals got the "true" colour.

Bolt the gadgets together and the finished graph is 3-colourable precisely when there is a truth assignment satisfying every clause. Crucially, the graph's size is polynomial in the formula's size and it is built by a simple mechanical pass — so this genuinely is a polynomial-time reduction \text{3-SAT} \le_p \text{3-Colouring}, and graph 3-colouring inherits 3-SAT's hardness.

Because a reduction is a strategic move, not a defeat. If you can't find a fast algorithm for your problem, proving it NP-complete tells you nobody else will either (unless \mathrm{P} = \mathrm{NP}) — so you can stop hunting for a perfect algorithm and switch to approximations, heuristics, or special cases with a clear conscience. Reductions also run the other way in practice: modern SAT solvers are so good that engineers reduce their scheduling, verification and planning problems to SAT and let a battle-tested solver do the work. The theory that says "these are all the same problem" becomes an engineering shortcut.

Reductions have a direction, and it is easy to get backwards. To prove your problem B is NP-hard, you must reduce a known NP-hard problem (like SAT) to B — that is \text{SAT} \le_p B, source SAT, target B.

Doing it the other way, B \le_p \text{SAT}, proves the opposite: it shows B is no harder than SAT (in fact, since SAT is in NP, it just shows B \in \mathrm{NP}). A handy mantra: to show something is hard, reduce hardness into it — point the arrow at the problem you're accusing, from a villain you already know is guilty.