NP-Completeness and Reductions
Imagine you are handed a thousand different puzzles — scheduling exams so no student has two at once,
colouring a map, packing a delivery truck, finding a route past every warehouse. Each looks unrelated,
and for each one your best algorithm slows to a crawl as the input grows. Now imagine someone whispers:
they are all secretly the same puzzle in disguise. Crack one efficiently and you crack them
all; prove one is hopeless and they all are.
That astonishing claim is the heart of NP-completeness. Sitting one rung up from
,
this page is about the tool that ties those puzzles together — the polynomial-time
reduction — and the class of problems it reveals to be the very hardest in
\mathrm{NP}.
A reduction is a translator
A polynomial-time reduction from problem A to problem
B — written A \le_p B — is a fast
(polynomial-time) procedure that rewrites any instance of A into an
instance of B that has the same yes/no answer. The kind we
use here is the many-one (Karp) reduction: one call to a solver for
B, and its answer is the answer for A.
Read the arrow out loud as a promise about efficiency:
A \le_p B \quad\Longrightarrow\quad \text{``if I can solve } B \text{ efficiently, I can solve } A \text{ efficiently.''}
The recipe: given an instance x of A, run the
translator to get f(x), feed f(x) to your
B-solver, and return whatever it says. Because
f runs in polynomial time and preserves the answer, a fast solver for
B becomes a fast solver for A. In this precise
sense B is at least as hard as
A — hardness flows backward along the arrow.
NP-hard and NP-complete
Reductions let us name the hardest problems in \mathrm{NP} without ever
measuring their running time directly — we measure them against each other.
-
A problem B is NP-hard if every problem
A \in \mathrm{NP} reduces to it:
A \le_p B for all such A. It is at least as
hard as everything in NP — but it need not itself be in NP.
-
A problem is NP-complete if it is both in NP and NP-hard. These
are the hardest problems that still live inside NP — the "hardest of the checkable".
-
The punchline: if any single NP-complete problem turns out to be in
\mathrm{P}, then every problem in NP is in P, and so
\mathrm{P} = \mathrm{NP}. One efficient algorithm would topple the whole
row of dominoes.
Picture NP-complete problems as a plateau at the very top of NP, all equivalent to one another under
reduction. Nudge any one of them down into P and the entire plateau collapses down with it.
Cook–Levin: the very first NP-complete problem
The definition of NP-hard has a chicken-and-egg problem: to prove something NP-hard you must reduce
all of NP to it — but there are infinitely many problems in NP. The breakthrough of Stephen
Cook and (independently) Leonid Levin around 1971 was to do this once, for one carefully chosen
problem, and thereby unlock all the rest.
-
Boolean satisfiability (SAT) — deciding whether a Boolean formula can be made true
by some assignment of \text{true}/\text{false} to its variables — is
NP-complete.
-
It is in NP (a satisfying assignment is a certificate you can check instantly), and it is NP-hard:
the computation of any polynomial-time verifier can be encoded as a Boolean formula that is
satisfiable exactly when the verifier would accept.
-
SAT is therefore the first NP-complete problem — the anchor from which every other
NP-completeness result is grown.
The idea (no formal proof here) is that a Turing machine's step-by-step run is a rigid, local, mechanical
process — and anything that local can be written as a giant AND of little logical constraints: "cell
i holds symbol s at time t",
"the head moves left here", "the machine ends in an accepting state". The whole formula is satisfiable
precisely when the machine has an accepting run. Computation is a satisfiability question in
disguise.
Spreading the hardness
Once we have one NP-hard problem, reductions do the rest — because reductions
compose. This is transitivity:
A \le_p B \ \text{ and } \ B \le_p C \quad\Longrightarrow\quad A \le_p C.
So to prove a brand-new problem C is NP-hard, you do not reduce all
of NP to it again. You just take one problem already known to be NP-hard — say SAT — and reduce
it to C. Since everything in NP reduces to SAT, and SAT
reduces to C, everything in NP reaches C by
transitivity. That single arrow carries the hardness of the entire class.
This is exactly how the family tree grew. SAT reduces to 3-SAT (every clause squeezed
to exactly three literals), and 3-SAT — clean and rigid — is the springboard to graph problems like
, finding a large
clique, or a Hamiltonian path. Reveal the arrows below one at a time:
Every arrow is a polynomial-time reduction pointing from an easier-to-reason-about source to a
target we want to prove hard. Follow any arrow forward and you carry hardness with you; the
whole diagram is rooted at SAT because Cook–Levin planted it there.
A reduction up close: 3-SAT ≤ₚ 3-Colouring
Here is the flavour of one real reduction, at the "gadget" level. We want to turn any 3-SAT formula
into a graph that is 3-colourable exactly when the formula is satisfiable. We build the
graph out of little reusable widgets called gadgets:
-
A palette triangle. Three special nodes T, F, B ("true",
"false", "base") joined in a triangle, so in any 3-colouring they must take three different colours —
this fixes what "true" and "false" mean as colours.
-
Variable gadgets. For each variable x, a pair of nodes
x and \lnot x joined to each other and to
B. They are forced to take the true and false colours in
some order — encoding a truth value.
-
Clause gadgets. A small OR-gadget wired to a clause's three literals that can be
3-coloured if and only if at least one of those literals got the "true" colour.
Bolt the gadgets together and the finished graph is 3-colourable precisely when there is a truth
assignment satisfying every clause. Crucially, the graph's size is polynomial in the formula's size and
it is built by a simple mechanical pass — so this genuinely is a polynomial-time reduction
\text{3-SAT} \le_p \text{3-Colouring}, and graph 3-colouring inherits
3-SAT's hardness.
Because a reduction is a strategic move, not a defeat. If you can't find a fast algorithm for
your problem, proving it NP-complete tells you nobody else will either (unless
\mathrm{P} = \mathrm{NP}) — so you can stop hunting for a perfect algorithm
and switch to approximations, heuristics, or special cases with a clear conscience. Reductions also
run the other way in practice: modern SAT solvers are so good that engineers reduce their
scheduling, verification and planning problems to SAT and let a battle-tested solver do the
work. The theory that says "these are all the same problem" becomes an engineering shortcut.
Reductions have a direction, and it is easy to get backwards. To prove your problem
B is NP-hard, you must reduce a known NP-hard
problem (like SAT) to B — that is
\text{SAT} \le_p B, source SAT, target B.
Doing it the other way, B \le_p \text{SAT}, proves the opposite:
it shows B is no harder than SAT (in fact, since SAT is in
NP, it just shows B \in \mathrm{NP}). A handy mantra: to show something is
hard, reduce hardness into it — point the arrow at the problem you're
accusing, from a villain you already know is guilty.