The HHL Algorithm
Imagine a linear system so vast — a
system of
equations A\vec{x} = \vec{b} with a trillion
unknowns — that no classical computer could ever write down the answer vector
\vec{x}, let alone read out its trillion entries. Now suppose you don't
actually want all trillion numbers. You want one number: the total load on a
single power line, the average of a stress field, one weighted sum over the solution. The
Harrow–Hassidim–Lloyd (HHL) algorithm is built for exactly this. It produces the
solution as a quantum state |x\rangle in time that grows like
\log N instead of N — an
exponential saving in the dimension. But that "as a quantum state" is doing an
enormous amount of work, and the whole lesson is really about the fine print.
The plan: invert A one eigenvalue at a time
Assume A is Hermitian (if it isn't, a standard trick embeds it in a bigger
Hermitian matrix). Then A has real
eigenvalues
\lambda_j with orthonormal eigenvectors
|u_j\rangle. Encode the right-hand side \vec{b} as
a quantum state |b\rangle (its amplitudes are the entries of
\vec{b}, normalised) and expand it in that eigenbasis:
|b\rangle = \sum_j \beta_j\,|u_j\rangle.
The exact solution is \vec{x} = A^{-1}\vec{b}. Because
A^{-1}|u_j\rangle = \tfrac{1}{\lambda_j}|u_j\rangle, inverting
A just means dividing each eigen-component by its eigenvalue:
|x\rangle \;\propto\; A^{-1}|b\rangle \;=\; \sum_j \frac{\beta_j}{\lambda_j}\,|u_j\rangle.
So the entire job of HHL is to turn each coefficient \beta_j into
\beta_j/\lambda_j. The difficulty is that a quantum computer doesn't
see the eigenvalues — it has to measure them, in superposition, without
collapsing the state. That is precisely what phase estimation is for.
Three moves: estimate, rotate, uncompute
HHL threads the eigen-decomposition through three quantum sub-steps:
-
Phase estimation. Run
quantum
phase estimation on the unitary e^{iAt}. Since
e^{iAt}|u_j\rangle = e^{i\lambda_j t}|u_j\rangle, the eigenvector
|u_j\rangle carries the eigenvalue \lambda_j in
its phase, and QPE writes an estimate of \lambda_j into an auxiliary
register: \sum_j \beta_j\,|u_j\rangle|\lambda_j\rangle.
-
Controlled rotation. Conditioned on the value
\lambda_j now sitting in that register, rotate an extra ancilla qubit so
its amplitude becomes proportional to 1/\lambda_j. Measuring the ancilla
and post-selecting on the "success" outcome multiplies each component by exactly
C/\lambda_j — the 1/\lambda we wanted.
-
Inverse phase estimation. Run QPE backwards to erase the
|\lambda_j\rangle register, disentangling it so it doesn't spoil the
answer. What's left on the main register is
|x\rangle \propto \sum_j (\beta_j/\lambda_j)|u_j\rangle = A^{-1}|b\rangle.
The output is the solution — but living inside the amplitudes of a quantum state, not printed on a
tape.
The pipeline, end to end
Step through the whole assembly line. Each block is a reversible quantum operation; only the very last
stage — pulling a number back out — is a measurement.
Worked example 1 — the eigen-decomposition step
Let's make the abstract sum concrete on paper first. Suppose A has just two
eigenpairs, (\lambda_1, |u_1\rangle) and
(\lambda_2, |u_2\rangle), and the right-hand side decomposes as
|b\rangle = \beta_1\,|u_1\rangle + \beta_2\,|u_2\rangle.
HHL leaves the eigenvectors untouched — they are the natural axes of
A — and only reweights the coefficients. Phase estimation exposes
\lambda_1 and \lambda_2; the controlled rotation
divides by them; the inverse QPE tidies up. The state that emerges is
|x\rangle \;\propto\; \frac{\beta_1}{\lambda_1}\,|u_1\rangle + \frac{\beta_2}{\lambda_2}\,|u_2\rangle.
Read that off: a component along an eigenvector with a small eigenvalue gets
amplified (dividing by a small number), while a component with a large eigenvalue gets
suppressed. That is exactly what A^{-1} does — and it is
also the seed of the trouble with tiny eigenvalues we'll meet in the caveats.
Worked example 2 — a 2×2 diagonal A, with numbers
Take the cleanest possible case, a diagonal matrix whose eigenvectors are just the computational basis
states |0\rangle and |1\rangle:
A = \begin{bmatrix} 2 & 0 \\ 0 & 4 \end{bmatrix}, \qquad \vec{b} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \;\Rightarrow\; |b\rangle = \tfrac{1}{\sqrt2}\big(|0\rangle + |1\rangle\big).
The eigenvalues are \lambda_0 = 2 (eigenvector
|0\rangle) and \lambda_1 = 4 (eigenvector
|1\rangle), read straight off the diagonal. The
1/\lambda rotation multiplies the |0\rangle
amplitude by \tfrac12 and the |1\rangle amplitude
by \tfrac14:
A^{-1}|b\rangle = \tfrac{1}{\sqrt2}\Big(\tfrac{1}{2}|0\rangle + \tfrac{1}{4}|1\rangle\Big).
Dropping the overall constant and renormalising, the output state is
|x\rangle = \frac{\tfrac12|0\rangle + \tfrac14|1\rangle}{\sqrt{(\tfrac12)^2 + (\tfrac14)^2}} = \frac{2|0\rangle + |1\rangle}{\sqrt5}.
Check it classically: \vec{x} = A^{-1}\vec{b} = (\tfrac12, \tfrac14), whose
direction is indeed (2, 1) after clearing fractions — the same state HHL
prepares. Notice the honest limitation already: HHL hands you the normalised direction
(2,1)/\sqrt5, not the raw entries (\tfrac12, \tfrac14).
The overall scale lives in that discarded normalisation constant.
-
Problem. Given a sparse Hermitian A and a state
|b\rangle, produce the solution of A\vec{x} = \vec{b}
as a quantum state |x\rangle \propto A^{-1}|b\rangle.
-
Method. Write |b\rangle = \sum_j \beta_j|u_j\rangle; use
phase estimation on e^{iAt} to expose each
\lambda_j; apply a controlled rotation that scales each
component by 1/\lambda_j; then uncompute the eigenvalue
register with inverse phase estimation.
-
Runtime. Roughly
O\!\big(\log(N)\,s^2\,\kappa^2/\varepsilon\big) — exponentially better in
the dimension N than classical O(N), where
s is the sparsity, \kappa the condition number,
and \varepsilon the accuracy.
-
Catch. The output is a state, extractable only as summary statistics
\langle x|M|x\rangle; the speedup is conditional on cheap
|b\rangle preparation, sparse and well-conditioned
A, and never needing the individual entries of
\vec{x}.
Where the exponential speedup lives — and where it hides
Solving a dense N-variable system classically costs on the order of
N^3; even the best sparse iterative solvers still touch all
N entries, costing O(N s \kappa). HHL's
\log N dependence looks like an exponential leap. But the
runtime O(\log(N)\,s^2\,\kappa^2/\varepsilon) also carries a
\kappa^2 and a 1/\varepsilon that are easy to
gloss over. If the condition number \kappa grows with the problem size, or
you demand high precision, those factors can quietly swallow the \log N win.
The exponential advantage is real, but it is an advantage in one variable
(N) bought under a stack of assumptions about all the others.
This is the single most over-hyped point in all of quantum algorithms, so state it plainly. HHL does
not give you \vec{x}. It gives you
|x\rangle, a quantum state whose amplitudes encode
\vec{x}. To learn one amplitude you'd have to measure, and a single
measurement returns one random basis outcome, not a number — reconstructing all
N entries would take \sim N repetitions and
throw the entire exponential speedup away. What you can cheaply extract is a
summary statistic \langle x|M|x\rangle: an expectation, a norm, an overlap
with some fixed vector. HHL is only a win when the question is one of those numbers.
And it only works at all when three further conditions hold: (i) you can prepare
|b\rangle efficiently — if loading \vec{b} already
costs O(N), you've lost before you start; (ii) A
is sparse and you can simulate e^{iAt}; and (iii)
A is well-conditioned — a small
\kappa = \lambda_{\max}/\lambda_{\min}. A tiny eigenvalue means dividing by a
tiny number, which both blows up the runtime (the \kappa^2) and makes the
controlled rotation succeed only rarely. Miss any of these and the algorithm degrades from magic to
useless.
In 2018, then-undergraduate Ewin Tang did something the field half-expected to be
impossible: she took a celebrated quantum recommendation-system algorithm — a close cousin of HHL —
and produced a classical algorithm only polynomially slower, demolishing its claimed
exponential advantage. The trick, now called "dequantization," is that quantum
algorithms like HHL assume you can cheaply sample from and query the input; if a classical
algorithm is granted the same sampling access (so-called ℓ²-norm sampling), it can
often mimic the quantum output using randomised linear algebra.
A wave of dequantization results followed, matching several HHL-type speedups classically under the
same assumptions. The lesson isn't that HHL is worthless — for genuinely sparse, well-conditioned,
high-rank systems the quantum advantage may survive — but that a fair comparison must give the
classical algorithm the same input access the quantum one enjoys. Many advertised "exponential
speedups" shrink to polynomial once you do.
Given all these caveats, why is HHL one of the most cited quantum algorithms ever? Because it is rarely
used alone. It is a subroutine: a fast way to apply
A^{-1} to a state in the middle of a larger quantum computation whose final
answer is naturally a single number. That description fits a great deal of
quantum
machine learning — least-squares fitting, support vector machines, Gaussian processes,
recommendation systems — where you fit a model to data (a linear solve) and then only ever query it
for a prediction (an inner product), never dumping out every parameter. Whenever the pipeline is "solve
a huge linear system, then ask it one question," HHL is the piece that makes the solve cheap. That is
the shape of problem it was built for — and the shape where its exponential promise has the best chance
of surviving contact with reality.