Time-Independent Perturbation Theory
Here is an awkward secret of quantum mechanics: almost no real Hamiltonian can be solved exactly. The
Schrödinger equation
yields clean answers for a handful of idealised systems — a particle in a box, a harmonic spring, a
lone hydrogen atom — and then reality intrudes. Put that hydrogen atom in a weak electric field, add a
magnetic field, let the spring stiffen slightly at large stretch, and the tidy eigenvalue equation
stops having a closed-form solution. Nature is almost never one of the solvable problems.
But it is very often a solvable problem plus a small nudge. The atom in the field is just the
free atom with a little extra energy; the stiffening spring is the perfect oscillator with a small
correction. Perturbation theory is the art of trading a problem you cannot solve for
one you can, plus a power series of corrections you can compute by hand. It turns "unsolvable" into
"solvable to as many decimal places as you have patience for" — and it is the single most-used
approximation method in all of quantum physics.
The setup: a solvable Hamiltonian plus a small term
Split the Hamiltonian into a part you have already diagonalised and a small extra piece:
\hat{H} = \hat{H}_0 + \lambda \hat{H}',
where \hat{H}_0 is the unperturbed Hamiltonian whose
problem is completely solved,
\hat{H}_0\,|n^{(0)}\rangle = E_n^{(0)}\,|n^{(0)}\rangle,
and \lambda\hat{H}' is the small perturbation. The
bookkeeping parameter \lambda (between 0 and 1) tracks the "size" of the
nudge: at \lambda = 0 we have the solved problem back, and at
\lambda = 1 we have the full one. The whole method rests on one hopeful but
usually reasonable assumption: that the true energies and states are smooth power series
in \lambda, so we can expand
E_n = E_n^{(0)} + \lambda E_n^{(1)} + \lambda^2 E_n^{(2)} + \cdots,
\qquad
|n\rangle = |n^{(0)}\rangle + \lambda\,|n^{(1)}\rangle + \lambda^2\,|n^{(2)}\rangle + \cdots.
The superscripts count the order: E_n^{(1)} is the first-order
energy correction, E_n^{(2)} the second-order, and so on. Substitute these
series into \hat{H}|n\rangle = E_n|n\rangle and collect terms with equal
powers of \lambda — each power gives one equation, and out drop the
corrections one order at a time. The result is a small toolkit of formulas, which we now assemble. This
is the Rayleigh–Schrödinger scheme, and for now we assume the unperturbed levels are
non-degenerate (each energy has one state); the degenerate case gets a warning of its
own below.
First order: the energy correction is just an average
The very first correction is the one you will use most, and it is astonishingly cheap to compute.
Collecting the terms linear in \lambda and projecting onto
\langle n^{(0)}| gives the headline result of the entire subject:
E_n^{(1)} = \langle n^{(0)}|\,\hat{H}'\,|n^{(0)}\rangle.
Read that in words: the first-order shift of a level is simply the expectation value of the
perturbation in the unperturbed state. You do not have to solve any new eigenvalue problem, or
find any new wavefunction — you already have |n^{(0)}\rangle, so you just
sandwich \hat{H}' between it and its conjugate and do an integral. For the
cost of one expectation value you get the leading correction to the energy. This is why perturbation
theory is a workhorse: the hardest-looking corrections often reduce to an average you can do on the
back of an envelope.
The state also shifts. Projecting the same first-order equation onto a different unperturbed
state \langle m^{(0)}| (with m \neq n) and solving
gives the first-order correction to the ket as a weighted sum over all the other states:
|n^{(1)}\rangle = \sum_{m \neq n}
\frac{\langle m^{(0)}|\,\hat{H}'\,|n^{(0)}\rangle}{E_n^{(0)} - E_m^{(0)}}\;|m^{(0)}\rangle.
Each other state |m^{(0)}\rangle gets mixed in by an amount set by two
things: how strongly the perturbation connects it to our state (the matrix element
\langle m^{(0)}|\hat{H}'|n^{(0)}\rangle in the numerator), and how
close it is in energy (the gap E_n^{(0)} - E_m^{(0)} in the
denominator). Nearby, strongly-coupled states mix in a lot; distant, weakly-coupled ones barely
register. Notice the sum pointedly excludes m = n — the
term that would divide by zero — and that exclusion is the source of a classic mistake we flag later.
Second order: states repel
First order is often enough, but sometimes it vanishes (if the perturbation has no diagonal part,
E_n^{(1)} = 0) and you must go further. Continuing to the terms in
\lambda^2 gives the second-order energy correction:
E_n^{(2)} = \sum_{m \neq n}
\frac{\bigl|\langle m^{(0)}|\,\hat{H}'\,|n^{(0)}\rangle\bigr|^2}{E_n^{(0)} - E_m^{(0)}}.
Every numerator here is a squared magnitude, so it is never negative. That single
fact has a beautiful consequence for the ground state. For the lowest level, every
other state lies above it, so every denominator
E_0^{(0)} - E_m^{(0)} is negative — which makes every term negative, and
therefore E_0^{(2)} \leq 0 always. The second-order shift of the
ground state is guaranteed to push it down.
There is a memorable way to say this: levels repel. When two states are coupled by a
perturbation, second order pushes the lower one down and the upper one up, as if they were shoving each
other apart. The closer they start (smaller gap) and the more strongly they couple (bigger matrix
element), the harder they shove. You will see exactly this repulsion in the interactive figure below,
where the true lower level bends away from a coupled partner.
-
First-order energy:
E_n^{(1)} = \langle n^{(0)}|\hat{H}'|n^{(0)}\rangle — the expectation
value of the perturbation in the unperturbed state.
-
First-order state:
|n^{(1)}\rangle = \sum_{m \neq n}
\frac{\langle m^{(0)}|\hat{H}'|n^{(0)}\rangle}{E_n^{(0)} - E_m^{(0)}}\,|m^{(0)}\rangle.
-
Second-order energy:
E_n^{(2)} = \sum_{m \neq n}
\frac{\bigl|\langle m^{(0)}|\hat{H}'|n^{(0)}\rangle\bigr|^2}{E_n^{(0)} - E_m^{(0)}},
which is negative for the ground state (all denominators negative).
When is a perturbation actually "small"?
Every formula above has an energy gap in a denominator, and that is where the method lives or dies. For
the series to converge — for each correction to be smaller than the last — the perturbation must be
weak compared with the spacing of the levels:
\bigl|\langle m^{(0)}|\hat{H}'|n^{(0)}\rangle\bigr| \;\ll\;
\bigl|E_n^{(0)} - E_m^{(0)}\bigr|
\qquad\text{for every } m \neq n.
"Small" is not an absolute statement about the perturbation — it is a comparison. A perturbation of a
given strength can be perfectly negligible for well-separated levels and catastrophically large for
levels that nearly touch. What matters is the dimensionless ratio of coupling to gap. When it is tiny,
two or three orders of perturbation theory give you results good to many decimal places; when it
approaches 1, the series limps or diverges and you need a different tool.
The most dramatic failure is degeneracy. If two unperturbed states share the same
energy, E_n^{(0)} = E_m^{(0)}, the gap is zero and the formulas
divide by zero — a nonsense infinity. The cure (developed on its own page) is
degenerate perturbation theory: within the degenerate subspace you must first
diagonalise \hat{H}' itself, choosing the "good" combinations of
states that the perturbation does not mix, and only then apply the non-degenerate formulas to what
remains. Keep that signpost in mind; the machinery on this page assumes every level stands
alone.
Watch the approximation hug — then peel away
There is one problem simple enough to solve both exactly and by perturbation theory, so you
can lay them side by side: a two-level system. Take
\hat{H} = \begin{pmatrix} E_1 & \lambda V \\ \lambda V & E_2 \end{pmatrix}
= \underbrace{\begin{pmatrix} E_1 & 0 \\ 0 & E_2 \end{pmatrix}}_{\hat{H}_0}
+ \lambda\underbrace{\begin{pmatrix} 0 & V \\ V & 0 \end{pmatrix}}_{\hat{H}'}.
The unperturbed levels are just E_1 and E_2, and
the perturbation is a pure off-diagonal coupling V. Because
\hat{H}' has no diagonal part, the first-order shift vanishes,
E_1^{(1)} = \langle 1|\hat{H}'|1\rangle = 0 — so the perturbative estimate
of the lower level is flat to first order. The interesting physics is all at second order:
E_1^{(2)} = \frac{|\langle 2|\hat{H}'|1\rangle|^2}{E_1 - E_2}
= \frac{V^2}{E_1 - E_2}
\;\;\Rightarrow\;\;
E_1 \approx E_1^{(0)} - \frac{(\lambda V)^2}{E_2 - E_1}.
Meanwhile this 2\times 2 can be diagonalised exactly; the lower eigenvalue
is
E_- = \frac{E_1 + E_2}{2}
- \sqrt{\left(\frac{E_1 - E_2}{2}\right)^{2} + (\lambda V)^2}.
Now compare the three curves below as you turn up the coupling \lambda along
the horizontal axis. The flat line is the first-order estimate (no shift). The downward parabola is the
second-order estimate. And the smooth curve is the exact truth. For small
\lambda the parabola hugs the exact curve beautifully —
that is perturbation theory working. As \lambda grows, the exact curve
levels off (a square root flattens) while the parabola keeps diving, and the two
peel apart. That divergence is the series breaking down, and you can trigger it sooner
by shrinking the gap \Delta or raising the coupling
V with the sliders — exactly the validity condition made visible.
This one picture is the whole method in miniature: a cheap approximation that is excellent while the
perturbation is small and honest about failing when it is not.
Where this shows up: Stark, Zeeman, and a stubborn spring
Perturbation theory is not a toy — it is how much of atomic physics is actually computed. Three
canonical examples:
-
The Stark effect. Place an atom in a weak, uniform electric field
\mathcal{E}. The extra energy is
\hat{H}' = e\mathcal{E}\,\hat{z}, and perturbation theory predicts how the
atomic levels shift and split with the field. For hydrogen's ground state the first-order shift
vanishes by symmetry and the leading effect is second-order, quadratic in
\mathcal{E} — the field polarises the atom.
-
The Zeeman effect. Place the atom in a weak magnetic field
\mathbf{B} instead. Now
\hat{H}' \propto \mathbf{B}\cdot(\hat{\mathbf{L}} + 2\hat{\mathbf{S}}), and
the first-order shift \langle n|\hat{H}'|n\rangle splits each spectral
line into a fan of components spaced in proportion to B — the splitting
seen in every magnetic-resonance spectrum.
-
The anharmonic oscillator. A real spring is not perfectly harmonic; at larger
amplitude it stiffens. Model the extra stiffness as
\hat{H}' = \lambda \hat{x}^4 added to the harmonic oscillator. First order
gives the leading energy shift of each level as
\lambda\langle n|\hat{x}^4|n\rangle, a matrix element you can compute with
ladder operators — the perturbative correction to a vibrating molecule's spectrum.
In each case the recipe is identical: identify the solvable \hat{H}_0, write
the nudge as \hat{H}', and turn the crank. The physics — polarisation,
line-splitting, anharmonicity — falls out as an expectation value.
Worked examples
Example 1 — a constant push in a box. Take the infinite square well and add a small
constant potential \hat{H}' = V_0 everywhere inside (a uniform energy offset).
The first-order shift of any level is
E_n^{(1)} = \langle n^{(0)}|V_0|n^{(0)}\rangle
= V_0\,\langle n^{(0)}|n^{(0)}\rangle = V_0,
using normalisation \langle n^{(0)}|n^{(0)}\rangle = 1. Every level simply
rises by V_0 — which is exactly right, since adding a constant to the
Hamiltonian shifts all energies by that constant and changes no wavefunction. Perturbation theory
reproduces the obvious answer, a good sanity check.
Example 2 — second order in the two-level system. For the coupled pair
\hat{H}_0 = \mathrm{diag}(E_1, E_2) with off-diagonal coupling
V, the only "other" state for level 1 is level 2, so the second-order sum has
a single term:
E_1^{(2)} = \frac{|\langle 2|\hat{H}'|1\rangle|^2}{E_1 - E_2}
= \frac{V^2}{E_1 - E_2} = -\,\frac{V^2}{E_2 - E_1}.
With E_2 > E_1 this is negative — the lower level is pushed down,
the repulsion promised above. Expand the exact
E_- for small V using
\sqrt{a^2 + \epsilon} \approx a + \epsilon/2a and you recover precisely this
term — perturbation theory is the Taylor expansion of the exact answer.
Example 3 — reading a validity ratio. Suppose two levels sit at
E_1 = 2 and E_2 = 10 (gap
\Delta = 8) and are coupled by a matrix element of size
|V| = 1. The smallness ratio is
|V|/\Delta = 1/8 = 0.125 — comfortably below 1, so two orders of perturbation
theory will be excellent. Now push the upper level down to
E_2 = 3: the gap collapses to 1, the ratio becomes
1, and the series is in serious trouble even though the coupling
V never changed. Smallness is always relative to the gap.
Suppose two unperturbed states |a\rangle and
|b\rangle share the same energy,
E_a^{(0)} = E_b^{(0)}. Try the state-correction formula and you hit a term
with denominator E_a^{(0)} - E_b^{(0)} = 0 — division by zero, an infinite
"correction." The formula has not just become inaccurate; it has become meaningless.
The deeper reason is that when levels are degenerate, the unperturbed problem doesn't tell you
which basis to use — any combination of |a\rangle and
|b\rangle is an equally good eigenstate of
\hat{H}_0. The perturbation breaks the tie. Degenerate perturbation
theory handles this by first building the little matrix of
\hat{H}' within the degenerate subspace and diagonalising it; its
eigenvectors are the "good" states the perturbation actually picks out, and its eigenvalues are the
first-order shifts. Only after choosing that basis do the ordinary formulas make sense again. A zero in
a denominator is nature's way of telling you to stop and diagonalise first.
The classic trap. Students write the first-order state correction
|n^{(1)}\rangle = \sum_{m \neq n}
\frac{\langle m^{(0)}|\hat{H}'|n^{(0)}\rangle}{E_n^{(0)} - E_m^{(0)}}\,|m^{(0)}\rangle
and quietly let the sum include m = n — after all, isn't
|n^{(0)}\rangle part of the answer? It is not in this sum. The
m = n term would put the vanishing gap
E_n^{(0)} - E_n^{(0)} = 0 in the denominator — an immediate blow-up. The sum
runs strictly over m \neq n.
Where did the missing |n^{(0)}\rangle component go? It is fixed by
normalisation, not by this formula. The standard convention (intermediate
normalisation) sets \langle n^{(0)}|n^{(1)}\rangle = 0, so the correction
has no component along the original state at first order — the perturbation tilts the state
toward its neighbours, and how much of the original survives is settled separately by keeping the total
length equal to one. Never divide by a zero gap; the excluded term is excluded for a reason.