Angular Momentum Algebra
Here is one of the most astonishing magic tricks in all of physics. Write down three innocent-looking
equations — the commutation relations of angular momentum — and then never solve a
differential equation again. Out of pure algebra, with nothing but those three lines and the
demand that states have finite length, tumbles the entire angular-momentum spectrum: the shells of the
atom, the fact that electrons carry spin
\tfrac12, the selection rules that decide which spectral lines an atom is
allowed to emit. All of it. From three commutators.
This is a different style of quantum mechanics from solving the
Schrödinger equation
for a wavefunction. There is no potential, no boundary condition, no \psi(x)
to plot. We work only with operators and the rules they obey — an algebraic method
that Dirac perfected and that reaches parts of quantum mechanics differential equations cannot. Its
deepest payoff is spin: a kind of angular momentum with no position-space
wavefunction at all, invisible to the \hat{L} = \hat{r}\times\hat{p}
picture, yet forced into existence the moment you take the algebra seriously. Let us do exactly that.
The three commutators everything rests on
Classically, angular momentum is \vec{L} = \vec{r}\times\vec{p}, and its
three components are just numbers you can know all at once. Quantum mechanically the components become
operators, and promoting \vec{r} and \vec{p} to
operators with [\hat{x},\hat{p}_x]=i\hbar (and likewise for y and z) forces the components of \hat{L} to fail to
commute with one another. A short calculation gives the master relations:
[\hat{L}_x,\hat{L}_y]=i\hbar\,\hat{L}_z,\qquad
[\hat{L}_y,\hat{L}_z]=i\hbar\,\hat{L}_x,\qquad
[\hat{L}_z,\hat{L}_x]=i\hbar\,\hat{L}_y,
which the antisymmetric Levi-Civita symbol packs into a single compact line:
[\hat{L}_i,\hat{L}_j]=i\hbar\,\varepsilon_{ijk}\,\hat{L}_k.
Read that line physically and it is startling. Because
[\hat{L}_x,\hat{L}_y]\neq 0, the three components of angular momentum are
incompatible observables: in general you cannot simultaneously have definite values of
L_x and L_y. Measuring one scrambles the others.
A quantum spinning object can never have its angular-momentum vector pointing in a fully definite
direction the way a classical gyroscope does — the very idea is forbidden by these three lines.
From here on we forget where these came from. We do not use
\vec{L}=\vec{r}\times\vec{p} again. We take the commutation relations
themselves as the definition of "an angular momentum," and see how much they alone determine.
(Spin obeys the identical algebra, with \hat{S} in place of
\hat{L} — which is precisely why the answer will include half-integers that
no orbital \vec{r}\times\vec{p} could ever produce.)
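To see the relations hold for something concrete, here is a minimal sketch in plain Python that checks all three commutators on the spin-\tfrac12 matrices (\hbar/2 times the Pauli matrices). The helper names (matmul, commutator, scale, close) and the choice \hbar=1 are assumptions of this sketch, not part of the derivation.

```python
# Units with hbar = 1 (an assumption made for this sketch).
HBAR = 1.0

def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """Return [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

def scale(c, a):
    """Multiply every entry of matrix a by the scalar c."""
    return [[c * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    """True if two matrices agree entry by entry within tol."""
    return all(abs(x - y) < tol for r1, r2 in zip(a, b) for x, y in zip(r1, r2))

# Spin-1/2 matrices: (hbar / 2) times the Pauli matrices.
SX = scale(HBAR / 2, [[0, 1], [1, 0]])
SY = scale(HBAR / 2, [[0, -1j], [1j, 0]])
SZ = scale(HBAR / 2, [[1, 0], [0, -1]])

# Each line prints True if the corresponding master relation holds.
print(close(commutator(SX, SY), scale(1j * HBAR, SZ)))
print(close(commutator(SY, SZ), scale(1j * HBAR, SX)))
print(close(commutator(SZ, SX), scale(1j * HBAR, SY)))
```

These 2×2 matrices have no \vec{r}\times\vec{p} behind them, yet they satisfy the same three lines, which is all the rest of the argument asks for.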
The Casimir operator, and what we can know at once
If we cannot know all three components together, what can we know? The trick is to build a
quantity that commutes with every component. Consider the total-magnitude-squared operator, the
Casimir operator
\hat{L}^2 = \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2.
A few lines of commutator algebra (Example 1 below does a piece of it) show the remarkable fact that
\hat{L}^2 commutes with every component:
- [\hat{L}^2,\hat{L}_x]=[\hat{L}^2,\hat{L}_y]=[\hat{L}^2,\hat{L}_z]=0.
  The total angular momentum is compatible with any single component, even though the
  components are mutually incompatible.
- So we may simultaneously diagonalise \hat{L}^2 and one
  component, by universal convention \hat{L}_z. A complete label for a
  state is therefore the pair of quantum numbers (l, m), written
  |l,m\rangle.
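The vanishing commutator quoted above can be checked in a few lines. A sketch for the z-component, assuming only the master relations and the product rule [A^2,B]=A[A,B]+[A,B]A:

```latex
\begin{aligned}
[\hat{L}^2,\hat{L}_z]
 &= [\hat{L}_x^2,\hat{L}_z] + [\hat{L}_y^2,\hat{L}_z] + [\hat{L}_z^2,\hat{L}_z] \\
 &= \hat{L}_x[\hat{L}_x,\hat{L}_z] + [\hat{L}_x,\hat{L}_z]\hat{L}_x
  + \hat{L}_y[\hat{L}_y,\hat{L}_z] + [\hat{L}_y,\hat{L}_z]\hat{L}_y \\
 &= -i\hbar\,(\hat{L}_x\hat{L}_y + \hat{L}_y\hat{L}_x)
  + i\hbar\,(\hat{L}_y\hat{L}_x + \hat{L}_x\hat{L}_y) \\
 &= 0.
\end{aligned}
```

Here [\hat{L}_x,\hat{L}_z]=-i\hbar\hat{L}_y and [\hat{L}_y,\hat{L}_z]=i\hbar\hat{L}_x were used; the x and y cases follow by cyclic relabelling.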
Write the joint eigenstates as |l,m\rangle, with eigenvalues that we
parametrise (with foresight) as
\hat{L}^2\,|l,m\rangle = \hbar^2\,l(l+1)\,|l,m\rangle,\qquad
\hat{L}_z\,|l,m\rangle = \hbar\,m\,|l,m\rangle.
Writing the \hat{L}^2 eigenvalue as \hbar^2 l(l+1)
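rather than, say, \hbar^2\lambda is not cheating: it is just a convenient
renaming, since any non-negative number \lambda can be written as l(l+1) for exactly one
l\ge 0 (and \lambda cannot be negative, because \hat{L}^2 is a sum of squares of Hermitian operators). The payoff is that the algebra below will hand us clean integers and half-integers.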
Everything we now do is aimed at answering one question: which values of
l and m are allowed?
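The renaming from \lambda to l can be made explicit by solving l(l+1)=\lambda for its non-negative root. The following is a small illustrative sketch; the function name l_from_lambda is hypothetical, and \lambda is measured in units of \hbar^2.

```python
import math

def l_from_lambda(lam):
    """Return the non-negative root l of l(l+1) = lam, for lam >= 0."""
    if lam < 0:
        raise ValueError("the L^2 eigenvalue (in units of hbar^2) must be non-negative")
    # Quadratic formula applied to l^2 + l - lam = 0, keeping the root with l >= 0.
    return (-1.0 + math.sqrt(1.0 + 4.0 * lam)) / 2.0

for lam in (0.0, 0.75, 2.0, 3.75, 6.0):
    print(f"lambda = {lam}: l = {l_from_lambda(lam)}")
```

At this stage every \lambda \ge 0 gives some real l; it is the ladder argument that follows which restricts l to integers and half-integers.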
Ladder operators: climbing the spectrum
The engine of the whole method is a clever pair of non-Hermitian combinations, the
ladder (or raising/lowering) operators:
\hat{L}_\pm = \hat{L}_x \pm i\,\hat{L}_y.
Grinding out their commutators from the three master relations (Example 1) gives the three facts we
will lean on for the rest of the page:
- [\hat{L}_z,\hat{L}_\pm] = \pm\hbar\,\hat{L}_\pm. This is the key relation:
  it makes \hat{L}_\pm shift the L_z value.
- [\hat{L}^2,\hat{L}_\pm]=0. The ladder moves you from rung to rung but
  never changes the total magnitude l.
- \hat{L}_\mp\hat{L}_\pm = \hat{L}^2 - \hat{L}_z^2 \mp \hbar\hat{L}_z.
  This identity will pin down the exact edge of the ladder and the normalisation.
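The third identity is quoted rather than derived. One way to check it, using only [\hat{L}_x,\hat{L}_y]=i\hbar\hat{L}_z and the definition of \hat{L}^2, is to multiply out the product directly:

```latex
\begin{aligned}
\hat{L}_\mp\hat{L}_\pm
 &= (\hat{L}_x \mp i\hat{L}_y)(\hat{L}_x \pm i\hat{L}_y) \\
 &= \hat{L}_x^2 + \hat{L}_y^2 \pm i\,[\hat{L}_x,\hat{L}_y] \\
 &= \hat{L}^2 - \hat{L}_z^2 \pm i\,(i\hbar\hat{L}_z) \\
 &= \hat{L}^2 - \hat{L}_z^2 \mp \hbar\hat{L}_z.
\end{aligned}
```

The cross terms do not cancel, as they would for ordinary numbers, precisely because \hat{L}_x and \hat{L}_y fail to commute.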
Watch what the first relation does. Rearranged, it says
\hat{L}_z\hat{L}_\pm = \hat{L}_\pm(\hat{L}_z \pm \hbar). Apply this to a
state |l,m\rangle:
\hat{L}_z\big(\hat{L}_\pm|l,m\rangle\big)
= \hat{L}_\pm(\hat{L}_z \pm \hbar)|l,m\rangle
= \hbar(m\pm 1)\,\big(\hat{L}_\pm|l,m\rangle\big).
So \hat{L}_\pm|l,m\rangle is again an eigenstate of
\hat{L}_z, but with eigenvalue raised or lowered by one unit of
\hbar. Since [\hat{L}^2,\hat{L}_\pm]=0 leaves
l untouched, we have proved
\hat{L}_\pm\,|l,m\rangle \propto |l,\,m\pm 1\rangle.
The operators are exactly a ladder: \hat{L}_+ steps you up one rung in
m, \hat{L}_- steps you down one, both staying on
the same l. All that remains is to find where the ladder starts and stops.
The spectrum falls out — for free
Here is the crux, and it is pure logic. The operator
\hat{L}_x^2 + \hat{L}_y^2 = \hat{L}^2 - \hat{L}_z^2 is a sum of squares of
Hermitian operators, so its expectation value can never be negative. Therefore, in any state
|l,m\rangle,
\hbar^2\,m^2 \;=\; \langle \hat{L}_z^2\rangle \;\le\; \langle \hat{L}^2\rangle
\;=\; \hbar^2\,l(l+1)\quad\Longrightarrow\quad m^2 \le l(l+1).
So for a fixed l, the value of m is
bounded above and below. But the ladder tries to raise m
without limit! The only escape is that the ladder must terminate: there is a
top rung m_{\max} that \hat{L}_+ annihilates, and
a bottom rung m_{\min} that \hat{L}_- annihilates,
\hat{L}_+\,|l,m_{\max}\rangle = 0,\qquad \hat{L}_-\,|l,m_{\min}\rangle = 0.
Feed the top condition into the identity
\hat{L}_-\hat{L}_+ = \hat{L}^2 - \hat{L}_z^2 - \hbar\hat{L}_z. Acting on
|l,m_{\max}\rangle the left side is zero, so
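0 = l(l+1) - m_{\max}^2 - m_{\max} = (l - m_{\max})(l + m_{\max} + 1)
\;\Longrightarrow\; m_{\max} = l.
The other root, m_{\max} = -(l+1), is ruled out because it would violate m^2 \le l(l+1).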
The same move with \hat{L}_+\hat{L}_- on the bottom rung gives
m_{\min} = -l. And because the ladder connects top to bottom in
whole steps of one, the gap m_{\max}-m_{\min}=2l must be a
non-negative integer. That single integrality condition is the whole ballgame:
- l may take only the values
  l = 0,\ \tfrac12,\ 1,\ \tfrac32,\ 2,\dots, that is, non-negative integers
  and half-integers, because all that is required is that 2l be a
  whole number.
- For each l, the quantum number m runs in
  unit steps m = -l,\,-l+1,\dots,\,l-1,\,l, giving exactly
  2l+1 rungs on the ladder.
Pause on how much came from how little. We never solved a differential equation, never chose a
potential, never even said what \hat{L} is. The integer
multiplets — l=1 giving three states, l=2 giving
five — are the atomic p, d, … subshells. And the half-integers, the
possibility the algebra grudgingly leaves open, are where spin lives:
l=\tfrac12 gives the two states, spin-up and spin-down, of an electron. The
differential-equation approach can only ever produce integer l; the algebra
knows about spin.
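The counting rule can be turned into a few lines of code. This is a sketch using exact fractions so that half-integers are represented without rounding; the function name m_values is hypothetical.

```python
from fractions import Fraction

def m_values(l):
    """List m = -l, -l+1, ..., l for an allowed l (2l a non-negative integer)."""
    l = Fraction(l)
    two_l = 2 * l
    if two_l.denominator != 1 or two_l < 0:
        raise ValueError("2l must be a non-negative integer")
    return [-l + k for k in range(int(two_l) + 1)]

for l in (0, Fraction(1, 2), 1, Fraction(3, 2), 2):
    ms = m_values(l)
    print(f"l = {l}: {len(ms)} states, m = {[str(m) for m in ms]}")
```

A value such as l = 1/3 is rejected by the check on 2l, mirroring the integrality condition that the ladder argument imposes.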
The exact rung-to-rung coefficient
"Proportional to" is not quite enough — we want the exact number. Take
\hat{L}_\pm|l,m\rangle = c^\pm_{l,m}\,|l,m\pm1\rangle and compute the norm of
both sides. Since \hat{L}_\mp = \hat{L}_\pm^\dagger, the squared length of
the left side is
\langle l,m|\hat{L}_\mp\hat{L}_\pm|l,m\rangle = \hbar^2\big[l(l+1)-m(m\pm1)\big],
again from that third identity. Matching to |c^\pm_{l,m}|^2 gives the result
you will use constantly:
\hat{L}_\pm\,|l,m\rangle = \hbar\,\sqrt{\,l(l+1) - m(m\pm 1)\,}\;|l,m\pm 1\rangle.
This formula is the whole spectrum in one line. Notice its built-in safety catch: at the top rung
m=l, the raising coefficient is
\sqrt{l(l+1)-l(l+1)}=0, so \hat{L}_+ kills the
state exactly as required — the ladder cannot climb past the top even though you keep pushing. At the
bottom rung m=-l the lowering coefficient likewise vanishes. The chart below
plots both coefficients across the rungs: watch them fall smoothly to zero precisely at the two ends of
the multiplet, sealing the ladder shut.
The largest possible L_z is \hbar l (the top
rung), yet the total magnitude is |\vec{L}| = \hbar\sqrt{l(l+1)}, which for any l>0 is
strictly larger than \hbar l. The vector can never point exactly along
the z-axis: even on the top rung there is always a leftover
\hbar^2\,l of L_x^2+L_y^2 it cannot get rid of.
Angular momentum in quantum mechanics is forever tilted; the "Watch out!" box below makes this precise.
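The way the coefficients vanish at the two ends of a multiplet can also be tabulated directly. The sketch below works with the squared coefficient in units of \hbar^2, so everything stays an exact fraction; the function name ladder_coeff_sq and the choice l=2 are illustrative assumptions.

```python
from fractions import Fraction

def ladder_coeff_sq(l, m, sign):
    """Return |c|^2 / hbar^2 = l(l+1) - m(m + sign), with sign = +1 (raise) or -1 (lower)."""
    l, m = Fraction(l), Fraction(m)
    return l * (l + 1) - m * (m + sign)

l = Fraction(2)
for k in range(int(2 * l) + 1):
    m = -l + k
    up = ladder_coeff_sq(l, m, +1)
    down = ladder_coeff_sq(l, m, -1)
    print(f"m = {str(m):>2}: |c+|^2 = {up}, |c-|^2 = {down}  (units of hbar^2)")
```

The raising entry is zero only at m=l and the lowering entry only at m=-l; everywhere in between both are strictly positive, so no rung inside the multiplet can be skipped or annihilated.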
Worked examples
Example 1 — verify [\hat{L}_z,\hat{L}_+]=\hbar\hat{L}_+ from the
base commutators. Expand \hat{L}_+ = \hat{L}_x + i\hat{L}_y and use
linearity of the commutator:
[\hat{L}_z,\hat{L}_+] = [\hat{L}_z,\hat{L}_x] + i\,[\hat{L}_z,\hat{L}_y]
= i\hbar\hat{L}_y + i(-i\hbar\hat{L}_x)
= \hbar\hat{L}_x + i\hbar\hat{L}_y = \hbar\,\hat{L}_+.
We used [\hat{L}_z,\hat{L}_x]=i\hbar\hat{L}_y (the third master relation) and
[\hat{L}_z,\hat{L}_y]=-i\hbar\hat{L}_x (the second master relation with its order reversed).
The \hat{L}_- case runs identically and gives
-\hbar\hat{L}_- — hence the compact
[\hat{L}_z,\hat{L}_\pm]=\pm\hbar\hat{L}_\pm. Every result on this page is
built from moves exactly this size.
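A numerical counterpart to this example, offered as a consistency check rather than a proof: build the matrices of \hat{L}_z and \hat{L}_\pm in the |l,m\rangle basis from the ladder formula and confirm that they reproduce [\hat{L}_z,\hat{L}_\pm]=\pm\hbar\hat{L}_\pm. The helper names, the basis ordering, and \hbar=1 are assumptions of this sketch.

```python
import math

def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """Return [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

def close(a, b, tol=1e-12):
    """True if two matrices agree entry by entry within tol."""
    return all(abs(x - y) < tol for r1, r2 in zip(a, b) for x, y in zip(r1, r2))

def ladder_matrices(two_l):
    """Return (Lz, Lplus, Lminus) for l = two_l / 2, with hbar = 1.

    Basis order is m = l, l-1, ..., -l (a convention chosen for this sketch).
    """
    l = two_l / 2
    ms = [l - k for k in range(two_l + 1)]
    n = len(ms)
    lz = [[ms[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    lplus = [[0.0] * n for _ in range(n)]
    for j in range(1, n):
        # L+ maps basis state j (with m = ms[j]) to state j-1 (with m + 1).
        lplus[j - 1][j] = math.sqrt(l * (l + 1) - ms[j] * (ms[j] + 1))
    # The entries are real, so the Hermitian conjugate is just the transpose.
    lminus = [[lplus[j][i] for j in range(n)] for i in range(n)]
    return lz, lplus, lminus

for two_l in range(1, 6):  # l = 1/2, 1, 3/2, 2, 5/2
    lz, lplus, lminus = ladder_matrices(two_l)
    raises = close(commutator(lz, lplus), lplus)
    lowers = close(commutator(lz, lminus), [[-x for x in row] for row in lminus])
    print(f"l = {two_l}/2: [Lz, L+] = +L+ is {raises}, [Lz, L-] = -L- is {lowers}")
```

Because the matrices are filled in from the coefficient formula, this only shows that the formula and the commutators are mutually consistent; the logical direction in the text runs the other way, from commutators to formula.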
Example 2 — a ladder coefficient. Take a multiplet with
l=\tfrac32 (say the total angular momentum of an electron in a j=\tfrac32 state). Raise from
m=-\tfrac12. The coefficient is
\hbar\sqrt{\tfrac32\cdot\tfrac52 - \big(-\tfrac12\big)\big(\tfrac12\big)}
= \hbar\sqrt{\tfrac{15}{4} + \tfrac14} = \hbar\sqrt{4} = 2\hbar,
so \hat{L}_+|\tfrac32,-\tfrac12\rangle = 2\hbar\,|\tfrac32,\tfrac12\rangle.
Try the top rung as a check: raising from m=\tfrac32 gives
\sqrt{\tfrac{15}{4}-\tfrac32\cdot\tfrac52}=\sqrt{0}=0 — the ladder is
capped, as it must be.
Example 3 — count the states. A d-electron has orbital
l=2. The allowed m are
-2,-1,0,1,2, which is 2l+1 = 5 states — the five
d-orbitals of chemistry. Its total magnitude is
|\vec{L}|=\hbar\sqrt{2\cdot 3}=\hbar\sqrt6\approx 2.449\,\hbar, larger than the
maximum projection 2\hbar — the vector stays tilted, never flat along
z.
The algebra offers l=\tfrac12,\tfrac32,\dots as a formal possibility, but
does nature use them? Emphatically yes — that is exactly what spin is. An electron
genuinely carries s=\tfrac12, with just two rungs
m_s=\pm\tfrac12, and the Stern–Gerlach experiment splits a beam into
precisely those two. What half-integer spin is not is a tiny ball physically spinning: try to
account for an electron's angular momentum with a rotating charged sphere of the electron's size and
the surface would have to move faster than light. Spin is an intrinsic angular
momentum with no spatial rotation behind it — an internal property as fundamental as charge or mass.
The tell-tale fingerprint of half-integer angular momentum is that a full
360^\circ rotation multiplies the state by
-1, not +1; you must turn a spin-\tfrac12
object twice around, 720^\circ, to bring it back to itself. That is
not a quirk of a particular particle — it is the algebra of these three commutators, made visible.
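The sign flip can be seen in one line of arithmetic. Assuming the standard convention that a rotation by \theta about z is represented by \exp(-i\theta\hat{L}_z/\hbar), an eigenstate |l,m\rangle simply picks up the phase e^{-im\theta}. The sketch below evaluates that phase; the function name rotation_phase is hypothetical.

```python
import cmath
import math

def rotation_phase(theta, m):
    """Phase acquired by |l, m> under a rotation by theta about z (hbar = 1)."""
    return cmath.exp(-1j * m * theta)

for m in (0.5, -0.5, 1.0):
    one_turn = rotation_phase(2 * math.pi, m)
    two_turns = rotation_phase(4 * math.pi, m)
    print(f"m = {m:+.1f}: 360 deg -> {one_turn.real:+.3f}, 720 deg -> {two_turns.real:+.3f}")
```

For half-integer m one full turn gives a phase of -1 and only the second turn restores +1, while integer m returns to +1 after a single turn.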
Trap one: an unbounded ladder. It is tempting to think that since
\hat{L}_+ raises m, you can keep raising forever.
You cannot. The bound m^2\le l(l+1) caps
m, and the raising coefficient
\hbar\sqrt{l(l+1)-m(m+1)} hits exactly zero at
m=l. Applying \hat{L}_+ to the top state does not
give a bigger state — it gives the zero vector. The ladder has a real top and a real
bottom, and the algebra enforces both.
Trap two: |\vec{L}| = \hbar l. The magnitude of the angular
momentum is |\vec{L}| = \hbar\sqrt{l(l+1)}, not
\hbar l. For l=1 that is
\hbar\sqrt2\approx1.414\hbar, strictly more than the largest projection
L_z=\hbar. The consequence is physical, not pedantic: since
\sqrt{l(l+1)}>l for every l>0, the vector can never point exactly
along z — doing so would need
L_x=L_y=0 with L_z=|\vec{L}|, but then you would
know all three components at once, violating
[\hat{L}_x,\hat{L}_y]=i\hbar\hat{L}_z. Quantum angular momentum is always
tilted, with an irreducible L_x^2+L_y^2 smeared around the cone.
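To put a number on the tilt, one can use the semiclassical vector-model picture (an illustrative assumption, not a statement about a literal vector) in which the top rung sits at an angle \theta from z with \cos\theta = l/\sqrt{l(l+1)}. A short sketch, with the hypothetical function name min_tilt_degrees:

```python
import math

def min_tilt_degrees(l):
    """Smallest vector-model angle between L and z: cos(theta) = l / sqrt(l(l+1))."""
    if l <= 0:
        raise ValueError("the tilt angle is only defined for l > 0")
    return math.degrees(math.acos(l / math.sqrt(l * (l + 1))))

for l in (0.5, 1, 2, 10, 100):
    print(f"l = {l}: minimum tilt = {min_tilt_degrees(l):.2f} degrees")
```

The angle shrinks as l grows but never reaches zero, which is the classical limit in miniature: a large angular momentum can point almost, but not exactly, along an axis.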
Bonus trap: \hat{L}_\pm is an observable. It is not.
\hat{L}_+ = \hat{L}_x+i\hat{L}_y is not Hermitian
(\hat{L}_+^\dagger=\hat{L}_-\ne\hat{L}_+), so it corresponds to no
measurement: within a multiplet with l>0 its only eigenvector is the top rung, with eigenvalue zero. It is a bookkeeping tool that moves between physical states,
useful precisely because it is not an observable.