Operators, Observables and Expectation Values
Ask a classical physicist "what is the energy of this system?" and the question has a plain answer: a
number, sitting there waiting to be read off. Quantum mechanics refuses to play along. In the quantum
world a physical quantity you can measure — position, momentum, energy, spin — is not a number
attached to the system at all. It is a question you put to the system, and the machinery
that turns that question into possible answers is a mathematical object called an operator.
This page is about the bridge between the abstract wavefunction and the numbers a laboratory dial
actually shows. Three ideas do all the work, and they lock together tightly:
every measurable quantity (an observable) is represented by a special kind of operator;
the numbers a single measurement can possibly return are that operator's eigenvalues;
and when the answer is uncertain, the long-run average of many identical measurements is the
expectation value. Get these three straight and the strange bookkeeping of quantum
measurement — probabilities, collapse, "why is the average not a possible answer?" — stops being
mysterious and starts being arithmetic.
Observables are operators
An operator is just a rule that eats a function and hands back another function. In
quantum mechanics each observable comes with its own operator that acts on the
wavefunction
\psi(x). The three you meet first are:
- Position \hat{x}: the simplest of all. It just multiplies the wavefunction by x,
  \hat{x}\,\psi(x) = x\,\psi(x).
- Momentum \hat{p}: this one differentiates,
  \hat{p} = -i\hbar\,\dfrac{d}{dx}. The factor of i is not decoration; it is exactly what is needed to make the operator well behaved (Hermitian), as we will see.
- Energy \hat{H}: the Hamiltonian, built from the other two just as classical energy is kinetic plus potential,
  \hat{H} = \dfrac{\hat{p}^2}{2m} + V(\hat{x}) = -\dfrac{\hbar^2}{2m}\dfrac{d^2}{dx^2} + V(x).
These operators are not arbitrary. Every operator standing for a real physical observable must be
Hermitian (self-adjoint) — meaning, loosely, that it equals its own conjugate
transpose, \hat{A}^\dagger = \hat{A}. That single algebraic condition turns
out to guarantee exactly the property a measurable quantity must have: the numbers it can produce are
always real. A dial never reads 3 + 2i volts.
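The role of the factor of i can be checked numerically. The sketch below is only an illustration under stated assumptions: it replaces d/dx by a central-difference matrix on a small grid (with the wavefunction taken to vanish outside the grid), works in units where \hbar = 1, and uses hypothetical helper names (dagger, is_hermitian, derivative_matrix). It tests the condition \hat{A}^\dagger = \hat{A} for the bare derivative and for -i\hbar times it.

```python
def dagger(m):
    """Conjugate transpose of a square matrix stored as a list of rows."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]


def is_hermitian(m, tol=1e-12):
    """True if the matrix equals its own conjugate transpose."""
    md = dagger(m)
    n = len(m)
    return all(abs(m[i][j] - md[i][j]) <= tol for i in range(n) for j in range(n))


def derivative_matrix(n, dx):
    """Central-difference d/dx on n grid points (assumption: psi = 0 off-grid)."""
    d = [[0j] * n for _ in range(n)]
    for i in range(n):
        if i + 1 < n:
            d[i][i + 1] = 1 / (2 * dx)
        if i - 1 >= 0:
            d[i][i - 1] = -1 / (2 * dx)
    return d


HBAR = 1.0          # units with hbar = 1 (assumption for illustration)
n, dx = 6, 0.1      # a deliberately tiny grid

d = derivative_matrix(n, dx)
p = [[-1j * HBAR * entry for entry in row] for row in d]

bare_derivative_is_hermitian = is_hermitian(d)   # antisymmetric real matrix
momentum_is_hermitian = is_hermitian(p)          # the -i repairs it

print(bare_derivative_is_hermitian, momentum_is_hermitian)  # False True
```

The bare derivative matrix is real and antisymmetric, so it fails the test; multiplying by -i\hbar turns it into a Hermitian matrix. This is a finite-grid analogue of the continuum statement, not a proof of it.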
What a measurement can return: eigenvalues
For certain special functions, an operator gives that same function straight back,
merely scaled by a number. Such a function is an eigenstate (or eigenfunction) of the operator, and
the scaling number is its eigenvalue:
\hat{A}\,\psi_n = a_n\,\psi_n .
The a_n are the heart of the whole theory, because of the
measurement postulate: the only possible outcomes of measuring the observable
A are the eigenvalues of \hat{A} — nothing in
between, ever. If the system happens to be in the eigenstate
\psi_n, you are guaranteed to get a_n, dead
certain, every time. If it is in some other state you will get one of the
a_n, but which one is a matter of probability.
- Only eigenvalues appear. A measurement of the observable A can return only an eigenvalue a_n of its operator \hat{A}, the allowed "reading" of the dial.
- Hermiticity makes them real. Because \hat{A} is Hermitian, every eigenvalue a_n is a real number, as any physical measurement must be.
- Eigenstates give certainty. If the system is in an eigenstate \psi_n, the result is a_n with probability 1; otherwise the result is genuinely probabilistic.
- Measurement collapses the state. Immediately after a measurement that yielded a_n, the wavefunction has jumped to (collapsed onto) the eigenstate \psi_n; measure again at once and you get a_n for sure.
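The four rules above can be mimicked in a few lines. What follows is a toy model, not a general simulator: it assumes a finite, non-degenerate spectrum, represents the state only by its amplitudes c_n in the eigenbasis, and draws outcomes with probability |c_n|^2 (the Born rule, which the next section introduces). The function name measure and the two-level spectrum are hypothetical choices made for illustration.

```python
import random


def measure(amplitudes, eigenvalues, rng):
    """Simulate one ideal measurement on a state with amplitudes c_n.

    Returns the outcome a_n and the collapsed amplitudes (all weight on n).
    Assumes a finite, non-degenerate spectrum.
    """
    probabilities = [abs(c) ** 2 for c in amplitudes]
    n = rng.choices(range(len(eigenvalues)), weights=probabilities, k=1)[0]
    collapsed = [1.0 if i == n else 0.0 for i in range(len(amplitudes))]
    return eigenvalues[n], collapsed


rng = random.Random(0)      # seeded only so that a run is repeatable
eigenvalues = [2.0, 6.0]    # hypothetical spectrum a_1, a_2
state = [0.6, 0.8]          # |0.6|^2 + |0.8|^2 = 1

first, state_after = measure(state, eigenvalues, rng)

# Measuring the collapsed state again must reproduce the first outcome.
repeats = [measure(state_after, eigenvalues, rng)[0] for _ in range(5)]

print(first in eigenvalues, all(r == first for r in repeats))  # True True
```

The first outcome is a gamble between the two eigenvalues, but every repeat on the collapsed state returns the same number, which is the "measure again at once" rule in miniature.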
Let us check an eigenvalue by hand. Take the momentum operator
\hat{p} = -i\hbar\,\dfrac{d}{dx} and feed it a plane wave
\psi(x) = e^{ikx}:
\hat{p}\,e^{ikx} = -i\hbar\,\frac{d}{dx}\,e^{ikx} = -i\hbar\,(ik)\,e^{ikx} = \hbar k\,e^{ikx}.
Out came the same function times \hbar k, so e^{ikx}
is a momentum eigenstate with the real eigenvalue p = \hbar k — the de
Broglie momentum. (Notice how the -i and the i from
the derivative multiplied to give +1: that is the i
earning its keep, making the eigenvalue real rather than imaginary.)
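The same check can be done numerically. The sketch below approximates the derivative by a central difference (an approximation chosen here for simplicity), sets \hbar = 1, and picks k = 2 arbitrarily. For an eigenfunction the ratio (\hat{p}\psi)(x) / \psi(x) should come out as the same real number \hbar k at every x.

```python
import cmath

HBAR = 1.0  # units with hbar = 1 (assumption for illustration)


def plane_wave(x, k):
    """psi(x) = exp(i k x)."""
    return cmath.exp(1j * k * x)


def apply_momentum(psi, x, h=1e-5):
    """Approximate (-i hbar d/dx) psi at x with a central difference."""
    return -1j * HBAR * (psi(x + h) - psi(x - h)) / (2 * h)


k = 2.0
psi = lambda x: plane_wave(x, k)

# For an eigenfunction this ratio is independent of x and equals hbar * k.
ratios = [apply_momentum(psi, x) / psi(x) for x in (-1.0, 0.0, 0.7, 3.0)]

for r in ratios:
    print(f"{r.real:.6f} {r.imag:+.1e}j")
```

Each printed ratio should agree with \hbar k = 2 up to the small finite-difference error, with an imaginary part that is zero to within rounding.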
The expectation value: a weighted average of outcomes
Suppose the state is not an eigenstate, so a single measurement is a gamble among the
eigenvalues. Repeat the experiment on a huge number of identically-prepared copies and average the
readings. That average is the expectation value \langle A \rangle.
It is a probability-weighted mean, exactly like the expected value of a die roll:
\langle A \rangle = \sum_n p_n\, a_n , \qquad p_n = |c_n|^2 .
Here c_n is the amplitude of the eigenstate \psi_n
in the expansion \psi = \sum_n c_n \psi_n, and the
Born rule says the
probability of getting a_n is p_n = |c_n|^2. There
is an equivalent, all-in-one formula that never mentions the individual outcomes — the "sandwich":
\langle A \rangle = \langle \psi | \hat{A} | \psi \rangle = \int_{-\infty}^{\infty} \psi^*(x)\,\hat{A}\,\psi(x)\, dx .
The two expressions are the same number written two ways. Consider an observable with just two possible
outcomes, a_1 = 2 and a_2 = 6. As the probability p of getting a_1 varies from 0 to 1 (so
a_2 has probability 1-p), the expectation
value \langle A \rangle = p\,a_1 + (1-p)\,a_2 slides smoothly between the two
rungs, landing on an eigenvalue only at the very ends.
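The dependence of \langle A \rangle on p is easy to tabulate. This is a minimal sketch using the two eigenvalues quoted above; the function name expectation_two_outcomes is a hypothetical choice.

```python
def expectation_two_outcomes(p, a1=2.0, a2=6.0):
    """<A> = p*a1 + (1 - p)*a2 for an observable with two eigenvalues."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    return p * a1 + (1.0 - p) * a2


for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"p = {p:4.2f}  ->  <A> = {expectation_two_outcomes(p):.2f}")
# p = 0.00  ->  <A> = 6.00
# p = 0.25  ->  <A> = 5.00
# p = 0.50  ->  <A> = 4.00
# p = 0.75  ->  <A> = 3.00
# p = 1.00  ->  <A> = 2.00
```

Only the first and last rows coincide with an eigenvalue; every intermediate row is an average that no single measurement can return.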
Worked examples
Example 1 — an energy measurement as a weighted average. A particle is prepared so
that a measurement of its energy returns E_1 = 1\ \text{eV} with probability
p_1 = \tfrac14 and E_2 = 3\ \text{eV} with probability
p_2 = \tfrac34. What is \langle E \rangle?
\langle E \rangle = p_1 E_1 + p_2 E_2 = \tfrac14(1) + \tfrac34(3) = 0.25 + 2.25 = 2.5\ \text{eV}.
The average is 2.5\ \text{eV} — and here is the punchline that trips people
up: a single measurement can never give 2.5\ \text{eV}. Every
actual reading is either 1 or 3; the observable has
no eigenvalue at 2.5. The expectation value is a statement about the whole
ensemble, not about any one experiment.
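The ensemble picture can be imitated with a random-number generator. In this sketch the 10,000 "measurements" are simply weighted draws from the two allowed readings; the seed and the sample size are arbitrary choices made for illustration.

```python
import random
from fractions import Fraction

outcomes = [1, 3]                                  # allowed readings in eV
probabilities = [Fraction(1, 4), Fraction(3, 4)]   # p_1, p_2 from Example 1

# Exact weighted average: 1/4 * 1 + 3/4 * 3 = 5/2
exact_mean = sum(p * e for p, e in zip(probabilities, outcomes))

rng = random.Random(1)  # seeded only so that a run is repeatable
readings = rng.choices(
    outcomes, weights=[float(p) for p in probabilities], k=10_000
)
sample_mean = sum(readings) / len(readings)

print(exact_mean)             # 5/2
print(sorted(set(readings)))  # [1, 3]
print(sample_mean)            # close to 2.5, varies slightly with the seed
```

Every individual reading is 1 or 3, while the sample mean settles near 2.5 as the number of draws grows.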
Example 2 — from amplitudes to the average. A state is written as a superposition of
three energy eigenstates,
\psi = \sqrt{\tfrac12}\,\psi_1 + \sqrt{\tfrac13}\,\psi_2 + \sqrt{\tfrac16}\,\psi_3,
with eigenvalues a_1 = 2,\ a_2 = 5,\ a_3 = 11. First turn amplitudes into
probabilities with the Born rule, p_n = |c_n|^2:
p_1 = \tfrac12, \qquad p_2 = \tfrac13, \qquad p_3 = \tfrac16 \qquad (\text{check: } \tfrac12+\tfrac13+\tfrac16 = 1).
Then average:
\langle A \rangle = \tfrac12(2) + \tfrac13(5) + \tfrac16(11) = 1 + \tfrac53 + \tfrac{11}{6} = \tfrac{6 + 10 + 11}{6} = \tfrac{27}{6} = 4.5 .
The squared magnitudes of the amplitudes must sum to 1: that is just the statement that
some outcome is certain to occur. Always sanity-check normalisation before you trust an
expectation value.
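Example 2 can be turned into a small routine that refuses to average over an unnormalised state. This is a sketch under simple assumptions (a finite list of amplitudes in the eigenbasis, a hypothetical function name, and an arbitrary tolerance for the normalisation check).

```python
import math


def expectation_from_amplitudes(amplitudes, eigenvalues, tol=1e-9):
    """Born rule p_n = |c_n|^2, then the weighted average sum_n p_n * a_n."""
    probabilities = [abs(c) ** 2 for c in amplitudes]
    total = sum(probabilities)
    if not math.isclose(total, 1.0, abs_tol=tol):
        raise ValueError(f"state is not normalised: probabilities sum to {total}")
    return sum(p * a for p, a in zip(probabilities, eigenvalues))


amplitudes = [math.sqrt(1 / 2), math.sqrt(1 / 3), math.sqrt(1 / 6)]
eigenvalues = [2, 5, 11]

average = expectation_from_amplitudes(amplitudes, eigenvalues)
print(round(average, 9))  # 4.5
```

Passing amplitudes whose squared magnitudes do not sum to 1, for example [1.0, 1.0], raises a ValueError instead of returning a misleading number.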
Example 3 — checking a Hamiltonian eigenvalue. Take the free-particle Hamiltonian
\hat{H} = -\dfrac{\hbar^2}{2m}\dfrac{d^2}{dx^2} (so
V = 0) and the plane wave \psi = e^{ikx}. Two
derivatives bring down (ik)^2 = -k^2:
\hat{H}\,e^{ikx} = -\frac{\hbar^2}{2m}\,(ik)^2 e^{ikx} = -\frac{\hbar^2}{2m}(-k^2)\,e^{ikx} = \frac{\hbar^2 k^2}{2m}\,e^{ikx}.
So e^{ikx} is an energy eigenstate with eigenvalue
E = \hbar^2 k^2 / 2m, precisely the kinetic energy
p^2/2m of a particle with momentum p = \hbar k,
real and non-negative as a kinetic energy must be.
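A quick numerical cross-check of Example 3 is sketched below. It assumes units with \hbar = m = 1, an arbitrary k = 2, and a central second difference in place of the exact second derivative; it then compares the resulting eigenvalue with p^2/2m for p = \hbar k.

```python
import cmath

HBAR = 1.0  # units with hbar = 1 (assumption for illustration)
MASS = 1.0  # units with m = 1 (assumption for illustration)


def apply_free_hamiltonian(psi, x, h=1e-4):
    """Approximate -(hbar^2 / 2m) d^2 psi / dx^2 with a central difference."""
    second_derivative = (psi(x + h) - 2 * psi(x) + psi(x - h)) / h**2
    return -(HBAR**2) / (2 * MASS) * second_derivative


k = 2.0
psi = lambda x: cmath.exp(1j * k * x)   # plane wave e^{ikx}

x0 = 0.3                                # any point works for an eigenfunction
energy = (apply_free_hamiltonian(psi, x0) / psi(x0)).real

momentum = HBAR * k                     # eigenvalue of the momentum operator
kinetic = momentum**2 / (2 * MASS)      # classical p^2 / 2m

print(f"{energy:.5f} {kinetic:.5f}")
```

The two printed numbers should agree to within the finite-difference error, which is the numerical counterpart of E = \hbar^2 k^2 / 2m = p^2/2m.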
The spectral theorem: why this all hangs together
There is a deep reason the bookkeeping works. Because an observable's operator is Hermitian, the
spectral theorem
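guarantees two priceless facts at once: its eigenvalues are real, and its eigenstates can be chosen to form a complete
orthonormal basis for the space of states. (This is the discrete-spectrum statement; continuous families such as the plane waves e^{ikx} are normalised to a delta function instead.) Orthonormal means the eigenstates are
mutually perpendicular and unit length, \langle \psi_m | \psi_n \rangle = \delta_{mn};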
complete means any state can be written as a combination of them,
\psi = \sum_n c_n \psi_n , \qquad c_n = \langle \psi_n | \psi \rangle , \qquad \sum_n |c_n|^2 = 1 .
This is the spectral decomposition, and it is what makes measurement well-defined:
because the eigenstates are a basis, every possible state can be resolved into eigen-components, each
carrying a probability |c_n|^2 of being the outcome. Substituting the
expansion into the sandwich formula reproduces the weighted-average form,
\langle A \rangle = \sum_n |c_n|^2 a_n — the two definitions are one and the
same, guaranteed by the spectral theorem.
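The substitution can be written out step by step. The lines below assume a discrete, orthonormal set of eigenstates, as in the expansion above.

```latex
\begin{aligned}
\langle \psi | \hat{A} | \psi \rangle
  &= \sum_m \sum_n c_m^*\, c_n\, \langle \psi_m | \hat{A} | \psi_n \rangle
     && \text{(expand both } \psi \text{ and } \psi^* \text{)} \\
  &= \sum_m \sum_n c_m^*\, c_n\, a_n\, \langle \psi_m | \psi_n \rangle
     && \text{(use } \hat{A}\,\psi_n = a_n\,\psi_n \text{)} \\
  &= \sum_m \sum_n c_m^*\, c_n\, a_n\, \delta_{mn}
     && \text{(orthonormality)} \\
  &= \sum_n |c_n|^2\, a_n .
\end{aligned}
```

The Kronecker delta kills every cross term with m \neq n, which is why no interference between different outcomes survives in the average.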
Two things must be true of anything you can physically measure, and Hermiticity delivers both in one
stroke. First, the results must be real numbers — a meter never reads an imaginary
value. For a Hermitian operator the eigenvalue equation forces
a_n = a_n^*, so every a_n is real. A non-Hermitian
operator can have complex eigenvalues and is therefore disqualified as an observable.
Second, the different outcomes must be cleanly distinguishable — measuring
"energy = 3" should not secretly overlap with "energy = 5". Hermiticity guarantees that eigenstates
belonging to different eigenvalues are orthogonal, so the possible answers are perpendicular directions
in state space with no leakage between them. That orthogonality is precisely what lets the Born rule
assign a clean probability |c_n|^2 to each outcome with the probabilities
summing to one. Hermiticity is not a technicality bolted on for convenience — it is the
mathematical content of "this quantity can be measured."
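Both claims follow from a two-line argument, sketched here for normalisable eigenstates. Hermiticity is used in the form \langle \phi | \hat{A}\,\chi \rangle = \langle \hat{A}\,\phi | \chi \rangle.

```latex
\begin{gathered}
\langle \psi_n | \hat{A}\,\psi_n \rangle = a_n\, \langle \psi_n | \psi_n \rangle ,
\qquad
\langle \hat{A}\,\psi_n | \psi_n \rangle = a_n^*\, \langle \psi_n | \psi_n \rangle \\
\Rightarrow\quad (a_n - a_n^*)\, \langle \psi_n | \psi_n \rangle = 0
\quad\Rightarrow\quad a_n = a_n^* . \\[6pt]
\langle \psi_m | \hat{A}\,\psi_n \rangle = a_n\, \langle \psi_m | \psi_n \rangle ,
\qquad
\langle \hat{A}\,\psi_m | \psi_n \rangle = a_m^*\, \langle \psi_m | \psi_n \rangle
  = a_m\, \langle \psi_m | \psi_n \rangle \\
\Rightarrow\quad (a_n - a_m)\, \langle \psi_m | \psi_n \rangle = 0
\quad\Rightarrow\quad \langle \psi_m | \psi_n \rangle = 0 \ \text{ whenever } a_n \neq a_m .
\end{gathered}
```

The first pair of lines gives real eigenvalues; the second pair, which already uses that reality, gives orthogonality of eigenstates with different eigenvalues.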
The classic trap. The name "expectation value" sounds like "the value you expect to
get", so students picture \langle A \rangle as a typical single reading. It
is nothing of the sort. \langle A \rangle is the average over an
ensemble of many identical measurements, and averages routinely fall in the gaps between the
allowed answers.
In Example 1 the only possible readings were 1\ \text{eV} and
3\ \text{eV}, yet \langle E \rangle = 2.5\ \text{eV}
— a number no single measurement can ever produce, because the observable has no eigenvalue there. It is
the very same statistic as "the average British household has 2.4 children": informative about the
population, impossible for any one household. A single measurement always lands on an eigenvalue; only
the long-run mean is \langle A \rangle. Never confuse one
measurement with the average of infinitely many.