The Density Matrix
A state vector |\psi\rangle is a wonderfully compact way to describe a
qubit — but it can only describe a pure state, one you know exactly. Real life is
messier. Suppose a lab technician hands you a qubit and says, "it's |0\rangle
half the time and |1\rangle the other half, but I've lost track of which
this one is." That is ordinary classical uncertainty, and no single vector
|\psi\rangle can capture it. The same problem appears when you look at just
one half of an entangled
pair: the piece in your hand has no state vector of its own.
The fix is to upgrade from a vector to a matrix. The density matrix
\rho (Greek "rho") describes any quantum state — pure or mixed,
whole system or subsystem — in one uniform object. For a pure state it is the
outer product of the vector with itself:
\rho = |\psi\rangle\langle\psi|.
Where \langle\psi|\psi\rangle (the inner product) collapses a
vector to a number, |\psi\rangle\langle\psi| (the outer product)
fans it out into a matrix.
The density matrix of a pure state
To build \rho = |\psi\rangle\langle\psi|, multiply the column
|\psi\rangle by the row \langle\psi| (its
conjugate transpose).
For the basis state |0\rangle = \begin{bmatrix} 1 \\ 0 \end{bmatrix}:
\rho_0 = |0\rangle\langle 0| = \begin{bmatrix} 1 \\ 0 \end{bmatrix}\begin{bmatrix} 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}.
A single 1 on the diagonal — "this qubit is definitely
|0\rangle." Now the balanced superposition
|{+}\rangle = \tfrac{1}{\sqrt2}\big(|0\rangle + |1\rangle\big). Its column
is \tfrac{1}{\sqrt2}\begin{bmatrix} 1 \\ 1 \end{bmatrix}, so
\rho_{+} = |{+}\rangle\langle{+}| = \tfrac{1}{2}\begin{bmatrix} 1 \\ 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \end{bmatrix} = \begin{bmatrix} \tfrac12 & \tfrac12 \\[2pt] \tfrac12 & \tfrac12 \end{bmatrix}.
Every entry is \tfrac12. The two off-diagonal entries are
called coherences, and (hold that thought) they are the fingerprint of a genuine
superposition.
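The two outer products above are easy to check numerically. Here is a minimal sketch in Python with NumPy (a tool chosen for illustration, not something the text prescribes); the helper name `pure_density_matrix` is hypothetical.

```python
import numpy as np

def pure_density_matrix(psi):
    """Return |psi><psi| for a state vector psi (hypothetical helper)."""
    psi = np.asarray(psi, dtype=complex)
    # np.outer does not conjugate its second argument,
    # so the bra <psi| is formed by conjugating explicitly.
    return np.outer(psi, psi.conj())

ket0 = np.array([1, 0])
ket_plus = np.array([1, 1]) / np.sqrt(2)

rho_0 = pure_density_matrix(ket0)
rho_plus = pure_density_matrix(ket_plus)

print(rho_0.real)     # a single 1 in the top-left corner
print(rho_plus.real)  # every entry is one half
```

The explicit conjugate makes no difference for these real vectors, but it matters as soon as the amplitudes are complex.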
Mixing states: the ensemble
Now the classical uncertainty. An ensemble is a list of states
|\psi_i\rangle each prepared with some probability
p_i (with \sum_i p_i = 1). Its density matrix is
the probability-weighted average of the individual pure density matrices:
\rho = \sum_i p_i\, |\psi_i\rangle\langle\psi_i|.
A state that cannot be written as a single |\psi\rangle\langle\psi| for any
vector |\psi\rangle, and so needs a genuine mixture, is called mixed. Take our
technician's qubit: a 50/50 classical mixture of |0\rangle
and |1\rangle. Add the two pure density matrices with weight
\tfrac12 each:
\rho = \tfrac12|0\rangle\langle0| + \tfrac12|1\rangle\langle1| = \tfrac12\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + \tfrac12\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} \tfrac12 & 0 \\[2pt] 0 & \tfrac12 \end{bmatrix} = \tfrac12 I.
This is the maximally mixed state — the state of total ignorance, half of the
identity matrix. Notice its off-diagonal entries are zero: no coherences.
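The ensemble formula translates almost line for line into code. The sketch below assumes Python with NumPy; `ensemble_density_matrix` is a hypothetical helper name, and the check that the probabilities sum to one is an added safeguard rather than part of the formula.

```python
import numpy as np

def ensemble_density_matrix(probs, states):
    """Return sum_i p_i |psi_i><psi_i| (hypothetical helper)."""
    probs = np.asarray(probs, dtype=float)
    if not np.isclose(probs.sum(), 1.0):
        raise ValueError("probabilities must sum to 1")
    dim = len(states[0])
    rho = np.zeros((dim, dim), dtype=complex)
    for p, psi in zip(probs, states):
        psi = np.asarray(psi, dtype=complex)
        rho += p * np.outer(psi, psi.conj())  # weight each pure density matrix
    return rho

ket0 = np.array([1, 0])
ket1 = np.array([0, 1])

# The technician's qubit: |0> or |1>, each with probability 1/2.
rho_mixed = ensemble_density_matrix([0.5, 0.5], [ket0, ket1])
print(rho_mixed.real)  # half the identity; the off-diagonal entries are zero
```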
Two states that look the same — and aren't
Here is the punchline. Measure |{+}\rangle in the computational basis and
you get 0 or 1, 50/50. Measure the mixed
\tfrac12 I and you also get 0 or
1, 50/50. In that one experiment they are indistinguishable — yet their
density matrices are visibly different:
\rho_{+} = \begin{bmatrix} \tfrac12 & \tfrac12 \\[2pt] \tfrac12 & \tfrac12 \end{bmatrix} \;\neq\; \tfrac12 I = \begin{bmatrix} \tfrac12 & 0 \\[2pt] 0 & \tfrac12 \end{bmatrix}.
The difference lives entirely in the coherences. And they are physically real: change
basis before measuring (apply a Hadamard, then measure in the computational basis)
and |{+}\rangle gives 0 with
certainty, while the mixed state \tfrac12 I still splits 50/50. The density matrix remembers what a single
computational-basis measurement forgets.
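The Hadamard experiment can be simulated in a few lines. This sketch assumes Python with NumPy and relies on one standard rule the text has not stated: a unitary U acts on a density matrix as U \rho U^\dagger. The function names are hypothetical.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard gate

rho_plus = np.array([[0.5, 0.5], [0.5, 0.5]])
rho_mixed = 0.5 * np.eye(2)

def computational_probs(rho):
    """Probabilities of outcomes 0 and 1: the diagonal of rho."""
    return np.real(np.diag(rho))

def apply_unitary(rho, U):
    """Evolve a density matrix: rho -> U rho U^dagger (standard rule, assumed here)."""
    return U @ rho @ U.conj().T

print(computational_probs(rho_plus))                     # 50/50
print(computational_probs(rho_mixed))                    # 50/50
print(computational_probs(apply_unitary(rho_plus, H)))   # outcome 0 with certainty
print(computational_probs(apply_unitary(rho_mixed, H)))  # still 50/50
```

Before the Hadamard the two diagonals agree. After it, the coherences of \rho_{+} have been rotated onto the diagonal, while \tfrac12 I is unchanged.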
It is tempting to read |{+}\rangle as "the qubit is secretly
0 or 1, I just haven't peeked." That is exactly
the mixed state \tfrac12 I — a coin lying face down. But
|{+}\rangle is something else: a definite quantum state whose two
branches can interfere. The coherences let the amplitudes add or cancel under a
change of basis, so |{+}\rangle becomes perfectly predictable in the
\pm basis. The face-down coin can never be made predictable by any basis
change — there is nothing to interfere. Interference is the experimental line between "a definite
superposition" and "mere ignorance," and the density matrix draws it: coherences present versus
coherences gone.
What makes a matrix a density matrix
Not every matrix describes a state. A density matrix is exactly one that is:
- Hermitian, \rho^\dagger = \rho — like every
observable;
- unit trace, \operatorname{Tr}(\rho) = 1 — the
probabilities on the diagonal sum to one;
- positive semidefinite, \rho \succeq 0 — no negative
eigenvalues, so no negative probabilities.
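The three conditions can be turned into a small validity check. The following is a sketch only, assuming Python with NumPy; the name `is_density_matrix` and the tolerance `tol` are illustrative choices, since floating-point results rarely hit 1 or 0 exactly.

```python
import numpy as np

def is_density_matrix(rho, tol=1e-9):
    """Check Hermitian, unit trace, positive semidefinite (hypothetical helper)."""
    rho = np.asarray(rho, dtype=complex)
    if rho.ndim != 2 or rho.shape[0] != rho.shape[1]:
        return False
    # 1. Hermitian: rho equals its conjugate transpose.
    if not np.allclose(rho, rho.conj().T, atol=tol):
        return False
    # 2. Unit trace.
    unit_trace = abs(np.trace(rho) - 1) < tol
    # 3. Positive semidefinite: eigvalsh assumes a Hermitian input (checked above).
    eigenvalues = np.linalg.eigvalsh(rho)
    positive = np.all(eigenvalues >= -tol)
    return bool(unit_trace and positive)

print(is_density_matrix([[0.5, 0.5], [0.5, 0.5]]))  # the |+> state
print(is_density_matrix([[0.5, 0.0], [0.0, 0.5]]))  # the maximally mixed state
print(is_density_matrix([[1.5, 0.0], [0.0, -0.5]])) # trace 1, but a negative eigenvalue
print(is_density_matrix([[0.5, 0.5], [0.0, 0.5]]))  # trace 1, but not Hermitian
```

The last two inputs show why unit trace alone is not enough: each has trace 1 yet fails one of the other conditions.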
Two more facts make \rho the tool of choice. First, a clean test for
purity: a state is pure exactly when
\operatorname{Tr}(\rho^2) = 1, \qquad\text{and mixed when } \operatorname{Tr}(\rho^2) < 1.
The quantity \operatorname{Tr}(\rho^2) is called the purity;
for a qubit it runs from 1 (pure) down to \tfrac12
(maximally mixed). Second, every measurement prediction comes from one formula — the expected value of
an observable
A is
\langle A \rangle = \operatorname{Tr}(\rho A).
Worked example: pure or mixed?
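Test \rho_{+} and the coin with the purity rule. A pure density matrix is a
projector: |\psi\rangle\langle\psi|\psi\rangle\langle\psi| = |\psi\rangle\langle\psi|
because \langle\psi|\psi\rangle = 1. So \rho_{+}^2 = \rho_{+} and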
\operatorname{Tr}(\rho_{+}^2) = \operatorname{Tr}(\rho_{+}) = 1 \quad\Rightarrow\quad \text{pure.}\ \checkmark
For the coin, \big(\tfrac12 I\big)^2 = \tfrac14 I, whose trace is
\tfrac14 + \tfrac14 = \tfrac12:
\operatorname{Tr}\!\big[(\tfrac12 I)^2\big] = \tfrac12 < 1 \quad\Rightarrow\quad \text{mixed.}
The number itself is meaningful: \tfrac12 is the smallest purity a qubit
can have, so this really is the most mixed a single qubit ever gets.
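The same purity test takes one line of linear algebra. A minimal sketch, assuming Python with NumPy; the function name `purity` is a hypothetical choice.

```python
import numpy as np

def purity(rho):
    """Tr(rho^2): equals 1 for a pure state, below 1 for a mixed one."""
    rho = np.asarray(rho, dtype=complex)
    return float(np.real(np.trace(rho @ rho)))

rho_plus = np.array([[0.5, 0.5], [0.5, 0.5]])
rho_mixed = 0.5 * np.eye(2)

# A pure density matrix is a projector, so squaring it changes nothing.
print(np.allclose(rho_plus @ rho_plus, rho_plus))

print(purity(rho_plus))   # 1 up to rounding: pure
print(purity(rho_mixed))  # one half: maximally mixed
```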
Worked example: an expectation value
Let the observable be Z = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
(which reads +1 on |0\rangle and
-1 on |1\rangle). Its average in the mixed state
\rho = \tfrac12 I is
\langle Z \rangle = \operatorname{Tr}(\rho Z) = \operatorname{Tr}\!\left(\begin{bmatrix} \tfrac12 & 0 \\ 0 & -\tfrac12 \end{bmatrix}\right) = \tfrac12 - \tfrac12 = 0.
Zero — a perfect 50/50 balance of +1 and -1
outcomes, exactly as expected. The single formula
\langle A \rangle = \operatorname{Tr}(\rho A) handles pure and mixed states
alike, which is precisely why physicists reach for the density matrix.
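The trace formula is just as short in code. The sketch below assumes Python with NumPy; `expectation` is a hypothetical helper. The Pauli X observable is an extra not used in the worked example, included only to show an observable that is sensitive to the coherences.

```python
import numpy as np

def expectation(rho, A):
    """<A> = Tr(rho A) for a density matrix rho and an observable A."""
    rho = np.asarray(rho, dtype=complex)
    A = np.asarray(A, dtype=complex)
    return float(np.real(np.trace(rho @ A)))

Z = np.array([[1, 0], [0, -1]])
X = np.array([[0, 1], [1, 0]])  # Pauli X: an assumption added for illustration

rho_0 = np.array([[1, 0], [0, 0]])
rho_plus = np.array([[0.5, 0.5], [0.5, 0.5]])
rho_mixed = 0.5 * np.eye(2)

print(expectation(rho_mixed, Z))  # 0, matching the worked example
print(expectation(rho_0, Z))      # +1: definitely |0>
print(expectation(rho_plus, Z))   # 0: Z alone cannot tell |+> from the coin
print(expectation(rho_plus, X))   # +1: X picks up the coherences
print(expectation(rho_mixed, X))  # 0: the coin has none
```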
Key points
- a state is described by a matrix \rho that is Hermitian,
has unit trace \operatorname{Tr}(\rho)=1, and is
positive semidefinite;
- a pure state is \rho = |\psi\rangle\langle\psi|; a
general ensemble is \rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|;
- purity test: pure iff \operatorname{Tr}(\rho^2)=1,
mixed if \operatorname{Tr}(\rho^2)<1;
- every prediction is one formula: \langle A \rangle = \operatorname{Tr}(\rho A);
- coherences (off-diagonal terms) distinguish a superposition from a classical mixture, even when
both give the same measurement odds.
Two slips to avoid. First, the purity test is \operatorname{Tr}(\rho^2)=1, not
\operatorname{Tr}(\rho)=1. The latter is true for every density
matrix, pure or mixed, so it tells you nothing about purity. Always square first. Second, "diagonal" is
not the same as "mixed." The pure state \rho_0 = |0\rangle\langle0| = \operatorname{diag}(1,0)
is perfectly diagonal yet perfectly pure (\operatorname{Tr}(\rho_0^2) = 1).
What signals a classical mixture is not the shape of the matrix but its purity dropping below
1 — equivalently, having two or more non-zero eigenvalues. Diagonal
with a single 1 is pure; diagonal with two halves is mixed.
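Both slips show up clearly when the three example states are put side by side. This is a sketch only, assuming Python with NumPy; `nonzero_eigenvalue_count` and its tolerance are hypothetical choices.

```python
import numpy as np

def nonzero_eigenvalue_count(rho, tol=1e-9):
    """Count eigenvalues of a Hermitian rho that exceed tol."""
    eigenvalues = np.linalg.eigvalsh(np.asarray(rho, dtype=complex))
    return int(np.sum(eigenvalues > tol))

states = {
    "rho_0 (diagonal, pure)": np.diag([1.0, 0.0]),
    "rho_+ (not diagonal, pure)": np.full((2, 2), 0.5),
    "I/2 (diagonal, mixed)": 0.5 * np.eye(2),
}

for name, rho in states.items():
    trace = np.trace(rho).real             # 1 for every density matrix
    purity = np.trace(rho @ rho).real      # 1 only for the pure ones
    rank = nonzero_eigenvalue_count(rho)   # 1 for pure, 2 for the mixed qubit
    print(f"{name}: Tr(rho)={trace:.2f}  Tr(rho^2)={purity:.2f}  nonzero eigenvalues={rank}")
```

The trace column is identical for all three rows, while the purity and eigenvalue columns separate the pure states from the mixed one regardless of whether the matrix is diagonal.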