The Density Matrix

A state vector |\psi\rangle is a wonderfully compact way to describe a qubit — but it can only describe a pure state, one you know exactly. Real life is messier. Suppose a lab technician hands you a qubit and says, "it's |0\rangle half the time and |1\rangle the other half, but I've lost track of which this one is." That is ordinary classical uncertainty, and no single vector |\psi\rangle can capture it. The same problem appears when you look at just one half of an entangled pair: the piece in your hand has no state vector of its own.

The fix is to upgrade from a vector to a matrix. The density matrix \rho (Greek "rho") describes any quantum state — pure or mixed, whole system or subsystem — in one uniform object. For a pure state it is the outer product of the vector with itself:

\rho = |\psi\rangle\langle\psi|.

Where \langle\psi|\psi\rangle (the inner product) collapses a vector to a number, |\psi\rangle\langle\psi| (the outer product) fans it out into a matrix.

The density matrix of a pure state

To build \rho = |\psi\rangle\langle\psi|, multiply the column |\psi\rangle by the row \langle\psi| (its conjugate transpose). For the basis state |0\rangle = \begin{bmatrix} 1 \\ 0 \end{bmatrix}:

\rho_0 = |0\rangle\langle 0| = \begin{bmatrix} 1 \\ 0 \end{bmatrix}\begin{bmatrix} 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}.

A single 1 on the diagonal — "this qubit is definitely |0\rangle." Now the balanced superposition |{+}\rangle = \tfrac{1}{\sqrt2}\big(|0\rangle + |1\rangle\big). Its column is \tfrac{1}{\sqrt2}\begin{bmatrix} 1 \\ 1 \end{bmatrix}, so

\rho_{+} = |{+}\rangle\langle{+}| = \tfrac{1}{2}\begin{bmatrix} 1 \\ 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \end{bmatrix} = \begin{bmatrix} \tfrac12 & \tfrac12 \\[2pt] \tfrac12 & \tfrac12 \end{bmatrix}.

Every entry is \tfrac12. Those off-diagonal terms are called coherences, and — hold that thought — they are the fingerprint of a genuine superposition.
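The two matrices above can be reproduced numerically. The following is a minimal sketch, assuming NumPy is available; the helper name pure_density is illustrative and not part of any library.

```python
import numpy as np

def pure_density(psi):
    """Return the outer product |psi><psi| of a state vector with itself."""
    psi = np.asarray(psi, dtype=complex)
    psi = psi / np.linalg.norm(psi)      # guard against unnormalized input
    # np.outer(a, b)[i, j] == a[i] * b[j]; conjugating b supplies the bra <psi|
    return np.outer(psi, psi.conj())

ket0 = np.array([1, 0])
ket_plus = np.array([1, 1]) / np.sqrt(2)

rho_0 = pure_density(ket0)
rho_plus = pure_density(ket_plus)

print(rho_0.real)      # a single 1 in the top-left corner
print(rho_plus.real)   # every entry close to 0.5

# Contrast: the inner product <psi|psi> is one number, not a matrix
norm_sq = np.vdot(ket_plus, ket_plus)
print(norm_sq)
```

Conjugating the second argument of np.outer makes no difference for these real vectors, but it is what keeps the result correct for states with complex amplitudes.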

Mixing states: the ensemble

Now the classical uncertainty. An ensemble is a list of states |\psi_i\rangle each prepared with some probability p_i (with \sum_i p_i = 1). Its density matrix is the probability-weighted average of the individual pure density matrices:

\rho = \sum_i p_i\, |\psi_i\rangle\langle\psi_i|.

A state whose density matrix cannot be written as a single |\psi\rangle\langle\psi| is called mixed. Take our technician's qubit: a 50/50 classical mixture of |0\rangle and |1\rangle. Add the two pure density matrices with weight \tfrac12 each:

\rho = \tfrac12|0\rangle\langle0| + \tfrac12|1\rangle\langle1| = \tfrac12\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + \tfrac12\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} \tfrac12 & 0 \\[2pt] 0 & \tfrac12 \end{bmatrix} = \tfrac12 I.

This is the maximally mixed state — the state of total ignorance, half of the identity matrix. Notice its off-diagonal entries are zero: no coherences.
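The weighted sum is easy to turn into code. Below is a small sketch, assuming NumPy; ensemble_density is a hypothetical helper written for this example, and it expects each state vector to be normalized already.

```python
import numpy as np

def ensemble_density(probs, states):
    """Return sum_i p_i |psi_i><psi_i| for probabilities and state vectors."""
    probs = np.asarray(probs, dtype=float)
    if np.any(probs < 0) or not np.isclose(probs.sum(), 1.0):
        raise ValueError("probabilities must be non-negative and sum to 1")
    dim = len(states[0])
    rho = np.zeros((dim, dim), dtype=complex)
    for p, psi in zip(probs, states):
        psi = np.asarray(psi, dtype=complex)
        rho += p * np.outer(psi, psi.conj())   # weight each pure-state matrix
    return rho

ket0 = np.array([1, 0])
ket1 = np.array([0, 1])

# The technician's qubit: |0> or |1>, each with probability 1/2
rho_mixed = ensemble_density([0.5, 0.5], [ket0, ket1])
print(rho_mixed.real)
print(np.allclose(rho_mixed, np.eye(2) / 2))   # compare against (1/2) I
```

Passing a single state with probability 1 gives back the pure-state matrix, so the same function covers both cases.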

Two states that look the same — and aren't

Here is the punchline. Measure |{+}\rangle in the computational basis and you get 0 or 1, 50/50. Measure the mixed \tfrac12 I and you also get 0 or 1, 50/50. In that one experiment they are indistinguishable — yet their density matrices are visibly different:

\rho_{+} = \begin{bmatrix} \tfrac12 & \tfrac12 \\[2pt] \tfrac12 & \tfrac12 \end{bmatrix} \;\neq\; \tfrac12 I = \begin{bmatrix} \tfrac12 & 0 \\[2pt] 0 & \tfrac12 \end{bmatrix}.

The difference lives entirely in the coherences, and they are physically real. Change basis before measuring (apply a Hadamard, then measure in the computational basis) and |{+}\rangle gives 0 with certainty, while the mixed state \tfrac12 I still splits 50/50. The density matrix remembers what a single computational-basis measurement forgets.

It is tempting to read |{+}\rangle as "the qubit is secretly 0 or 1, I just haven't peeked." That is exactly the mixed state \tfrac12 I: a coin lying face down. But |{+}\rangle is something else: a definite quantum state whose two branches can interfere. The coherences let the amplitudes add or cancel under a change of basis, so |{+}\rangle becomes perfectly predictable in the \pm basis, the basis made of |{+}\rangle and |{-}\rangle = \tfrac{1}{\sqrt2}\big(|0\rangle - |1\rangle\big). The face-down coin can never be made predictable by any basis change, because \tfrac12 I looks the same in every basis and there is nothing to interfere. Interference is the experimental line between "a definite superposition" and "mere ignorance," and the density matrix draws it: coherences present versus coherences gone.
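The Hadamard check described above can be simulated directly. This is a sketch under stated assumptions: NumPy is available, a gate U acts on a density matrix as U \rho U^\dagger, and the probability of outcome 0 is the top-left entry of the result. The function name prob_of_zero is made up for this example.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard gate
I2 = np.eye(2)                                  # "do nothing" before measuring

def prob_of_zero(rho, U):
    """P(outcome 0) when U is applied to rho, then the qubit is measured
    in the computational basis: the top-left entry of U rho U^dagger."""
    rotated = U @ rho @ U.conj().T
    return float(rotated[0, 0].real)

rho_plus = np.full((2, 2), 0.5)    # |+><+|
rho_mixed = np.eye(2) / 2          # (1/2) I

for name, rho in [("plus", rho_plus), ("mixed", rho_mixed)]:
    direct = prob_of_zero(rho, I2)
    after_h = prob_of_zero(rho, H)
    print(f"{name}: P(0) direct = {direct:.3f}, after Hadamard = {after_h:.3f}")
# plus: P(0) direct = 0.500, after Hadamard = 1.000
# mixed: P(0) direct = 0.500, after Hadamard = 0.500
```

Both states agree on the direct measurement and disagree only after the basis change, which is where the coherences show up.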

What makes a matrix a density matrix

Not every matrix describes a state. A density matrix is exactly one that is:

Hermitian, \rho^\dagger = \rho — like every observable;
unit trace, \operatorname{Tr}(\rho) = 1 — the probabilities on the diagonal sum to one;
positive semidefinite, \rho \succeq 0 — no negative eigenvalues, so no negative probabilities.
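The three conditions translate into a short numerical check. The sketch below assumes NumPy; is_density_matrix and the tolerance value are illustrative choices, not a standard API.

```python
import numpy as np

def is_density_matrix(rho, tol=1e-9):
    """Check the three defining properties: Hermitian, unit trace, PSD."""
    rho = np.asarray(rho, dtype=complex)
    if rho.ndim != 2 or rho.shape[0] != rho.shape[1]:
        return False
    hermitian = np.allclose(rho, rho.conj().T, atol=tol)
    unit_trace = np.isclose(np.trace(rho), 1.0, atol=tol)
    if not (hermitian and unit_trace):
        return False
    # eigvalsh assumes a Hermitian input, so call it only after that check
    eigenvalues = np.linalg.eigvalsh(rho)
    return bool(np.all(eigenvalues >= -tol))

rho_plus = np.full((2, 2), 0.5)
not_hermitian = np.array([[0.5, 0.5], [0.0, 0.5]])
wrong_trace = np.eye(2)
negative_eigenvalue = np.diag([1.5, -0.5])   # Hermitian, trace 1, but not PSD

print(is_density_matrix(rho_plus))             # True
print(is_density_matrix(not_hermitian))        # False
print(is_density_matrix(wrong_trace))          # False
print(is_density_matrix(negative_eigenvalue))  # False
```

The last example is the instructive one: it passes the first two tests and fails only on positivity, so all three conditions are needed.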

Two more facts make \rho the tool of choice. First, a clean test for purity: a state is pure exactly when

\operatorname{Tr}(\rho^2) = 1, \qquad\text{and mixed when } \operatorname{Tr}(\rho^2) < 1.

The quantity \operatorname{Tr}(\rho^2) is called the purity; for a qubit it runs from 1 (pure) down to \tfrac12 (maximally mixed). Second, every measurement prediction comes from one formula: the expectation value of an observable A is

\langle A \rangle = \operatorname{Tr}(\rho A).
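
A short derivation shows where this formula comes from, using the ensemble definition \rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|. Each member of the ensemble contributes its ordinary expectation value \langle\psi_i|A|\psi_i\rangle with weight p_i, and the identity \langle\psi|A|\psi\rangle = \operatorname{Tr}\big(|\psi\rangle\langle\psi|\,A\big) together with the linearity of the trace does the rest:

\langle A \rangle = \sum_i p_i\, \langle\psi_i|A|\psi_i\rangle = \sum_i p_i\, \operatorname{Tr}\!\big(|\psi_i\rangle\langle\psi_i|\,A\big) = \operatorname{Tr}\!\Big(\Big[\sum_i p_i\, |\psi_i\rangle\langle\psi_i|\Big] A\Big) = \operatorname{Tr}(\rho A).

For a pure state the sum has a single term with p = 1, and the formula reduces to the familiar \langle\psi|A|\psi\rangle.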

Worked example: pure or mixed?

Test \rho_{+} and the coin with the purity rule. A pure density matrix is a projector, so \rho_{+}^2 = \rho_{+} and

\operatorname{Tr}(\rho_{+}^2) = \operatorname{Tr}(\rho_{+}) = 1 \quad\Rightarrow\quad \text{pure.}\ \checkmark

For the coin, \big(\tfrac12 I\big)^2 = \tfrac14 I, whose trace is \tfrac14 + \tfrac14 = \tfrac12:

\operatorname{Tr}\!\big[(\tfrac12 I)^2\big] = \tfrac12 < 1 \quad\Rightarrow\quad \text{mixed.}

The number itself is meaningful: \tfrac12 is the smallest purity a qubit can have, so this really is the most mixed a single qubit ever gets.
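The same test can be run numerically. This is a minimal sketch assuming NumPy; the names purity and is_pure are illustrative, and the 75/25 mixture is a hypothetical extra case added only for contrast.

```python
import numpy as np

def purity(rho):
    """Return Tr(rho^2); the result is real for any Hermitian rho."""
    rho = np.asarray(rho, dtype=complex)
    return float(np.trace(rho @ rho).real)

def is_pure(rho, tol=1e-9):
    """Apply the purity rule: pure exactly when Tr(rho^2) = 1 (up to tol)."""
    return abs(purity(rho) - 1.0) < tol

rho_plus = np.full((2, 2), 0.5)       # |+><+|
rho_coin = np.eye(2) / 2              # the coin, (1/2) I
rho_biased = np.diag([0.75, 0.25])    # hypothetical 75/25 mixture of |0> and |1>

for name, rho in [("plus", rho_plus), ("coin", rho_coin), ("biased", rho_biased)]:
    print(name, purity(rho), is_pure(rho))
# plus 1.0 True
# coin 0.5 False
# biased 0.625 False
```

The biased mixture lands between the two extremes, which is what one would expect from a state that is mixed but not maximally so.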

Worked example: an expectation value

Let the observable be Z = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} (which reads +1 on |0\rangle and -1 on |1\rangle). Its average in the mixed state \rho = \tfrac12 I is

\langle Z \rangle = \operatorname{Tr}(\rho Z) = \operatorname{Tr}\!\left(\begin{bmatrix} \tfrac12 & 0 \\ 0 & -\tfrac12 \end{bmatrix}\right) = \tfrac12 - \tfrac12 = 0.

Zero — a perfect 50/50 balance of +1 and -1 outcomes, exactly as expected. The single formula \langle A \rangle = \operatorname{Tr}(\rho A) handles pure and mixed states alike, which is precisely why physicists reach for the density matrix.
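As a numerical check of the trace formula, here is a small sketch assuming NumPy. The function name expectation is illustrative, and the Pauli X observable is an addition for this example that does not appear in the text above.

```python
import numpy as np

def expectation(rho, A):
    """Return <A> = Tr(rho A); real whenever rho and A are both Hermitian."""
    return float(np.trace(np.asarray(rho) @ np.asarray(A)).real)

Z = np.array([[1, 0], [0, -1]])
X = np.array([[0, 1], [1, 0]])     # Pauli X, an extra observable for illustration

rho_plus = np.full((2, 2), 0.5)    # |+><+|
rho_mixed = np.eye(2) / 2          # (1/2) I

print(expectation(rho_mixed, Z))   # 0.0, matching the worked example
print(expectation(rho_plus, Z))    # 0.0, same Z statistics as the mixture
print(expectation(rho_mixed, X))   # 0.0
print(expectation(rho_plus, X))    # 1.0, the coherences make the difference
```

Z cannot tell the two states apart, while X can: its expectation value picks up exactly the off-diagonal entries of \rho.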

In summary:

a state is described by a matrix \rho that is Hermitian, has unit trace \operatorname{Tr}(\rho)=1, and is positive semidefinite;
a pure state is \rho = |\psi\rangle\langle\psi|; a general ensemble is \rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|;
purity test: pure iff \operatorname{Tr}(\rho^2)=1, mixed if \operatorname{Tr}(\rho^2)<1;
every prediction is one formula: \langle A \rangle = \operatorname{Tr}(\rho A);
coherences (off-diagonal terms) distinguish a superposition from a classical mixture, even when both give the same measurement odds.

Two slips to avoid. First, the test for purity is \operatorname{Tr}(\rho^2)=1, not \operatorname{Tr}(\rho)=1. The latter holds for every density matrix, pure or mixed, so it tells you nothing about purity. Always square first. Second, "diagonal" is not the same as "mixed." The pure state \rho_0 = |0\rangle\langle0| = \operatorname{diag}(1,0) is perfectly diagonal yet perfectly pure (\operatorname{Tr}(\rho_0^2) = 1). What signals a classical mixture is not the shape of the matrix but its purity dropping below 1, or equivalently, its having two or more non-zero eigenvalues. Diagonal with a single 1 is pure; diagonal with two halves is mixed.
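
The link between purity and eigenvalues can be made explicit in a couple of lines. Write the eigenvalues of \rho as \lambda_i (a symbol introduced here for the purpose). Positive semidefiniteness gives \lambda_i \ge 0, unit trace gives \sum_i \lambda_i = 1, and therefore

\operatorname{Tr}(\rho^2) = \sum_i \lambda_i^2 \le \Big(\sum_i \lambda_i\Big)^2 = 1,

with equality exactly when one eigenvalue is 1 and all the others vanish. For a qubit with eigenvalues \lambda and 1-\lambda,

\operatorname{Tr}(\rho^2) = \lambda^2 + (1-\lambda)^2 = \tfrac12 + 2\big(\lambda - \tfrac12\big)^2 \ge \tfrac12,

so the purity reaches its minimum \tfrac12 only at \lambda = \tfrac12, which is the state \tfrac12 I, and returns to 1 at \lambda = 0 or \lambda = 1, which is the case \operatorname{diag}(1,0).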