Spectral Theory of Operators
In a first linear-algebra course you learned to diagonalise a matrix: find the
eigenvalues \lambda and eigenvectors of an
n \times n matrix A, and — when there are
enough of them — rewrite A as a stretch by \lambda_i
along each eigen-direction. For a symmetric real matrix this always works perfectly: the
spectral theorem for matrices hands you an orthonormal basis of eigenvectors and real eigenvalues.
The whole subject in front of you is the answer to one question:
what survives when the matrix becomes an infinite-dimensional operator?
The stakes are not academic. In quantum mechanics an observable — energy, position,
momentum — is a self-adjoint
operator
on a Hilbert space of states, and the numbers a laboratory can actually measure are exactly
the points of that operator's spectrum. The discrete energy levels of a hydrogen
atom, the allowed frequencies of a vibrating string, the resonances of a drum — each is a spectrum.
So "find the spectrum" is the infinite-dimensional cousin of "find the eigenvalues", and getting it
right is the difference between predicting a spectral line and not.
Here is the first surprise, and the organising idea of the whole page. In finite dimensions the set
of eigenvalues is the whole story: A - \lambda I fails to be
invertible precisely when it has a non-trivial kernel, i.e. precisely when
\lambda is an eigenvalue. In infinite dimensions
invertibility can fail without any kernel at all — an operator can be injective yet
still not invertible, because its inverse is unbounded or its range is too small. The spectrum is
therefore bigger than the eigenvalues, and learning to see that gap is the heart of the
matter.
The definition: invertibility, not eigenvalues
Fix a complex Banach space X \ne \{0\} and a
bounded linear operator
T : X \to X. Throughout, I is the identity and
we abbreviate T - \lambda I to T - \lambda. We
ask, for each complex number \lambda, a single yes/no question:
is T - \lambda invertible as a bounded operator? By the
bounded-inverse theorem, "invertible" here means bijective — a bounded inverse then comes for free.
-
The resolvent set is
\rho(T) = \{\, \lambda \in \mathbb{C} : T - \lambda \text{ is a bijection of } X \text{ onto } X \,\}.
For \lambda \in \rho(T) the bounded inverse
R_\lambda = (T - \lambda)^{-1} is the resolvent.
-
The spectrum is everything else:
\sigma(T) = \mathbb{C} \setminus \rho(T) = \{\, \lambda \in \mathbb{C} : T - \lambda \text{ is not invertible} \,\}.
-
A number \lambda is an eigenvalue when
T - \lambda is not injective, i.e.
(T - \lambda)x = 0 for some x \ne 0.
Every eigenvalue lies in \sigma(T) — but not conversely.
In finite dimensions a linear map is injective iff surjective iff invertible, so the three
conditions collapse into one and \sigma(A) is just the eigenvalue list. The whole
richness of spectral theory comes from these implications breaking apart once
X is infinite-dimensional: an operator can be injective but not
surjective, or have dense-but-not-closed range, and each failure is a different way to land in the
spectrum.
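The finite-dimensional collapse can be checked numerically. The sketch below is only an illustration: the matrix A and the helper is_singular are assumptions made for this example, not part of the text. It confirms that A - \lambda I loses invertibility exactly at the eigenvalues.

```python
import numpy as np

# A small symmetric real matrix (illustrative choice, not taken from the text).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is meant for symmetric/Hermitian input: it returns real eigenvalues in
# ascending order and orthonormal eigenvectors as the columns of Q.
eigenvalues, Q = np.linalg.eigh(A)

def is_singular(matrix, lam, tol=1e-10):
    """True when matrix - lam*I is numerically non-invertible.

    Hypothetical helper: it tests whether the smallest singular value is ~0.
    """
    shifted = matrix - lam * np.eye(matrix.shape[0])
    smallest = np.linalg.svd(shifted, compute_uv=False)[-1]
    return smallest < tol

print(eigenvalues)                                    # the eigenvalue list of A
print([is_singular(A, lam) for lam in eigenvalues])   # singular at each eigenvalue
print(is_singular(A, 2.0))                            # 2 is not an eigenvalue of A
```

For a matrix there is nothing else to find: every \lambda that is not an eigenvalue gives an invertible A - \lambda I.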
Anatomy of the spectrum: three ways to fail
Because T - \lambda can miss invertibility for genuinely different
reasons, the spectrum splits into three disjoint pieces according to how the map fails. Ask
two questions: is T - \lambda injective, and is its range dense in
X?
-
Point spectrum \sigma_p(T).
T - \lambda is not injective. Then
\lambda is an honest eigenvalue with an eigenvector in
the kernel — the finite-dimensional notion, still present here.
-
Continuous spectrum \sigma_c(T).
T - \lambda is injective and has dense range, but that range
is not all of X. The inverse exists on a dense set but is
unbounded, so there is no bounded inverse: invertibility fails "at infinity", not
by a kernel.
-
Residual spectrum \sigma_r(T).
T - \lambda is injective but its range is not even dense.
There is a whole direction the operator's image never approaches.
\sigma(T) = \underbrace{\sigma_p(T)}_{\text{eigenvalues}} \;\sqcup\; \underbrace{\sigma_c(T)}_{\text{dense, not onto}} \;\sqcup\; \underbrace{\sigma_r(T)}_{\text{range not dense}} .
The single most important sentence on this page: in infinite dimensions
\sigma_c and \sigma_r can be non-empty even
when \sigma_p is empty. An operator may have no eigenvalues
whatsoever and still have a large spectrum. Keep this in view — the shift operator below is
exactly such a beast, and the misconception it kills is the classic exam trap.
What the spectrum always looks like
Before computing any example, three structural facts pin down where the spectrum can live. They are
proved with the Neumann series and a little complex analysis, and they hold for every
bounded operator on a complex Banach space.
-
Bounded. If |\lambda| > \lVert T \rVert then
T - \lambda = -\lambda\,(I - \lambda^{-1} T) is inverted by the
convergent Neumann series
-\lambda^{-1}\sum_{n \ge 0} (\lambda^{-1} T)^n. Hence
\sigma(T) \subseteq \{\, \lambda : |\lambda| \le \lVert T \rVert \,\}.
-
Closed, hence compact. The resolvent set is open (invertibility is stable under
small perturbations), so \sigma(T) is closed; being also bounded, it
is a compact subset of \mathbb{C}.
-
Non-empty. If \sigma(T) were empty, the resolvent
\lambda \mapsto R_\lambda would be a bounded entire operator-valued
function vanishing at infinity; Liouville's theorem forces it to be
0, which is impossible because an inverse on X \ne \{0\} cannot be the zero operator. So \sigma(T) \ne \varnothing.
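The Neumann-series argument for boundedness can be tried on a small matrix. This is a minimal sketch under stated assumptions: the matrix T, the point lam and the function name neumann_resolvent are all made up for the illustration, and the series is simply truncated after a fixed number of terms.

```python
import numpy as np

def neumann_resolvent(T, lam, terms=200):
    """Approximate (T - lam*I)^{-1} by -lam^{-1} * sum_{n < terms} (T/lam)^n.

    Only meaningful when |lam| > ||T||, so that the series converges.
    """
    n = T.shape[0]
    power = np.eye(n, dtype=complex)        # holds (T/lam)^k, starting at k = 0
    total = np.zeros((n, n), dtype=complex)
    for _ in range(terms):
        total += power
        power = power @ (T / lam)
    return -total / lam

# Illustrative data (assumptions): a 2x2 matrix and a point outside the disc.
T = np.array([[0.5, 1.0],
              [0.0, 0.3]])
lam = 2.0 + 1.0j
norm_T = np.linalg.norm(T, 2)               # operator norm = largest singular value

approx = neumann_resolvent(T, lam)
exact = np.linalg.inv(T - lam * np.eye(2))

print(abs(lam) > norm_T)                    # lam lies outside the disc of radius ||T||
print(np.max(np.abs(approx - exact)))       # truncation error of the series
```

Inside the disc |\lambda| \le \lVert T \rVert the series gives no information, which is why the bound only says where the spectrum cannot be.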
Compactness plus non-emptiness means the maximum in the next definition is genuinely attained.
r(T) = \max\{\, |\lambda| : \lambda \in \sigma(T) \,\} \qquad (\text{the } \textbf{spectral radius}).
The bound above says r(T) \le \lVert T \rVert. Remarkably, the spectral
radius is computable from the norms of powers of T alone, with
no reference to any \lambda — this is Gelfand's formula:
For every bounded operator on a complex Banach space,
r(T) = \lim_{n \to \infty} \lVert T^n \rVert^{1/n} = \inf_{n \ge 1} \lVert T^n \rVert^{1/n} \;\le\; \lVert T \rVert.
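The limit exists (submultiplicativity, \lVert T^{m+n} \rVert \le \lVert T^m \rVert \, \lVert T^n \rVert, makes \log \lVert T^n \rVert
sub-additive, and Fekete's lemma then gives both the limit and the infimum), and it can be strictly smaller than \lVert T \rVert.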
A stark example: a nonzero nilpotent operator N with
N^2 = 0 has \lVert N \rVert > 0 but
\lVert N^n \rVert = 0 for n \ge 2, so
r(N) = 0 and \sigma(N) = \{0\}: a "big" operator
with a one-point spectrum. Gelfand's formula is what tells norm and spectral radius apart.
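Gelfand's formula is easy to watch in action on 2 \times 2 matrices. In the sketch below the matrices N and J and the function gelfand_sequence are assumptions chosen for illustration: N is the nilpotent example from the text, and J is a non-normal matrix with spectral radius 1/2 but norm larger than 1.

```python
import numpy as np

def gelfand_sequence(T, n_max):
    """Return [||T^n||^(1/n) for n = 1..n_max], using the operator 2-norm."""
    values = []
    power = np.eye(T.shape[0])
    for n in range(1, n_max + 1):
        power = power @ T                       # power now holds T^n
        values.append(np.linalg.norm(power, 2) ** (1.0 / n))
    return values

# Nilpotent: N^2 = 0, so every term after the first is exactly 0.
N = np.array([[0.0, 1.0],
              [0.0, 0.0]])

# Non-normal with both eigenvalues equal to 0.5 (illustrative choice).
J = np.array([[0.5, 1.0],
              [0.0, 0.5]])

seq_N = gelfand_sequence(N, 3)
seq_J = gelfand_sequence(J, 200)

print(seq_N)                      # first term is ||N||, the rest vanish
print(seq_J[0], seq_J[-1])        # starts above 1, creeps down toward 0.5
```

The convergence for J is slow because \lVert J^n \rVert grows like n \cdot 2^{-n}; the n-th root removes the polynomial factor only in the limit.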
Worked example 1: a multiplication operator — spectrum = range
Let X = L^2[0,1] and let \varphi \in C[0,1] be
continuous. Define the multiplication operator
M_\varphi by
(M_\varphi f)(x) = \varphi(x)\,f(x). It is bounded with
\lVert M_\varphi \rVert = \max_x |\varphi(x)|. Claim:
\sigma(M_\varphi) = \varphi([0,1]) \quad (\text{the range of } \varphi).
Why. The operator M_\varphi - \lambda = M_{\varphi - \lambda}
is again multiplication, now by \varphi - \lambda. If
\lambda \notin \varphi([0,1]) then
\varphi - \lambda is bounded away from 0, so
1/(\varphi - \lambda) is continuous and bounded and multiplication by it
is a bounded inverse — thus \lambda \in \rho(M_\varphi). Conversely if
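\lambda = \varphi(x_0), then for any \varepsilon > 0 continuity gives |\varphi - \lambda| < \varepsilon
on a small interval around x_0; a unit-norm f supported in that interval has
\lVert (M_\varphi - \lambda) f \rVert \le \varepsilon, so M_\varphi - \lambda is not bounded below, no bounded inverse exists and
\lambda \in \sigma(M_\varphi).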
Now the subtle part. Is \lambda = \varphi(x_0) an eigenvalue? We
would need (\varphi - \lambda) f = 0 a.e. with
f \ne 0 in L^2 — i.e.
f supported where \varphi = \lambda. If
\varphi hits the value \lambda only on a set of
measure zero (e.g. \varphi(x) = x, each value taken at a
single point), then no such f exists: there are no eigenvalues at
all, and the entire spectrum [0,1] is continuous
spectrum. So M_x on
L^2[0,1] has \sigma = [0,1] but
\sigma_p = \varnothing — our first operator whose spectrum is all
"continuous". This is precisely the position operator of a particle on a segment: its spectrum is the
continuum of positions it can be found at.
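One way to see why a point of [0,1] is in the spectrum of M_x without being an eigenvalue is to build "approximate eigenvectors". The sketch below assumes a specific family, the normalised indicator f_n of the interval [x_0 - 1/n, x_0 + 1/n], and a hypothetical helper weyl_residual that integrates numerically; a short calculation gives \lVert (M_x - x_0) f_n \rVert = 1/(\sqrt{3}\, n) exactly.

```python
import numpy as np

def weyl_residual(x0, n, samples=200_001):
    """Return ||(M_x - x0) f_n|| in L^2[0,1].

    f_n is the normalised indicator of [x0 - 1/n, x0 + 1/n]; the interval is
    assumed to lie inside [0, 1]. The integral uses the trapezoid rule.
    """
    half_width = 1.0 / n
    x = np.linspace(x0 - half_width, x0 + half_width, samples)
    f = np.full_like(x, 1.0 / np.sqrt(2.0 * half_width))    # so that ||f_n|| = 1
    integrand = ((x - x0) * f) ** 2
    dx = x[1] - x[0]
    integral = np.sum((integrand[:-1] + integrand[1:]) * 0.5) * dx
    return np.sqrt(integral)

x0 = 0.5
residuals = {n: weyl_residual(x0, n) for n in (4, 40, 400)}
for n, r in residuals.items():
    # numerical value next to the closed form 1 / (sqrt(3) * n)
    print(n, r, 1.0 / (np.sqrt(3.0) * n))
```

The residual tends to 0 while the f_n keep unit norm, so M_x - x_0 cannot have a bounded inverse; yet the f_n do not converge in L^2 to any limit function, which matches the absence of an eigenvector.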
Worked example 2: the shift — a full disc with no eigenvalues
On \ell^2 = \ell^2(\mathbb{N}) the unilateral (right)
shift is
S(x_1, x_2, x_3, \dots) = (0, x_1, x_2, x_3, \dots).
It is an isometry: \lVert Sx \rVert = \lVert x \rVert, so
\lVert S \rVert = 1 and every point of
\sigma(S) satisfies |\lambda| \le 1.
No eigenvalues. Suppose Sx = \lambda x. Comparing
coordinates gives 0 = \lambda x_1 and
x_{k} = \lambda x_{k+1} for all k. If
\lambda = 0 then all x_k = 0; if
\lambda \ne 0 the first equation forces
x_1 = 0 and the recursion propagates 0 through
every coordinate. Either way x = 0, so
\sigma_p(S) = \varnothing: the shift has not a single
eigenvalue.
Yet the spectrum is the entire closed unit disc. Since
r(S) \le \lVert S \rVert = 1 we have
\sigma(S) \subseteq \overline{\mathbb{D}} = \{ |\lambda| \le 1 \}. The
reverse inclusion comes from the adjoint, the backward shift
S^*(x_1, x_2, \dots) = (x_2, x_3, \dots). For every
|\lambda| < 1 the vector
(1, \lambda, \lambda^2, \dots) \in \ell^2 is an eigenvector of
S^*, so the open disc lies in \sigma_p(S^*);
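a general fact links the two, \sigma(S^*) = \{\, \overline{\lambda} : \lambda \in \sigma(S) \,\} (complex conjugates, not a closure), so the open disc, which is symmetric under conjugation, lies in \sigma(S); the
spectrum is closed, so \overline{\mathbb{D}} \subseteq \sigma(S). Putting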
the pieces together:
\sigma(S) = \overline{\mathbb{D}}, \qquad \sigma_p(S) = \varnothing, \qquad \sigma_r(S) = \{ |\lambda| < 1 \}, \qquad \sigma_c(S) = \{ |\lambda| = 1 \}.
A two-dimensional continuum of spectrum, and every last point of it is continuous or residual — not
one eigenvalue among them.
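Finite truncations show both halves of this story, and also how misleading they can be. The sketch below is an illustration under assumptions: right_shift_section, the size n = 50 and the value lam = 0.6 are choices made here, not taken from the text. The truncated shift is nilpotent, so its spectrum is just \{0\}, while the truncated backward shift almost has (1, \lambda, \lambda^2, \dots) as an eigenvector.

```python
import numpy as np

def right_shift_section(n):
    """n x n truncation of the unilateral shift: ones on the subdiagonal."""
    return np.diag(np.ones(n - 1), k=-1)

n = 50
S_n = right_shift_section(n)

# The truncation is nilpotent (S_n^n = 0), so 0 is its only eigenvalue.
# Nilpotency is checked directly: numerical eigenvalue solvers are unreliable
# on large Jordan-type blocks like this one.
nilpotent = np.allclose(np.linalg.matrix_power(S_n, n), 0.0)

# The backward shift is the transpose. Apply it to (1, lam, ..., lam^(n-1)).
lam = 0.6
v = lam ** np.arange(n)
residual = np.linalg.norm(S_n.T @ v - lam * v)   # only the last coordinate is off

print(nilpotent)
print(residual, lam ** n)                        # the defect is lam^n
```

The defect \lambda^n vanishes as n grows when |\lambda| < 1, which is the finite shadow of the exact eigenvector of S^* in \ell^2. The spectrum of the truncation, by contrast, stays at \{0\} for every n and never approaches the disc.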
The single most common error in a first spectral-theory course is to write
\sigma(T) = \{\text{eigenvalues of } T\}. That equation is a
finite-dimensional habit and it is false in general. The unilateral
shift is the counterexample to memorise: \sigma_p(S) = \varnothing yet
\sigma(S) is the whole closed unit disc. The eigenvalues are only the
point part \sigma_p; the continuous and residual parts can carry
the entire spectrum.
A companion misconception: "every operator has an eigenvalue" (true for complex matrices, since the
characteristic polynomial always has a root). In infinite dimensions this fails outright — the
shift and the multiplication operator M_x above have no
eigenvalues. What is guaranteed is only that the spectrum \sigma(T)
is non-empty; whether any of it is point spectrum is a separate, often negative, question.
The crown jewel: the spectral theorem for compact self-adjoint operators
Now we return home. Restrict to a Hilbert space H and to the friendliest
operators of all: those that are both
compact and
self-adjoint (T = T^*). Self-adjointness is the
infinite-dimensional analogue of a symmetric matrix; compactness is the analogue of being
"finite-rank up to a small error", and it is what forces the pathologies of the shift to disappear.
For this class the finite-dimensional dream comes true verbatim.
First, three facts that hold for any bounded self-adjoint T on
H. The first two are the same computations you saw for symmetric matrices, now with the
Hilbert inner product playing the role of the dot product.
-
Real spectrum. If T = T^* then
\sigma(T) \subseteq \mathbb{R}. For an eigenvalue,
\lambda \langle x, x \rangle = \langle Tx, x \rangle = \langle x, Tx \rangle = \overline{\lambda}\langle x, x\rangle,
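forcing \lambda = \overline{\lambda}. For a general \lambda = a + ib with b \ne 0, the identity \lVert (T - \lambda)x \rVert^2 = \lVert (T - a)x \rVert^2 + b^2 \lVert x \rVert^2 shows that T - \lambda is bounded below, hence injective with closed range; the range is also dense, because its orthogonal complement is \ker(T - \overline{\lambda}) = \{0\}. So T - \lambda is a bijection and \lambda \in \rho(T).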
-
Orthogonal eigenspaces. Eigenvectors for distinct eigenvalues are
orthogonal: if Tx = \lambda x and
Ty = \mu y with \lambda \ne \mu, then
\lambda \langle x, y \rangle = \langle Tx, y\rangle = \langle x, Ty\rangle = \mu \langle x, y \rangle,
(the last step uses that \mu is real), so \langle x, y \rangle = 0.
-
Norm = spectral radius. For self-adjoint (indeed any normal) operators the bound
r(T) \le \lVert T \rVert becomes an equality, r(T) = \lVert T \rVert. For self-adjoint T the spectrum is real, so at
least one of \pm \lVert T \rVert is in the spectrum.
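In finite dimensions a Hermitian matrix is the model self-adjoint operator, and the three facts can be checked directly. The sketch below uses a random Hermitian matrix with a fixed seed (an arbitrary choice for this illustration) and deliberately calls the general eigenvalue solver, which is not told that the matrix is Hermitian.

```python
import numpy as np

rng = np.random.default_rng(0)                 # fixed seed, arbitrary choice
B = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
A = (B + B.conj().T) / 2                       # Hermitian: A equals its conjugate transpose

# General solver: no symmetry is assumed, so nothing is forced to be real.
eigenvalues, V = np.linalg.eig(A)

# Fact 1: the eigenvalues are real up to rounding.
max_imag_part = np.max(np.abs(eigenvalues.imag))

# Fact 2: eigenvectors for distinct eigenvalues are orthogonal, so the Gram
# matrix <v_i, v_j> has (numerically) vanishing off-diagonal entries.
gram = V.conj().T @ V
off_diagonal = np.max(np.abs(gram - np.diag(np.diag(gram))))

# Fact 3: the spectral radius equals the operator norm.
spectral_radius = np.max(np.abs(eigenvalues))
operator_norm = np.linalg.norm(A, 2)           # largest singular value

print(max_imag_part, off_diagonal)
print(spectral_radius, operator_norm)
```

The orthogonality check relies on the eigenvalues of the random matrix being distinct, which holds with probability one but is an assumption of the sketch.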
Let T be a compact self-adjoint operator on a
Hilbert space
H \ne \{0\}. Then:
-
the nonzero eigenvalues form a (finite or countable) set of real numbers
(\lambda_n), each of finite multiplicity, whose only possible
accumulation point is 0; if there are infinitely many, then
\lambda_n \to 0;
-
there is an orthonormal basis (e_n) of
\overline{\operatorname{ran} T} (extend by a basis of
\ker T to cover all of H) consisting of
eigenvectors, T e_n = \lambda_n e_n;
-
T is diagonalised by this basis:
Tx = \sum_{n} \lambda_n \langle x, e_n \rangle\, e_n \qquad \text{for every } x \in H,
the series converging in norm.
Read that last line beside the matrix case. Diagonalising a symmetric matrix
A = \sum_i \lambda_i\, e_i e_i^{\mathsf T} says
Ax = \sum_i \lambda_i (e_i^{\mathsf T} x)\, e_i — decompose
x along the orthonormal eigenbasis, scale the
i-th component by \lambda_i, reassemble. The
spectral theorem is that same sentence with the finite sum replaced by a convergent series
and the dot product replaced by \langle x, e_n \rangle. Compactness is the
hypothesis that makes the eigenvalues pile up only at 0 and lets the sum
converge; without it (the shift again) there may be no eigenbasis to expand in.
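The expansion and the role of \lambda_n \to 0 can be imitated with a large matrix. This is a sketch under assumptions: the dimension M = 200, the eigenvalues \lambda_n = 1/n and the random orthonormal basis are stand-ins chosen here for a genuinely infinite-dimensional compact operator.

```python
import numpy as np

rng = np.random.default_rng(1)                      # fixed seed, arbitrary choice
M = 200                                             # finite stand-in for infinite dimension
lams = 1.0 / np.arange(1, M + 1)                    # eigenvalues 1, 1/2, 1/3, ... -> 0
Q, _ = np.linalg.qr(rng.standard_normal((M, M)))    # columns e_n: an orthonormal basis
T = Q @ np.diag(lams) @ Q.T                         # symmetric, with T e_n = lams[n] e_n

def partial_sum(x, N):
    """Return sum_{n <= N} lam_n <x, e_n> e_n."""
    coeffs = Q[:, :N].T @ x                         # inner products <x, e_n>
    return Q[:, :N] @ (lams[:N] * coeffs)

x = rng.standard_normal(M)
full_error = np.linalg.norm(T @ x - partial_sum(x, M))   # full expansion reproduces Tx

# Keeping only the first N terms gives a finite-rank operator T_N, and the
# operator-norm error is the largest discarded eigenvalue, 1/(N+1).
N = 10
T_N = Q[:, :N] @ np.diag(lams[:N]) @ Q[:, :N].T
tail_norm = np.linalg.norm(T - T_N, 2)

print(full_error)
print(tail_norm, 1.0 / (N + 1))
```

The tail estimate is the quantitative form of "finite-rank up to a small error": because the eigenvalues tend to 0, the truncations T_N converge to T in operator norm.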
The spectrum itself is then completely transparent:
\sigma(T) = \{0\} \cup \{\lambda_n\} — the eigenvalues together with their
limit point 0 (which sits in the spectrum whenever
H is infinite-dimensional, since a compact operator can never be
invertible there). A discrete constellation of real eigenvalues marching in toward the origin: that
is the picture below.
Seeing the spectrum in the complex plane
Every spectrum lives inside the closed disc |\lambda| \le \lVert T \rVert
(normalise \lVert T \rVert = 1, so it is the unit disc), is compact, and
is non-empty. Within that frame, three examples look utterly different:
-
Compact self-adjoint — a discrete string of real eigenvalues (points on
the horizontal axis) accumulating at the origin, exactly as the spectral theorem promises. The
largest one sits on the boundary because r(T) = \lVert T \rVert.
-
Unilateral shift — the entire filled disc, though not one point of it is
an eigenvalue. This is the "watch out!" picture: maximal spectrum, empty point spectrum.
-
Multiplication by e^{2\pi i x} — the spectrum is the
range of the multiplier, the unit circle: a one-dimensional curve of spectrum, no
interior, no eigenvalues.
Where this pays off
In quantum mechanics the state of a system is a unit vector
\psi in a Hilbert space, and each observable — energy, momentum, spin —
is a self-adjoint operator A. The postulate that makes
the theory match experiments is spectral: the only possible outcomes of measuring
A are the points of \sigma(A).
Self-adjointness guarantees \sigma(A) \subseteq \mathbb{R} — measured
values are real, as they must be. When the spectrum is a discrete set of eigenvalues (a bound
electron), you see quantised energy levels and sharp spectral lines; when it is a
continuum (a free particle, our M_x), the observable varies
continuously. The word "spectrum" in "spectral line" and in "spectrum of an operator" is
the same word: the atomic spectroscopists and the functional analysts are describing one
object, the spectrum of the Hamiltonian, with the frequency of each line set by the difference of two of its eigenvalues.
A string of length L fixed at both ends vibrates according to
-u'' = \lambda u with u(0) = u(L) = 0. The
operator -\tfrac{d^2}{dx^2} with those boundary conditions is
self-adjoint, and its spectrum is the discrete set
\lambda_n = (n\pi/L)^2 with eigenfunctions
e_n(x) = \sin(n\pi x / L). Up to a constant set by the wave speed, those eigenvalues are the squared
frequencies of the fundamental and its overtones: the harmonic series you hear. Its inverse (the
Green's operator of the string) is compact and self-adjoint, and the
eigenfunctions \sin(n\pi x/L), normalised by the factor \sqrt{2/L}, are exactly the orthonormal eigenbasis the
spectral theorem guarantees. Fourier series, from this angle, is just the spectral theorem applied
to the second-derivative operator: expanding a shape in vibrational modes is diagonalising
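an operator. Ask instead "can you hear the shape of a drum?" and you are asking whether the
spectrum determines the domain. For general planar domains the answer is no (Gordon, Webb and Wolpert, 1992), although the question remains open for restricted classes such as smooth convex domains.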
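The string eigenvalues can be recovered numerically. The sketch below makes its own assumptions: it replaces -d^2/dx^2 by the standard second-order finite-difference matrix on m interior grid points, and the names dirichlet_laplacian, L = 1 and m = 400 are choices made for this illustration.

```python
import numpy as np

def dirichlet_laplacian(L, m):
    """Finite-difference matrix for -d^2/dx^2 on (0, L) with u(0) = u(L) = 0.

    Uses m interior grid points with spacing h = L / (m + 1).
    """
    h = L / (m + 1)
    main = 2.0 * np.ones(m)
    off = -1.0 * np.ones(m - 1)
    return (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / h**2

L, m = 1.0, 400
A = dirichlet_laplacian(L, m)
w, V = np.linalg.eigh(A)                       # ascending eigenvalues, orthonormal columns

exact = (np.arange(1, 6) * np.pi / L) ** 2     # (n*pi/L)^2 for n = 1..5
print(w[:5])
print(exact)

# The inverse plays the role of the Green's operator: its eigenvalues are
# 1/w, the largest being close to (L/pi)^2, and they decrease toward 0.
green_top = 1.0 / w[0]

# The lowest mode should match the sampled, normalised sine up to sign.
x = np.linspace(0.0, L, m + 2)[1:-1]           # interior grid points
s = np.sin(np.pi * x / L)
overlap = abs(V[:, 0] @ (s / np.linalg.norm(s)))
print(green_top, overlap)
```

Only the low modes are approximated well: the discrete eigenvalues drift away from (n\pi/L)^2 as n approaches the grid size, a limitation of the discretisation rather than of the spectral theorem.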