Compact Operators
Solve the equation Ax = b for an
n \times n matrix and life is easy: eigenvalues, determinants, the
rank–nullity theorem, the Fredholm alternative — all the machinery of linear algebra is at your
service. Now replace the finite list of unknowns by a whole function, and the matrix by an
integral operator
(Tf)(x) = \int_a^b K(x, t)\, f(t)\, dt,
so that "solve f - \lambda T f = g" is now an integral
equation — the continuous cousin of a linear system, and the shape of countless problems in
physics and engineering (heat flow, scattering, the reconstruction of a signal from blurred data).
The miracle discovered by Fredholm, Hilbert and Riesz in the early twentieth century is that a huge chunk of the
finite-dimensional toolkit still works for these infinite-dimensional problems — the same
Fredholm alternative, the same eigenvalue expansions — provided the operator is
compact.
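To make the "continuous cousin of a linear system" concrete, here is a minimal sketch that discretises f - \lambda T f = g with the midpoint rule and solves the resulting matrix equation. The kernel K(x, t) = x t, the right-hand side g(x) = x, the value \lambda = 1 and the helper names are all illustrative assumptions, chosen because the exact solution f(x) = 1.5 x can be checked by hand.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def solve_fredholm(kernel, g, lam, a=0.0, b=1.0, n=50):
    """Midpoint-rule discretisation of f - lam * T f = g on [a, b]."""
    h = (b - a) / n
    nodes = [a + (i + 0.5) * h for i in range(n)]
    # Row i encodes f(x_i) - lam * sum_j h * K(x_i, t_j) * f(t_j) = g(x_i).
    matrix = [[(1.0 if i == j else 0.0) - lam * h * kernel(x, t)
               for j, t in enumerate(nodes)]
              for i, x in enumerate(nodes)]
    rhs = [g(x) for x in nodes]
    return nodes, solve_linear_system(matrix, rhs)


# Illustrative data: K(x, t) = x * t, g(x) = x, lam = 1, exact f(x) = 1.5 * x.
nodes, f = solve_fredholm(lambda x, t: x * t, lambda x: x, lam=1.0)
max_error = max(abs(fi - 1.5 * x) for x, fi in zip(nodes, f))
print("largest deviation from 1.5 * x:", max_error)
```

The integral equation has become an ordinary n \times n system; the deviation printed at the end shrinks as n grows.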
Compact operators are the class of
bounded operators
that behave as if they were almost finite-dimensional. They are the honest
infinite-dimensional analogue of a matrix — the operators for which spectra are discrete, eigenvectors
can be listed, and analysis feels like linear algebra again. This page is about pinning down exactly
which operators earn that privilege, and why the innocent-looking identity operator
is spectacularly not among them.
The definition: squeeze the unit ball into something compact
Recall from
compactness that a subset of a metric
space is compact when every sequence in it has a subsequence converging to a point
of the set — equivalently (in a complete space) when it is closed and totally
bounded. A set is relatively compact (or precompact) when its
closure is compact; that is the property we will demand of an operator's output.
Let X, Y be normed spaces and write
B_X = \{\, x \in X : \|x\| \le 1 \,\} for the closed unit ball of
X.
A linear operator T : X \to Y is compact if it
satisfies either of these equivalent conditions:
-
(Ball form) the image T(B_X) of the closed unit ball
is a relatively compact subset of Y — its closure
\overline{T(B_X)} is compact.
-
(Sequential form) for every bounded sequence
(x_n) in X, the image sequence
(T x_n) has a convergent subsequence in
Y.
When Y is a Banach space, "relatively compact" is the same as
"totally bounded": for every \varepsilon > 0,
T(B_X) is covered by finitely many \varepsilon-balls.
The two forms are the same statement dressed differently. If T(B_X) has
compact closure, any bounded (x_n) — rescale so
\|x_n\| \le 1 — lands with (T x_n) inside the
compact set \overline{T(B_X)}, where sequential compactness hands back a
convergent subsequence. Conversely, the sequential condition is exactly sequential compactness of the
closure. Notice the definition bakes in boundedness: a compact operator is a bounded
operator whose action on the unit ball is not merely bounded but precompact — a much stronger
squeeze. In infinite dimensions, compactness sits strictly between "bounded" and "bounded finite-rank":
\text{finite rank} \;\subsetneq\; \text{compact} \;\subsetneq\; \text{bounded}.
Why "almost finite-dimensional": finite-rank operators are compact
The prototype of a compact operator is one whose range is finite-dimensional. An operator
T has finite rank if
\dim T(X) < \infty; a matrix is the finite-dimensional case of exactly
this. Every bounded finite-rank operator is compact, and the reason is a single classical fact you
already own.
- If T : X \to Y is bounded with
\dim T(X) < \infty, then T is compact.
Proof. Because T is bounded, \|Tx\| \le \|T\| whenever
\|x\| \le 1, so the set T(B_X) is a
bounded subset of the finite-dimensional space
V = T(X) \cong \mathbb{F}^k. In a finite-dimensional normed space the
Heine–Borel theorem holds: a set is relatively compact precisely when it is bounded
(this is Bolzano–Weierstrass, coordinate by coordinate). So T(B_X) has
compact closure inside V, and T is compact.
\blacksquare
This is the whole slogan made precise. A finite-rank operator collapses all of
X into a finite-dimensional shadow, where boundedness alone buys you
compactness for free. On a Hilbert space, a general compact operator is one you can approximate as closely as you
like by such finite-rank shadows, literally a limit of matrices, which is why it inherits so much of
their good behaviour: every compact operator there is
the operator-norm limit of finite-rank ones. On a general Banach space this is Grothendieck's famous
approximation property question: true for every classical space but not, astonishingly, for
every Banach space (Enflo's 1973 counterexample).
The one operator that is not compact: the identity
In finite dimensions the identity is a perfectly ordinary matrix, so a beginner expects it to be
compact. In infinite dimensions this fails — and the failure is the beating heart of the whole
subject. The identity I : X \to X maps the unit ball to
itself, so asking "is I compact?" is asking "is the closed
unit ball of X compact?" — and the answer separates finite from
infinite dimensions cleanly in two.
-
Riesz's lemma. If Y is a proper closed
subspace of a normed space X, then for every
\theta \in (0, 1) there is a unit vector
x_\theta \in X, \|x_\theta\| = 1, with
\operatorname{dist}(x_\theta, Y) \ge \theta.
-
Consequence (F. Riesz). The closed unit ball of a normed space
X is compact if and only if
X is finite-dimensional.
Why the ball is not compact in infinite dimensions. Suppose
\dim X = \infty. Build a sequence of unit vectors as follows. Pick any
x_1 with \|x_1\| = 1. Given
x_1, \dots, x_n, their span Y_n is a
finite-dimensional — hence closed — proper subspace (the space is infinite-dimensional, so
there is room left over). Riesz's lemma with \theta = \tfrac{1}{2} supplies
a unit vector x_{n+1} at distance
\ge \tfrac{1}{2} from Y_n, so in particular
\|x_m - x_n\| \ge \tfrac{1}{2} \qquad \text{for all } m \ne n.
This (x_n) lives on the unit sphere, yet no two of its terms are within
\tfrac{1}{2} of each other — it can have no Cauchy subsequence,
hence no convergent one. The closed unit ball is therefore not sequentially compact. Since
I(B_X) = B_X, the identity is not a compact operator on
any infinite-dimensional space. The tamest, most trivial bounded operator there is, with norm exactly
1, fails the compactness test, precisely because infinite-dimensional
balls have "too many independent directions" to be squeezed into anything compact.
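In \ell^2 the construction can be made completely explicit: the standard basis vectors are unit vectors whose mutual distance is \sqrt{2}, comfortably above the \tfrac{1}{2} that Riesz's lemma guarantees. The sketch below checks this on a finite truncation; the dimension 8 and the helper names are illustrative assumptions. The point is that the gap does not shrink as more vectors are added.

```python
import math


def basis_vector(k, dim):
    """k-th standard basis vector, truncated to `dim` coordinates."""
    return [1.0 if i == k else 0.0 for i in range(dim)]


def distance(u, v):
    """Euclidean (l^2) distance between two coordinate lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


dim = 8  # illustrative truncation; any larger value gives the same gap
vectors = [basis_vector(k, dim) for k in range(dim)]

norms = [distance(v, [0.0] * dim) for v in vectors]
gaps = [distance(vectors[m], vectors[n])
        for m in range(dim) for n in range(m + 1, dim)]
min_gap = min(gaps)

print("all unit vectors:", all(abs(r - 1.0) < 1e-12 for r in norms))
print("smallest pairwise distance:", min_gap)
```

Because every pair stays the same distance apart, no subsequence can be Cauchy, however the terms are selected.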
This is the single most common source of error for newcomers, because in
\mathbb{R}^n "closed and bounded \Rightarrow
compact" (Heine–Borel) is drilled in so hard it feels like a law of nature. It is not — it is a law
of finite dimensions only. In infinite dimensions the closed unit ball is closed and
bounded and still not compact, as the Riesz construction above shows explicitly.
Two corollaries worth memorising:
-
Compact \ne bounded. Every compact operator is
bounded, but bounded operators are almost never compact — the identity, every isometry, every
invertible operator on an infinite-dimensional space fails. Compactness is a rare and
special property, not a mild one.
-
An operator with a bounded inverse cannot be compact (on an
infinite-dimensional space). If T were compact with a bounded inverse
S = T^{-1}, then I = ST would be compact
(composition with a bounded operator preserves compactness; see below), contradicting Riesz's
theorem above. So no compact operator on an infinite-dimensional space has a bounded inverse, and on a
Banach space it cannot even be surjective (by the open mapping theorem): its range is
always a "thin", proper subset.
The algebraic shape: a closed two-sided ideal
The compact operators are not a random collection — inside the algebra
B(X) of all bounded operators they occupy a very specific structural niche.
Write K(X, Y) for the compact operators X \to Y,
and K(X) = K(X, X).
Let X and Y be Banach spaces. Then:
-
Vector subspace. A sum of compact operators is compact, and any scalar multiple
of a compact operator is compact; so K(X, Y) is a linear subspace of
B(X, Y).
-
Norm-closed. If each T_n is compact and
\|T_n - T\| \to 0, then T is compact — a
uniform limit of compact operators is compact.
-
Two-sided ideal. If T is compact and
S is bounded, then both ST and
TS are compact.
The ideal property (the easy, illuminating part). Let
(x_n) be bounded in X.
-
For TS: S bounded makes
(S x_n) bounded, and then compactness of T
extracts a subsequence with (T S x_{n_k}) convergent. So
TS is compact.
-
For ST: compactness of T gives a subsequence
with T x_{n_k} \to y; then S is continuous, so
S T x_{n_k} \to S y converges. So ST is
compact.
Norm-closedness (the totally-bounded argument). Take
\varepsilon > 0 and choose n with
\|T - T_n\| < \varepsilon/3. Since T_n is
compact, T_n(B_X) is totally bounded: cover it by finitely many balls of
radius \varepsilon/3, centred at
T_n y_1, \dots, T_n y_m with each y_j \in B_X. For any
x \in B_X, pick the centre with
\|T_n x - T_n y_j\| < \varepsilon/3; then
\|Tx - T y_j\| \le \|Tx - T_n x\| + \|T_n x - T_n y_j\| + \|T_n y_j - T y_j\| < \tfrac{\varepsilon}{3} + \tfrac{\varepsilon}{3} + \tfrac{\varepsilon}{3} = \varepsilon,
using \|T - T_n\| < \varepsilon/3 together with \|x\|, \|y_j\| \le 1 on the first and third terms. So the
finite set of points T y_1, \dots, T y_m \varepsilon-covers T(B_X):
it is totally bounded, hence (the target space being complete) relatively compact, and T is compact.
\blacksquare This "\varepsilon/3 three ways"
estimate is the standard proof that totally bounded is a closed condition, and it is exactly
why finite-rank operators, when they converge in norm, produce compact limits.
Because K(X) is a closed two-sided ideal (proper when X is infinite-dimensional) of the Banach
algebra B(X), the quotient
B(X) / K(X) is itself a Banach algebra — the Calkin
algebra — in which "compact" becomes "zero". Working modulo compact operators is the algebraic
engine behind Fredholm theory: an operator is Fredholm exactly when it is
invertible in the Calkin algebra, i.e. invertible up to a compact error.
The two canonical examples
Two families of operators appear on every functional-analysis exam, and between them they capture the
whole flavour of compactness.
1. Diagonal (multiplier) operators on \ell^2
Fix a bounded sequence (\lambda_n) and define, on the sequence space
\ell^2,
T(x_1, x_2, x_3, \dots) = (\lambda_1 x_1,\, \lambda_2 x_2,\, \lambda_3 x_3, \dots).
This is the infinite-dimensional diagonal matrix. It is bounded with operator norm
\|T\| = \sup_n |\lambda_n| (the eigenvalues are the
\lambda_n, with eigenvectors the standard basis
e_n). Its compactness is decided by a single clean criterion:
- The diagonal operator T on \ell^2 is
compact if and only if \lambda_n \to 0.
If \lambda_n \to 0, T is compact.
Truncate: let T_N keep only the first N entries
(\lambda_1, \dots, \lambda_N, 0, 0, \dots). Each
T_N has finite rank at most N, hence is compact, and
\|T - T_N\| = \sup_{n > N} |\lambda_n| \longrightarrow 0 \quad \text{as } N \to \infty,
exactly because \lambda_n \to 0. So T is a
norm-limit of finite-rank operators, and by the closed-ideal theorem it is compact.
If \lambda_n \not\to 0, T is not
compact. Then some \delta > 0 is exceeded infinitely often:
|\lambda_{n_k}| \ge \delta. The unit vectors
e_{n_k} are bounded, but
\|T e_{n_k} - T e_{n_j}\|^2 = |\lambda_{n_k}|^2 + |\lambda_{n_j}|^2 \ge 2\delta^2
for k \ne j (orthogonality), so (T e_{n_k}) has
no convergent subsequence — precisely the identity's failure again. In particular the identity on
\ell^2 is the case \lambda_n \equiv 1, which
emphatically does not tend to 0.
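The truncation error \|T - T_N\| = \sup_{n > N} |\lambda_n| is easy to tabulate. The sketch below compares \lambda_n = 1/n with the constant sequence \lambda_n = 1 (the identity). The function names and the cut-off n_max are illustrative assumptions, and the supremum over the infinite tail is replaced by a maximum over finitely many terms, which is harmless here because both sequences are non-increasing.

```python
def tail_sup(lam, n_cut, n_max):
    """max of |lam(n)| over n_cut < n <= n_max: a finite stand-in for ||T - T_N||."""
    return max(abs(lam(n)) for n in range(n_cut + 1, n_max + 1))


def decaying(n):
    """lambda_n = 1/n, which tends to 0."""
    return 1.0 / n


def constant(n):
    """lambda_n = 1 for every n: the identity operator."""
    return 1.0


n_max = 10_000  # illustrative cut-off for the tail
for n_cut in (1, 10, 100, 1000):
    print(n_cut, tail_sup(decaying, n_cut, n_max), tail_sup(constant, n_cut, n_max))
```

For the decaying sequence the error column shrinks with N, while for the identity it stays at 1, so no finite-rank truncation ever gets close.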
2. Integral operators with a continuous kernel
Back to the operator we started with: on C[a, b] (or
L^2[a, b]), a continuous kernel
K(x, t) on the square [a, b]^2 defines
(Tf)(x) = \int_a^b K(x, t)\, f(t)\, dt.
Such a T is compact. The reason is the
Arzelà–Ascoli theorem: as f ranges over the unit ball, the
images Tf are uniformly bounded and equicontinuous (continuity of
K on a compact square is uniform, which controls
|(Tf)(x) - (Tf)(x')| independently of f), and a
uniformly bounded equicontinuous family is relatively compact in the sup norm. On
L^2 these are the Hilbert–Schmidt operators, compact
whenever \iint |K(x,t)|^2\,dx\,dt < \infty; their
Hilbert–Schmidt norm is \|T\|_{HS} = \big(\iint |K|^2\big)^{1/2},
which for a diagonal operator is just
\big(\sum_n |\lambda_n|^2\big)^{1/2}. This is the concrete reason the
integral equations of physics are so tractable: their operators are compact, so the whole eigenvalue
theory applies.
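As a hedged illustration of eigenvalues tending to 0 for a continuous kernel, take K(x, t) = \min(x, t)\,(1 - \max(x, t)) on [0, 1]. This kernel is not from the text above: it is the Green's function of -u'' = f with u(0) = u(1) = 0, for which T \sin(n\pi\,\cdot) = \sin(n\pi\,\cdot)/(n\pi)^2. The sketch applies a midpoint-rule version of T and measures how far the result is from that prediction; the grid size and function names are assumptions.

```python
import math


def green_kernel(x, t):
    """Green's function of -u'' = f on [0, 1] with zero boundary values."""
    return min(x, t) * (1.0 - max(x, t))


def apply_integral_operator(kernel, f, n=400, a=0.0, b=1.0):
    """Approximate (Tf)(x) = integral of K(x, t) f(t) dt at the midpoint nodes."""
    h = (b - a) / n
    nodes = [a + (i + 0.5) * h for i in range(n)]
    values = [f(t) for t in nodes]
    image = [h * sum(kernel(x, t) * ft for t, ft in zip(nodes, values))
             for x in nodes]
    return nodes, image


def eigen_error(mode, n=400):
    """Largest gap between T sin(mode*pi*x) and sin(mode*pi*x) / (mode*pi)^2."""
    freq = mode * math.pi
    nodes, image = apply_integral_operator(
        green_kernel, lambda t: math.sin(freq * t), n)
    return max(abs(y - math.sin(freq * x) / freq ** 2)
               for x, y in zip(nodes, image))


errors = [eigen_error(mode) for mode in (1, 2, 3)]
for mode, err in zip((1, 2, 3), errors):
    print(mode, "eigenvalue", 1.0 / (mode * math.pi) ** 2, "max deviation", err)
```

The eigenvalues 1/(n\pi)^2 march to 0, matching the diagonal criterion \lambda_n \to 0 from the first example.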
See it: the tail must collapse to zero
Here is compactness for a diagonal operator, made visible. Each bar is one eigenvalue
\lambda_n = n^{-p} of the operator
T = \operatorname{diag}(\lambda_1, \lambda_2, \dots) on
\ell^2, standing at its index n. Think of
\lambda_n as "how much T stretches the
n-th coordinate axis": the picture is really a portrait of what
T does to the unit ball, axis by axis.
Drag the decay rate p. At
p = 0 every bar sits at height 1 — this is the
identity, which stretches every axis equally and squashes nothing; infinitely many
bars stay above any threshold, and T is not compact. Push
p > 0 and the tail collapses toward 0: only
finitely many coordinates are stretched by more than \varepsilon
(highlighted), all the rest are crushed flat, and the image of the unit ball becomes
relatively compact — T is compact. Slide
the threshold \varepsilon and watch the count of "surviving" coordinates: it
is always finite once p > 0, however small you make
\varepsilon — the operational meaning of "almost finite-dimensional".
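The count of surviving coordinates can also be computed directly. A minimal sketch, assuming \lambda_n = n^{-p} as in the figure; the function name and the search cap n_max are illustrative, and the cap is only ever reached when p = 0.

```python
def surviving_count(p, eps, n_max=100_000):
    """Number of n <= n_max with n**(-p) >= eps, for the sequence lambda_n = n**(-p)."""
    count = 0
    for n in range(1, n_max + 1):
        if n ** (-p) < eps:
            break  # the sequence is non-increasing, so no later term can qualify
        count += 1
    return count


eps = 0.15  # illustrative threshold
for p in (0.0, 0.5, 1.0, 2.0):
    print("p =", p, "survivors:", surviving_count(p, eps, n_max=1000))
```

For p = 0 the loop never breaks and the count simply equals the cap, which is the numerical shadow of "infinitely many bars stay above the threshold". For any p > 0 the count is roughly \varepsilon^{-1/p}, finite however small \varepsilon is.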
The pay-off to come: spectral theory
Why insist on this class at all? Because compactness is exactly the hypothesis that resurrects
eigenvalues in infinite dimensions. A general bounded operator can have a spectrum that is a fat
continuum with no eigenvectors in sight (the shift operator, multiplication operators). Compact
operators cannot misbehave like that — their spectrum is as discrete and matrix-like as you could hope.
The crowning theorem — the spectral theorem for compact self-adjoint operators,
which the next lessons build toward — says this. If T is a compact
self-adjoint operator on an infinite-dimensional Hilbert space H, then:
-
its non-zero spectrum consists entirely of eigenvalues, each of finite multiplicity —
just like a symmetric matrix;
-
these real eigenvalues \lambda_1, \lambda_2, \dots form a sequence
whose only possible accumulation point is 0 (so for
every \varepsilon > 0, only finitely many
|\lambda_n| \ge \varepsilon);
-
H has an orthonormal basis of eigenvectors of
T, so T = \sum_n \lambda_n \langle \cdot, e_n\rangle e_n
is literally an infinite diagonal matrix.
You have already met the whole picture: the diagonal example above is the general case in
disguise, and the interactive figure's "eigenvalues collapsing to 0" is
the accumulation-at-0 statement drawn out. This is what makes integral
equations solvable by eigenfunction expansions — Fourier series, Sturm–Liouville theory, the modes
of a vibrating drum — all of them are the spectral theorem for a compact (inverse) operator, cashing
in the fact that the operator is "almost finite-dimensional".
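To see the theorem cash in on the opening equation f - \lambda T f = g, here is a short worked derivation. It assumes, for simplicity, that H is separable, that (e_n) is an orthonormal eigenbasis of T with eigenvalues \lambda_n (including \lambda_n = 0 on the kernel), and it writes f_n = \langle f, e_n \rangle and g_n = \langle g, e_n \rangle; these coefficient names are notation introduced here.

```latex
\begin{align*}
f = \sum_n f_n e_n, \quad g = \sum_n g_n e_n
  &\;\Longrightarrow\; Tf = \sum_n \lambda_n f_n e_n, \\
f - \lambda T f = g
  &\;\Longleftrightarrow\; (1 - \lambda \lambda_n)\, f_n = g_n \quad \text{for every } n, \\
\lambda \lambda_n \ne 1 \text{ for all } n
  &\;\Longrightarrow\; f = \sum_n \frac{g_n}{1 - \lambda \lambda_n}\, e_n, \\
\lambda \lambda_m = 1 \text{ for some } m
  &\;\Longrightarrow\; \text{a solution exists iff } g_n = 0 \text{ whenever } \lambda \lambda_n = 1.
\end{align*}
```

In the third line the series converges in H: since \lambda_n \to 0, the factors 1 - \lambda \lambda_n tend to 1 and are bounded away from 0, so \sum_n |f_n|^2 is at most a constant times \sum_n |g_n|^2. The fourth line is the Fredholm alternative, read off coordinate by coordinate exactly as for a diagonal matrix.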