Compact Operators

Solve the equation Ax = b for an n \times n matrix and life is easy: eigenvalues, determinants, the rank–nullity theorem, the Fredholm alternative — all the machinery of linear algebra is at your service. Now replace the finite list of unknowns by a whole function, and the matrix by an integral operator

(Tf)(x) = \int_a^b K(x, t)\, f(t)\, dt,

so that "solve f - \lambda T f = g" is now an integral equation: the continuous cousin of a linear system, and the shape of countless problems in physics and engineering (heat flow, scattering, the reconstruction of a signal from blurred data). The miracle discovered by Fredholm, Hilbert and Riesz in the first two decades of the twentieth century is that a large part of the finite-dimensional toolkit still works for these infinite-dimensional problems (the same Fredholm alternative, the same eigenvalue expansions), provided the operator is compact.

Compact operators are the class of bounded operators that behave as if they were almost finite-dimensional. They are the honest infinite-dimensional analogue of a matrix — the operators for which spectra are discrete, eigenvectors can be listed, and analysis feels like linear algebra again. This page is about pinning down exactly which operators earn that privilege, and why the innocent-looking identity operator is spectacularly not among them.

The definition: squeeze the unit ball into something compact

Recall from compactness that a subset of a metric space is compact when every sequence in it has a subsequence converging to a point of the set — equivalently (in a complete space) when it is closed and totally bounded. A set is relatively compact (or precompact) when its closure is compact; that is the property we will demand of an operator's output.

Let X, Y be normed spaces and write B_X = \{\, x \in X : \|x\| \le 1 \,\} for the closed unit ball of X.

A linear operator T : X \to Y is compact if it satisfies either of these equivalent conditions:

(Ball form) the image T(B_X) of the closed unit ball is a relatively compact subset of Y — its closure \overline{T(B_X)} is compact.
(Sequential form) for every bounded sequence (x_n) in X, the image sequence (T x_n) has a convergent subsequence in Y.

When Y is a Banach space, "relatively compact" is the same as "totally bounded": for every \varepsilon > 0, T(B_X) is covered by finitely many \varepsilon-balls.

The two forms are the same statement dressed differently. If T(B_X) has compact closure, take any bounded (x_n) and rescale so that \|x_n\| \le 1; then (T x_n) lies inside the compact set \overline{T(B_X)}, where sequential compactness hands back a convergent subsequence. Conversely, given a sequence (y_n) in \overline{T(B_X)}, pick x_n \in B_X with \|T x_n - y_n\| < 1/n; the sequential condition gives a subsequence with (T x_{n_k}) convergent, and (y_{n_k}) converges to the same limit, so the closure is sequentially compact. Notice that the definition bakes in boundedness: a relatively compact set is bounded, so a compact operator is a bounded operator whose action on the unit ball is not merely bounded but precompact, a much stronger squeeze. For bounded operators on infinite-dimensional spaces, compactness sits strictly between "bounded" and "finite-rank":

\text{finite rank} \;\subsetneq\; \text{compact} \;\subsetneq\; \text{bounded}.

Why "almost finite-dimensional": finite-rank operators are compact

The prototype of a compact operator is one whose range is finite-dimensional. An operator T has finite rank if \dim T(X) < \infty; a matrix is the finite-dimensional case of exactly this. Every bounded finite-rank operator is compact, and the reason is a single classical fact you already own.

If T : X \to Y is bounded with \dim T(X) < \infty, then T is compact.

Proof. Because T is bounded and \|x\| \le 1, the set T(B_X) is a bounded subset of the finite-dimensional space V = T(X) \cong \mathbb{F}^k. In a finite-dimensional normed space the Heine–Borel theorem holds: a set is relatively compact precisely when it is bounded (this is Bolzano–Weierstrass, coordinate by coordinate). So T(B_X) has compact closure inside V, and T is compact. \blacksquare

This is the whole slogan made precise. A finite-rank operator collapses all of X into a finite-dimensional shadow, where boundedness alone buys you compactness for free. A general compact operator is one you can approximate as closely as you like by such finite-rank shadows — literally a limit of matrices — which is why it inherits so much of their good behaviour. (On a Hilbert space this approximation is a theorem: every compact operator is the operator-norm limit of finite-rank ones. On a general Banach space it is Grothendieck's famous approximation property question — true for every classical space, but not, astonishingly, for every Banach space.)

The simplest operator that is not compact: the identity

In finite dimensions the identity is a perfectly ordinary matrix, so a beginner expects it to be compact. In infinite dimensions this fails — and the failure is the beating heart of the whole subject. The identity I : X \to X maps the unit ball to itself, so asking "is I compact?" is asking "is the closed unit ball of X compact?" — and the answer separates finite from infinite dimensions cleanly in two.

Riesz's lemma. If Y is a proper closed subspace of a normed space X, then for every \theta \in (0, 1) there is a unit vector x_\theta \in X, \|x_\theta\| = 1, with \operatorname{dist}(x_\theta, Y) \ge \theta.
Consequence (F. Riesz). The closed unit ball of a normed space X is compact if and only if X is finite-dimensional.

Why the ball is not compact in infinite dimensions. Suppose \dim X = \infty. Build a sequence of unit vectors as follows. Pick any x_1 with \|x_1\| = 1. Given x_1, \dots, x_n, their span Y_n is a finite-dimensional — hence closed — proper subspace (the space is infinite-dimensional, so there is room left over). Riesz's lemma with \theta = \tfrac{1}{2} supplies a unit vector x_{n+1} at distance \ge \tfrac{1}{2} from Y_n, so in particular

\|x_m - x_n\| \ge \tfrac{1}{2} \qquad \text{for all } m \ne n.

This (x_n) lives on the unit sphere, yet no two of its terms are closer than \tfrac{1}{2} to each other, so it can have no Cauchy subsequence, hence no convergent one. The closed unit ball is therefore not sequentially compact. Since I(B_X) = B_X, the identity is not a compact operator on any infinite-dimensional space. The tamest, most trivial bounded operator there is (norm exactly 1) fails the compactness test, precisely because infinite-dimensional balls have "too many independent directions" to be squeezed into anything compact.

This is the single most common source of error for newcomers, because in \mathbb{R}^n "closed and bounded \Rightarrow compact" (Heine–Borel) is drilled in so hard it feels like a law of nature. It is not — it is a law of finite dimensions only. In infinite dimensions the closed unit ball is closed and bounded and still not compact, as the Riesz construction above shows explicitly.
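As a concrete instance of the construction, here is a minimal sketch in the Hilbert space \ell^2, where the standard basis vectors e_n can play the role of the Riesz sequence. That choice is an assumption made for illustration: orthonormality gives pairwise distance \sqrt{2}, comfortably above the \tfrac{1}{2} that Riesz's lemma guarantees in a general normed space. Vectors are truncated to finitely many coordinates, and the helper names are hypothetical.

```python
import math
from itertools import combinations

def basis_vector(n, dim):
    """Standard basis vector e_n (1-indexed), truncated to `dim` coordinates."""
    return [1.0 if i == n else 0.0 for i in range(1, dim + 1)]

def norm(u):
    """Euclidean (l^2) norm of a finite coordinate list."""
    return math.sqrt(sum(a * a for a in u))

def distance(u, v):
    """Euclidean (l^2) distance between two finite coordinate lists."""
    return norm([a - b for a, b in zip(u, v)])

def min_pairwise_distance(vectors):
    """Smallest distance between two distinct vectors in the list."""
    return min(distance(u, v) for u, v in combinations(vectors, 2))

dim = 50
unit_vectors = [basis_vector(n, dim) for n in range(1, dim + 1)]

# Every vector sits on the unit sphere, yet no two are close together,
# so no subsequence can be Cauchy.
gap = min_pairwise_distance(unit_vectors)
print("all on the unit sphere:", all(norm(u) == 1.0 for u in unit_vectors))
print("minimum pairwise distance:", gap)
```

Raising `dim` adds more unit vectors without shrinking the gap, which is the finite shadow of the statement that the infinite-dimensional ball has no finite \varepsilon-net for small \varepsilon.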

Two corollaries worth memorising:

Compact \ne bounded. Every compact operator is bounded, but bounded operators are almost never compact — the identity, every isometry, every invertible operator on an infinite-dimensional space fails. Compactness is a rare and special property, not a mild one.
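An operator with a bounded inverse cannot be compact (on an infinite-dimensional space). If T were compact with a bounded inverse S = T^{-1}, then I = ST would be compact (composition with a bounded operator preserves compactness, as the ideal property below shows), contradicting Riesz's theorem that the unit ball is not compact. So no compact operator on an infinite-dimensional space has a bounded inverse. On a Banach space its range is moreover always a "thin", proper subspace: by the open mapping theorem a surjective T would make T(B_X) contain a ball, and that ball would then be relatively compact.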
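A worked instance may help fix the second corollary. Take the diagonal operator on \ell^2 with \lambda_n = 1/n (the choice of \lambda_n is ours, for illustration; the diagonal family itself is treated in the examples further down this page). It is injective and compact, yet the displayed ratio shows that any inverse defined on its range is unbounded, and the second line exhibits a vector outside the range:

```latex
T e_n = \tfrac{1}{n}\, e_n
\quad\Longrightarrow\quad
\frac{\|e_n\|}{\|T e_n\|} = n \longrightarrow \infty,
\\[4pt]
y = \bigl(1, \tfrac12, \tfrac13, \dots\bigr) \in \ell^2,
\qquad
T x = y \;\Longrightarrow\; x = (1, 1, 1, \dots) \notin \ell^2 .
```

So y is not in the range of T, even though the range contains every finitely supported sequence and is therefore dense.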

The algebraic shape: a closed two-sided ideal

The compact operators are not a random collection — inside the algebra B(X) of all bounded operators they occupy a very specific structural niche. Write K(X, Y) for the compact operators X \to Y, and K(X) = K(X, X).

Let X and Y be Banach spaces. Then:

Vector subspace. A sum of compact operators is compact, and any scalar multiple of a compact operator is compact; so K(X, Y) is a linear subspace of B(X, Y).
Norm-closed. If each T_n is compact and \|T_n - T\| \to 0, then T is compact — a uniform limit of compact operators is compact.
Two-sided ideal. If T is compact and S is bounded, then both ST and TS are compact.

The ideal property (the easy, illuminating part). Let (x_n) be bounded in X.

For TS: S bounded makes (S x_n) bounded, and then compactness of T extracts a subsequence (x_{n_k}) with (T S x_{n_k}) convergent. So TS is compact.
For ST: compactness of T gives a subsequence with T x_{n_k} \to y; then S is continuous, so S T x_{n_k} \to S y converges. So ST is compact.

Norm-closedness (the totally-bounded argument). Take \varepsilon > 0 and choose n with \|T - T_n\| < \varepsilon/3. Since T_n is compact, T_n(B_X) is totally bounded: cover it by finitely many balls of radius \varepsilon/3, centred at T_n y_1, \dots, T_n y_m with each y_j \in B_X. For any x \in B_X, pick the centre with \|T_n x - T_n y_j\| < \varepsilon/3; then

\|Tx - T y_j\| \le \|Tx - T_n x\| + \|T_n x - T_n y_j\| + \|T_n y_j - T y_j\| < \tfrac{\varepsilon}{3} + \tfrac{\varepsilon}{3} + \tfrac{\varepsilon}{3} = \varepsilon,

using \|T - T_n\| < \varepsilon/3 together with \|x\| \le 1 and \|y_j\| \le 1 on the first and third terms. So the finitely many points T y_1, \dots, T y_m \varepsilon-cover T(B_X): it is totally bounded, hence (Y being complete) T is compact. \blacksquare This "\varepsilon/3 three ways" estimate is the standard proof that totally bounded is a closed condition, and it is exactly why finite-rank operators, when they converge in norm, produce compact limits.

Because K(X) is a closed two-sided ideal of the Banach algebra B(X), and a proper one whenever X is infinite-dimensional, the quotient B(X) / K(X) is itself a Banach algebra, the Calkin algebra, in which "compact" becomes "zero". Working modulo compact operators is the algebraic engine behind Fredholm theory: an operator is Fredholm exactly when it is invertible in the Calkin algebra, i.e. invertible up to a compact error.

The two canonical examples

Two families of operators appear on every functional-analysis exam, and between them they capture the whole flavour of compactness.

1. Diagonal (multiplier) operators on \ell^2

Fix a bounded sequence (\lambda_n) and define, on the sequence space \ell^2,

T(x_1, x_2, x_3, \dots) = (\lambda_1 x_1,\, \lambda_2 x_2,\, \lambda_3 x_3, \dots).

This is the infinite-dimensional diagonal matrix. It is bounded with operator norm \|T\| = \sup_n |\lambda_n| (the eigenvalues are the \lambda_n, with eigenvectors the standard basis e_n). Its compactness is decided by a single clean criterion:

The diagonal operator T on \ell^2 is compact if and only if \lambda_n \to 0.

If \lambda_n \to 0, T is compact. Truncate: let T_N be the diagonal operator with entries (\lambda_1, \dots, \lambda_N, 0, 0, \dots), keeping only the first N. Each T_N has finite rank (at most N), hence is compact, and

\|T - T_N\| = \sup_{n > N} |\lambda_n| \longrightarrow 0 \quad \text{as } N \to \infty,

exactly because \lambda_n \to 0. So T is a norm-limit of finite-rank operators, and by the closed-ideal theorem it is compact.

If \lambda_n \not\to 0, T is not compact. Then some \delta > 0 is exceeded infinitely often: |\lambda_{n_k}| \ge \delta. The unit vectors e_{n_k} are bounded, but \|T e_{n_k} - T e_{n_j}\|^2 = |\lambda_{n_k}|^2 + |\lambda_{n_j}|^2 \ge 2\delta^2 for k \ne j (orthogonality), so (T e_{n_k}) has no convergent subsequence — precisely the identity's failure again. In particular the identity on \ell^2 is the case \lambda_n \equiv 1, which emphatically does not tend to 0.
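Both halves of the criterion can be checked numerically. The following is a minimal sketch, assuming the sample sequences \lambda_n = 1/n and \lambda_n \equiv 1 (our choices) and replacing the supremum over all n > N by a maximum over a finite window up to a hypothetical cutoff `n_max`.

```python
def tail_sup(lam, N, n_max):
    """Max of |lam(n)| over N < n <= n_max: a truncated stand-in for
    sup_{n > N} |lambda_n|, which equals ||T - T_N||."""
    return max(abs(lam(n)) for n in range(N + 1, n_max + 1))

def image_gap_squared(lam, j, k):
    """||T e_j - T e_k||^2 for j != k, using orthogonality of the e_n."""
    return abs(lam(j)) ** 2 + abs(lam(k)) ** 2

def decaying(n):
    return 1.0 / n      # lambda_n -> 0, so T is compact

def constant(n):
    return 1.0          # the identity, not compact

n_max = 10_000
cutoffs = (1, 10, 100, 1000)

# Truncation error ||T - T_N|| shrinks to 0 for the decaying sequence...
errors = [tail_sup(decaying, N, n_max) for N in cutoffs]
# ...but stays stuck at 1 for the identity, however large N is.
stuck = [tail_sup(constant, N, n_max) for N in cutoffs]

for N, e, s in zip(cutoffs, errors, stuck):
    print(f"N={N:5d}  decaying: {e:.6f}  identity: {s:.1f}")
print("identity image gap squared:", image_gap_squared(constant, 3, 7))
```

For \lambda_n = 1/n the tail supremum is attained at n = N + 1, so the error is 1/(N + 1); for the identity the images T e_j stay a fixed distance apart, which is the non-compact case.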

2. Integral operators with a continuous kernel

Back to the operator we started with: on C[a, b] (or L^2[a, b]), a continuous kernel K(x, t) on the square [a, b]^2 defines

(Tf)(x) = \int_a^b K(x, t)\, f(t)\, dt.

Such a T is compact. The reason is the Arzelà–Ascoli theorem: as f ranges over the unit ball of C[a, b], the images Tf are uniformly bounded (by (b - a)\max|K|) and equicontinuous (K is uniformly continuous on the compact square, which controls |(Tf)(x) - (Tf)(x')| independently of f), and a uniformly bounded equicontinuous family is relatively compact in the sup norm. On L^2[a, b] the natural class is larger: any kernel with \iint |K(x,t)|^2\,dx\,dt < \infty defines a Hilbert–Schmidt operator, which is compact; its Hilbert–Schmidt norm is \|T\|_{HS} = \big(\iint |K|^2\big)^{1/2}, the continuous analogue of \big(\sum_n |\lambda_n|^2\big)^{1/2} for a diagonal operator. This is the concrete reason the integral equations of physics are so tractable: their operators are compact, so the whole eigenvalue theory applies.
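To see the two ingredients at work, here is a rough numerical sketch for one illustrative kernel, K(x, t) = \min(x, t) on [0, 1]^2. The kernel, the midpoint-rule discretisation and all function names are assumptions of this example, not part of the theory above. For this kernel \iint |K|^2 = 1/6, and |K(x, t) - K(x', t)| \le |x - x'|, so every f in the unit ball satisfies |(Tf)(x) - (Tf)(x')| \le |x - x'|.

```python
import math

def kernel(x, t):
    """Illustrative continuous kernel K(x, t) = min(x, t) on [0, 1]^2."""
    return min(x, t)

def midpoints(n):
    """Midpoints of n equal subintervals of [0, 1], plus the step h."""
    h = 1.0 / n
    return [(i + 0.5) * h for i in range(n)], h

def hilbert_schmidt_norm(n):
    """Midpoint-rule approximation of (double integral of |K|^2)^(1/2)."""
    pts, h = midpoints(n)
    total = sum(kernel(x, t) ** 2 for x in pts for t in pts)
    return math.sqrt(total * h * h)

def apply_T(f, x, n):
    """Midpoint-rule approximation of (Tf)(x) = integral of K(x, t) f(t) dt."""
    pts, h = midpoints(n)
    return h * sum(kernel(x, t) * f(t) for t in pts)

def wild(t):
    """A rapidly oscillating function with sup norm at most 1."""
    return math.cos(40.0 * t)

hs = hilbert_schmidt_norm(200)
print("approximate HS norm:", hs, " exact:", math.sqrt(1.0 / 6.0))

# Equicontinuity: the oscillation of f does not survive the integration.
jump = abs(apply_T(wild, 0.50, 200) - apply_T(wild, 0.51, 200))
print("|Tf(0.50) - Tf(0.51)| =", jump, " (bound: 0.01)")
```

However violently `wild` oscillates, the change in Tf between two nearby points is bounded by their distance; that uniform control is what Arzelà–Ascoli converts into compactness.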

See it: the tail must collapse to zero

Here is compactness for a diagonal operator, made visible. Each bar is one eigenvalue \lambda_n = n^{-p} of the operator T = \operatorname{diag}(\lambda_1, \lambda_2, \dots) on \ell^2, standing at its index n. Think of \lambda_n as "how much T stretches the n-th coordinate axis": the picture is really a portrait of what T does to the unit ball, axis by axis.

Drag the decay rate p. At p = 0 every bar sits at height 1 — this is the identity, which stretches every axis equally and squashes nothing; infinitely many bars stay above any threshold, and T is not compact. Push p > 0 and the tail collapses toward 0: only finitely many coordinates are stretched by more than \varepsilon (highlighted), all the rest are crushed flat, and the image of the unit ball becomes relatively compact — T is compact. Slide the threshold \varepsilon and watch the count of "surviving" coordinates: it is always finite once p > 0, however small you make \varepsilon — the operational meaning of "almost finite-dimensional".
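The count of surviving coordinates can also be computed directly. This is a small sketch under the same setup, \lambda_n = n^{-p}, with the infinite index set cut off at a hypothetical `n_max`; the function name and the sample values of p and \varepsilon are ours.

```python
def surviving_indices(p, eps, n_max):
    """Indices n <= n_max whose eigenvalue lambda_n = n**(-p) is at least eps."""
    return [n for n in range(1, n_max + 1) if n ** (-p) >= eps]

n_max = 100_000
eps = 0.03

# For p > 0 the survivors are exactly the n with n <= eps**(-1/p);
# for p = 0 every index survives and only the cutoff n_max limits the count.
counts = {p: len(surviving_indices(p, eps, n_max)) for p in (0.0, 0.5, 1.0, 2.0)}

for p, c in counts.items():
    print(f"p = {p}: {c} coordinates stretched by at least {eps}")
```

With these choices the counts are 100000 (every index up to the cutoff), 1111, 33 and 5. Raising `n_max` changes only the p = 0 entry, which is the numerical face of "finite for every p > 0, infinite for the identity".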

The pay-off to come: spectral theory

Why insist on this class at all? Because compactness is exactly the hypothesis that resurrects eigenvalues in infinite dimensions. A general bounded operator can have a spectrum that is a fat continuum with no eigenvectors in sight (the shift operator, multiplication operators). Compact operators cannot misbehave like that — their spectrum is as discrete and matrix-like as you could hope.

The crowning theorem — the spectral theorem for compact self-adjoint operators, which the next lessons build toward — says this. If T is a compact self-adjoint operator on an infinite-dimensional Hilbert space H, then:

its non-zero spectrum consists entirely of eigenvalues, each of finite multiplicity — just like a symmetric matrix;
these real eigenvalues \lambda_1, \lambda_2, \dots form a sequence whose only possible accumulation point is 0 (so for every \varepsilon > 0, only finitely many |\lambda_n| \ge \varepsilon);
H has an orthonormal basis of eigenvectors of T, so T = \sum_n \lambda_n \langle \cdot, e_n\rangle e_n is literally an infinite diagonal matrix.
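The expansion in the last item can be tested on a concrete case. As an illustration (our choice of example, not taken from the theorem), consider the integral operator on L^2[0, 1] with kernel \min(x, t). Its eigenfunctions are e_k(x) = \sqrt{2}\,\sin(\omega_k x) with \omega_k = (k - \tfrac12)\pi, and its eigenvalues are \lambda_k = 1/\omega_k^2, which tend to 0. For g \equiv 1 one computes \langle g, e_k\rangle = \sqrt{2}/\omega_k and (Tg)(x) = x - x^2/2. The sketch below compares a partial sum of \sum_k \lambda_k \langle g, e_k\rangle e_k with that closed form.

```python
import math

def omega(k):
    """Frequency of the k-th eigenfunction, omega_k = (k - 1/2) * pi."""
    return (k - 0.5) * math.pi

def eigenvalue(k):
    """lambda_k = 1 / omega_k^2, decreasing to 0 as k grows."""
    return 1.0 / omega(k) ** 2

def eigenfunction(k, x):
    """Orthonormal eigenfunction e_k(x) = sqrt(2) * sin(omega_k * x)."""
    return math.sqrt(2.0) * math.sin(omega(k) * x)

def coefficient_of_one(k):
    """<g, e_k> for g(t) = 1, in closed form: sqrt(2) / omega_k."""
    return math.sqrt(2.0) / omega(k)

def Tg_by_expansion(x, terms):
    """Partial sum of sum_k lambda_k <g, e_k> e_k(x)."""
    return sum(eigenvalue(k) * coefficient_of_one(k) * eigenfunction(k, x)
               for k in range(1, terms + 1))

def Tg_exact(x):
    """(Tg)(x) = integral over [0, 1] of min(x, t) dt = x - x^2 / 2."""
    return x - x * x / 2.0

sample_points = (0.1, 0.25, 0.5, 0.75, 1.0)
worst = max(abs(Tg_by_expansion(x, 50) - Tg_exact(x)) for x in sample_points)
print("largest error with 50 terms:", worst)
```

The terms decay like 1/k^3, so a few dozen eigenfunctions already reproduce Tg closely: the operator acts, in practice, like a modest-sized diagonal matrix.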

You have already met the whole picture: the diagonal example above is the general case in disguise, and the interactive figure's "eigenvalues collapsing to 0" is the accumulation-at-0 statement drawn out. This is what makes integral equations solvable by eigenfunction expansions — Fourier series, Sturm–Liouville theory, the modes of a vibrating drum — all of them are the spectral theorem for a compact (inverse) operator, cashing in the fact that the operator is "almost finite-dimensional".