Hilbert Spaces Revisited
You have already met Hilbert spaces
as inner-product spaces where lengths and angles make sense. Now we return to them from the
functional-analysis viewpoint, sitting one rung above
Banach spaces in the
hierarchy of spaces. The question that organises this whole page is sharp and worth holding in mind:
Of all the complete normed spaces, what makes a Hilbert space so much better behaved?
The one-word answer is geometry. A Banach space knows how long its vectors are; a
Hilbert space also knows the angle between any two of them, and in particular when two
vectors are at right angles. That single extra structure — an inner product — buys three theorems
that have no counterpart in a general Banach space, and those three theorems are the backbone of
least-squares fitting, Fourier analysis, quantum mechanics and signal processing.
A real-world hook to keep in view. A noisy signal or a data cloud is a vector x
in some enormous space; the "clean" signals you are willing to accept form a subspace
M. The engineer's daily task — find the best approximation of
x inside M — is, word for word, the
problem of dropping a perpendicular from a point to a plane. In a Hilbert space that perpendicular
always exists, is unique, and is computable coefficient by coefficient. That is the payoff, and the
rest of the page earns it.
The definition, and the norm behind it
Let H be a vector space over \mathbb{K}
(read \mathbb{R} or \mathbb{C}) carrying an
inner product
\langle \cdot, \cdot \rangle — linear in the first slot, conjugate-symmetric,
and positive-definite. Every inner product manufactures a norm,
\|x\| = \sqrt{\langle x, x\rangle},
and a norm manufactures the distance d(x, y) = \|x - y\|. So an
inner-product space is automatically a normed space,
hence a metric space. The last ingredient is completeness: no Cauchy sequence is
allowed to escape.
A Hilbert space is a complete inner-product space — an inner-product
space that is complete in the norm \|x\| = \sqrt{\langle x, x\rangle}.
- Equivalently: a Banach space whose norm comes from an inner product.
- Every Hilbert space is a Banach space; the converse fails (that is the point of this page).
- Model examples: \mathbb{R}^n and \mathbb{C}^n
with the dot product, the sequence space \ell^2, and the function space
L^2[a, b] with
\langle f, g\rangle = \int_a^b f\,\overline{g}\,dx.
This raises an obvious question: given a Banach space, can you tell whether its norm secretly came
from an inner product? Remarkably, yes — and by a single identity you can check with
two vectors.
Expand \|x \pm y\|^2 = \langle x \pm y, x \pm y\rangle = \|x\|^2 \pm 2\,\mathrm{Re}\langle x, y\rangle + \|y\|^2
and add the two versions. The cross terms cancel, leaving the parallelogram law:
\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2.
Geometrically it says the two diagonals of a parallelogram determine its four sides — a fact that
holds for genuine Euclidean lengths and, it turns out, only for them. The
Jordan–von Neumann theorem makes this a perfect characterisation: a norm satisfies
the parallelogram law if and only if it arises from an inner product, in which case the
product is recovered from the norm by the polarisation identity (real case)
\langle x, y\rangle = \tfrac{1}{4}\big(\|x + y\|^2 - \|x - y\|^2\big).
So "is this Banach space a Hilbert space?" is settled by a one-line test on the norm. Fail the
parallelogram law at even a single pair (x, y), and no inner product can
exist.
It is tempting to imagine that once a space is complete and normed, all the familiar geometry comes
for free. It does not. The parallelogram law is a genuine restriction, and most of the everyday
Banach spaces flunk it.
The sup norm fails. Take \mathbb{R}^2 with
\|x\|_\infty = \max(|x_1|, |x_2|), and the vectors
x = (1, 0), y = (0, 1). Then
\|x + y\|_\infty = 1 and \|x - y\|_\infty = 1,
so the left side is 1 + 1 = 2, while the right side is
2(1) + 2(1) = 4. 2 \ne 4: no inner product can
reproduce the sup norm. The same collapse happens in C[0, 1] with its
uniform norm.
Only p = 2 survives among the \ell^p.
Repeat the calculation in \|\cdot\|_p with the same
x, y: the left side is 2^{2/p} + 2^{2/p} and the
right side is 4. These agree only when 2^{2/p} = 2,
i.e. p = 2. So of the entire family
\ell^1, \ell^2, \ell^3, \dots, \ell^\infty, exactly one —
\ell^2 — is a Hilbert space. \ell^2 and
L^2 are geometric islands in an ocean of merely-Banach spaces.
Theorem 1 — orthogonal projection: drop a perpendicular
This is the theorem the real-world hook was pointing at. In a general metric space there is no reason
a set should contain a nearest point to a given x — the infimum of
the distances need not be attained, and even if it is it might be attained twice. In a Hilbert space,
convexity plus completeness rescue both existence and uniqueness.
Let H be a Hilbert space.
-
Nearest point in a convex set. If K \subseteq H is
nonempty, closed and convex, then every x \in H has a
unique nearest point p \in K:
\|x - p\| = \inf_{k \in K}\|x - k\|.
-
Projection onto a closed subspace. If M \subseteq H is a
closed subspace, the nearest point p = P_M x is
characterised by the orthogonality condition
x - p \perp M: the residual is perpendicular to every vector of
M.
-
Orthogonal decomposition. Consequently
H = M \oplus M^{\perp}: every x splits
uniquely as x = p + r with
p \in M and r \in M^{\perp}, and
(M^{\perp})^{\perp} = M.
Why is the nearest point unique? Suppose p and q
both realise the minimal distance \delta. Their midpoint
\tfrac{1}{2}(p + q) lies in K by convexity, so it
is no closer than \delta either. Now feed
x - p and x - q into the parallelogram law:
\|p - q\|^2 = 2\|x - p\|^2 + 2\|x - q\|^2 - 4\Big\|x - \tfrac{p + q}{2}\Big\|^2 \le 2\delta^2 + 2\delta^2 - 4\delta^2 = 0.
So \|p - q\| = 0, i.e. p = q. Notice what did the
work: the parallelogram law — the very identity that separates Hilbert spaces from
Banach spaces. Uniqueness of best approximation is a Hilbert-space privilege, and it is exactly the
least-squares "normal equations" x - p \perp M in disguise.
Drag the sliders below. The blue line is a subspace M through the origin;
x is the point being approximated; p = P_M x is
its shadow on M; and the dashed residual x - p
always meets M at a right angle — that right angle is the reason
p is the closest point.
Both hypotheses are load-bearing, and dropping either one breaks the theorem.
-
Drop closed. In \mathbb{R} take the open interval
K = (0, 1) and the point x = 2. The distances
to points of K creep down toward 1 but never
reach it — the would-be nearest point 1 is missing from
K. No minimiser exists.
-
Drop convex. Take K to be a circle (not the disc) and
x its centre. Every point of the circle is equidistant, so the
nearest point is wildly non-unique. Convexity is what forbids these ties.
-
Drop completeness (work in an incomplete inner-product space) and the minimising
sequence can be Cauchy yet converge to nothing in the space — existence fails again. All three
Hilbert-space hypotheses pull their weight.
Theorem 2 — orthonormal bases and generalised Fourier series
Projection onto a line spanned by a unit vector e is a one-liner:
the shadow is \langle x, e\rangle\,e, and the scalar
\langle x, e\rangle is the coordinate of x along
e. An orthonormal system
\{e_n\} — mutually perpendicular unit vectors,
\langle e_m, e_n\rangle = \delta_{mn} — lets us do this in every direction
at once and assemble the projections into a series.
The numbers \hat{x}_n = \langle x, e_n\rangle are the
generalised Fourier coefficients of x. The partial sum
s_N = \sum_{n=1}^{N}\langle x, e_n\rangle\,e_n is precisely the orthogonal
projection of x onto \mathrm{span}\{e_1, \dots, e_N\}
— the best approximation of x by those N
directions, by Theorem 1. Because x - s_N \perp s_N, Pythagoras gives
\|x\|^2 = \|s_N\|^2 + \|x - s_N\|^2 \ge \|s_N\|^2 = \sum_{n=1}^N |\langle x, e_n\rangle|^2,
and letting N \to \infty yields Bessel's inequality.
Let \{e_n\} be an orthonormal system in a Hilbert space
H.
-
Bessel's inequality holds for every orthonormal system:
\displaystyle\sum_{n} |\langle x, e_n\rangle|^2 \le \|x\|^2. In
particular the coefficients are square-summable and
\langle x, e_n\rangle \to 0.
-
Orthonormal basis. The system is complete (an orthonormal basis) when its
closed span is all of H. Then
\displaystyle x = \sum_n \langle x, e_n\rangle\,e_n for every
x — the generalised Fourier series, convergent in norm.
-
Parseval's identity. For a basis, Bessel's inequality becomes an equality:
\displaystyle \|x\|^2 = \sum_n |\langle x, e_n\rangle|^2. "Energy in the
signal equals energy in the coefficients."
The classical Fourier series is the special case H = L^2[-\pi, \pi] with the
orthonormal system \big\{\tfrac{1}{\sqrt{2\pi}}e^{inx}\big\}_{n \in \mathbb{Z}}.
Parseval's identity then reads \int_{-\pi}^{\pi} |f|^2 = 2\pi\sum_n |c_n|^2 —
the statement that a function and its spectrum carry the same total energy, the mathematical heart of
every equaliser, MP3 encoder and spectrum analyser. Abstractly, Parseval says a separable Hilbert space
is isometrically the coordinate space \ell^2: choose a basis and
every vector becomes its sequence of coefficients.
Theorem 3 — Riesz representation: a Hilbert space is its own dual
A bounded linear functional is a continuous linear "measurement"
f : H \to \mathbb{K}; together they form the
dual space
H^{*}. In a Hilbert space every such measurement has a startlingly concrete
form: it is just taking the inner product with one fixed vector.
Let H be a Hilbert space and
f \in H^{*} a bounded linear functional. Then:
- there is a unique y \in H with
f(x) = \langle x, y\rangle for all x;
- the norms match exactly: \|f\|_{H^{*}} = \|y\|_{H};
- the map f \mapsto y is a (conjugate-linear) isometric
isomorphism H^{*} \cong H — a Hilbert space is
canonically its own dual.
The vector y is built by Theorem 1. If f = 0
take y = 0. Otherwise the kernel M = \ker f is a
closed subspace with M \ne H, so
M^{\perp} contains a unit vector e — that is where
completeness and the projection theorem enter — and one checks that
y = \overline{f(e)}\,e does the job. Uniqueness is immediate: if
\langle x, y_1\rangle = \langle x, y_2\rangle for all
x, put x = y_1 - y_2 to get
\|y_1 - y_2\|^2 = 0.
This is a striking contrast with the wider world of
Banach spaces, where the
dual is usually a different space (the dual of \ell^1 is
\ell^\infty, and the dual of \ell^\infty is bigger
still). Only in a Hilbert space does H^{*} fold back onto
H itself. Self-duality is what lets us define adjoint operators
\langle Tx, y\rangle = \langle x, T^{*}y\rangle, and it is the reason the
bra \langle \psi| and the ket
|\psi\rangle of quantum mechanics are two faces of the same state.
Frigyes Riesz (1880–1956) was one of the founders of functional analysis, and this 1907 result — proved
independently the same year by Maurice Fréchet — was among the first to show that abstract Hilbert
space had teeth. The magic is that an arbitrary continuous linear rule, however it is
described, must secretly be an inner product with a single hidden vector. Infinitely many possible
"measurements" collapse to a choice of one direction to measure along.
It also quietly powers the rest of analysis. The existence of weak solutions to
partial differential equations (the Lax–Milgram theorem) is Riesz representation wearing a coat; so is
the existence of conditional expectation in probability, which is literally an
orthogonal projection in L^2. Once you see a Hilbert space, you start seeing
"drop a perpendicular" everywhere.
The three theorems, in one breath
Everything above rests on the single fact that a Hilbert-space norm knows angles. From it:
-
Projection — closed convex sets have unique nearest points, and
H = M \oplus M^{\perp}. (Best approximation, least squares.)
-
Fourier series — x = \sum_n \langle x, e_n\rangle e_n with
Bessel's inequality \sum|\hat{x}_n|^2 \le \|x\|^2 and, for a basis,
Parseval \|x\|^2 = \sum|\hat{x}_n|^2. (Signal decomposition.)
-
Riesz — f(x) = \langle x, y\rangle,
\|f\| = \|y\|, so H^{*} \cong H. (Self-duality,
adjoints, quantum bra–ket.)
A Banach space gives you none of these for free. That gap — closed by the parallelogram law — is the
whole reason Hilbert spaces sit in a class of their own.