The Dual Space
Here is a shift in viewpoint that runs through all of modern analysis. Until now you have studied a
vector space by looking at its vectors — adding them, scaling them, measuring their lengths.
The dual viewpoint studies the space by looking at all the ways of measuring a
vector: every linear "reading" you could take of it. Collect all those readings into a space of their
own, and that new space — the dual — turns out to encode the original so faithfully
that whole theorems are proved by moving back and forth between a space and its dual.
A concrete picture first. A shop sells n goods; a purchase is a vector
x = (x_1, \dots, x_n) of quantities. Fix a price list
p = (p_1, \dots, p_n). The total cost of the basket is
f(x) = p_1 x_1 + p_2 x_2 + \cdots + p_n x_n = \langle p, x\rangle.
This cost rule f is a machine that eats a vector and returns a single
number. It is linear — doubling the basket doubles the bill, and the cost of two
baskets combined is the sum of their costs — and it is the archetype of a functional. The
price list is not itself a basket you can buy; it lives in a different world, the world of
measuring devices. That world is the dual space, and this page is about what lives there.
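The cost rule is easy to try out numerically. The sketch below is only an illustration: the prices and baskets are made-up numbers, and `cost` is a hypothetical helper name, not something defined elsewhere on this page.

```python
# Illustrative sketch: a price list p acts on baskets x as f(x) = <p, x>.
# All numbers below are made up for the example.

def cost(p, x):
    """Total cost of basket x at prices p: the sum of p_i * x_i."""
    return sum(pi * xi for pi, xi in zip(p, x))

p = [2.0, 0.5, 10.0]   # the price list (the functional)
x = [3, 4, 1]          # one basket (a vector)
y = [1, 0, 2]          # another basket

combined = [xi + yi for xi, yi in zip(x, y)]   # the two baskets together
doubled = [2 * xi for xi in x]                 # the first basket, doubled

print(cost(p, x))         # 2*3 + 0.5*4 + 10*1 = 18.0
print(cost(p, combined))  # same as cost(p, x) + cost(p, y)
print(cost(p, doubled))   # same as 2 * cost(p, x)
```

Note that `p` and `x` are both plain lists here, which is exactly the finite-dimensional coincidence the page goes on to explain: the functional and the vector happen to have the same shape, but they play different roles.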
Linear functionals and the dual space
Let X be a normed vector space over the scalar field
\mathbb{K} (read \mathbb{R} or
\mathbb{C} throughout). A linear functional is a linear
map f : X \to \mathbb{K} — the codomain is the scalars, not another
vector space. It is a special case of a
linear operator,
the case where the target is the humble one-dimensional space
\mathbb{K} itself.
As with any operator, linearity alone is too weak in infinite dimensions: we insist on
continuity, which for a linear map is the same thing as
boundedness. A functional f is bounded if it
does not blow up faster than the length of its input — there is a constant
M with
|f(x)| \le M\,\|x\| \qquad \text{for every } x \in X.
Geometrically that says the numbers f reads off a bounded set stay
bounded. The smallest such M is the norm of
f, the operator norm inherited from the theory of bounded operators:
\|f\| \;=\; \sup_{x \neq 0} \frac{|f(x)|}{\|x\|} \;=\; \sup_{\|x\| = 1} |f(x)| \;=\; \sup_{\|x\| \le 1} |f(x)|.
The three suprema agree by homogeneity: scaling x scales numerator and
denominator together, so the ratio only depends on the direction of x,
and you may as well test unit vectors. The value \|f\| is the
largest reading the device f can produce from a unit input, and
it delivers the working inequality
|f(x)| \le \|f\|\,\|x\| — the one estimate you use again and again.
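The three suprema can be explored numerically in a small case. The sketch below assumes X = \mathbb{R}^2 with the Euclidean norm and a sample functional with made-up coefficients; sampling the unit circle only approximates the supremum from below, so treat the printed value as an estimate.

```python
import math

def f(x):
    # Sample functional on R^2 (made-up coefficients): f(x) = 3*x1 + 4*x2.
    return 3.0 * x[0] + 4.0 * x[1]

def norm2(x):
    # Euclidean norm on R^2 (an assumption of this sketch).
    return math.hypot(x[0], x[1])

# Homogeneity: the ratio |f(x)| / ||x|| depends only on the direction of x.
x = (1.0, 2.0)
scaled = (7.0 * x[0], 7.0 * x[1])
ratio = abs(f(x)) / norm2(x)
ratio_scaled = abs(f(scaled)) / norm2(scaled)

# Approximate the sup over unit vectors by sampling many directions.
N = 100_000
estimate = max(
    abs(f((math.cos(2 * math.pi * k / N), math.sin(2 * math.pi * k / N))))
    for k in range(N)
)

print(ratio, ratio_scaled)  # the two ratios agree up to rounding
print(estimate)             # close to 5, the Euclidean length of (3, 4)
```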
For a normed space X, the (continuous) dual space is
the set of all bounded linear functionals on X,
X^* \;=\; B(X, \mathbb{K}) \;=\; \{\, f : X \to \mathbb{K} \mid f \text{ linear and bounded}\,\},
- made a vector space by (f + g)(x) = f(x) + g(x) and
(\lambda f)(x) = \lambda\, f(x);
- normed by the operator norm
\|f\| = \sup_{\|x\|=1} |f(x)|;
- always a Banach space (complete under this norm), whether or not
X is complete.
That last bullet deserves emphasis because it is a small miracle. The domain
X can be full of holes — merely a normed space, not complete — and yet
X^* never is. The reason is that completeness of
X^* is inherited from the target, and the target
\mathbb{K} is complete. A Cauchy sequence of functionals
(f_n) is Cauchy at every point (because
|f_n(x) - f_m(x)| \le \|f_n - f_m\|\,\|x\|), so it has a pointwise limit
f(x) = \lim_n f_n(x); a short argument shows this
f is linear, bounded, and the limit in norm. This is exactly the
general fact that B(X, Y) is a Banach space whenever
Y is — applied with Y = \mathbb{K}.
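The "short argument" can be spelled out as follows. This is a sketch of the standard proof in the notation above, with \varepsilon and N introduced only for this derivation.

```latex
% Sketch: a Cauchy sequence (f_n) in X^* converges in norm to its pointwise limit f.
\begin{align*}
&\text{Linearity: } f(\alpha x + \beta y) = \lim_n f_n(\alpha x + \beta y)
   = \alpha \lim_n f_n(x) + \beta \lim_n f_n(y) = \alpha f(x) + \beta f(y). \\
&\text{Given } \varepsilon > 0, \text{ choose } N \text{ with }
   \|f_n - f_m\| \le \varepsilon \text{ for all } n, m \ge N. \\
&\text{Then for every } x \in X \text{ and } n, m \ge N: \quad
   |f_n(x) - f_m(x)| \le \varepsilon\,\|x\|. \\
&\text{Let } m \to \infty: \quad |f_n(x) - f(x)| \le \varepsilon\,\|x\|
   \quad \text{for all } n \ge N. \\
&\text{Hence } f_n - f \text{ is bounded with } \|f_n - f\| \le \varepsilon,
   \text{ so } f = f_n - (f_n - f) \in X^* \text{ and } f_n \to f \text{ in norm.}
\end{align*}
```

Completeness of X is never used: the only limit taken is a limit of scalars f_m(x), which exists because \mathbb{K} is complete.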
What a functional looks like: parallel hyperplanes
A functional is best seen, not just computed. Take
X = \mathbb{R}^2 and a nonzero vector
a = (a_1, a_2), and let
f(x) = \langle a, x\rangle = a_1 x_1 + a_2 x_2. For each constant
c, the level set
\{\, x : f(x) = c \,\} \;=\; \{\, x : \langle a, x\rangle = c\,\}
is a straight line, and as c ranges over the reals these lines sweep out
a family of parallel lines filling the plane. Two features carry over to every
dimension (where the level sets are parallel hyperplanes) and to every functional:
- Direction. The vector a (the "gradient" of
f) is perpendicular to every level line, because moving
along a level line keeps \langle a, x\rangle constant, so the
displacement is orthogonal to a. The kernel
\ker f = \{x : f(x)=0\} is the level line through the origin.
- Spacing encodes the norm. Step from the line
f = c to the line f = c+1 along the
direction of a: the distance you travel is
1/\|a\|_2. Since here
\|f\| = \|a\|_2 (proved below), a large norm means the level
lines are packed tightly, so the functional changes value quickly as you cross them, while
a small norm spreads them apart. The norm of a functional is literally the reciprocal of the
distance between level sets whose values differ by 1.
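Both features can be checked with a few lines of arithmetic. The sketch below uses a made-up gradient a = (3, 4) and an arbitrary level c = 2; the variable names are illustrative only.

```python
import math

a = (3.0, 4.0)             # made-up gradient vector
norm_a = math.hypot(*a)    # Euclidean length of a, here 5

def f(x):
    return a[0] * x[0] + a[1] * x[1]

unit = (a[0] / norm_a, a[1] / norm_a)   # unit vector in the direction of a

c = 2.0
# The point on the line f = c that lies on the ray through a.
t0 = c / norm_a
start = (t0 * unit[0], t0 * unit[1])

# Travel a distance of 1/||a|| in the direction of a.
step = 1.0 / norm_a
end = (start[0] + step * unit[0], start[1] + step * unit[1])

print(f(start), f(end))   # approximately c and c + 1

# A displacement along a level line is perpendicular to a, so f does not change.
along = (-a[1], a[0])
print(f(along))           # 0.0
```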
[Interactive figure: the level lines of f(x) = \langle a, x\rangle in the plane. One slider,
\theta, rotates the gradient a and with it the whole family of lines; the other,
\|a\| (which equals \|f\|), stretches a, tightening or loosening the packing. The arrow stays
perpendicular to the lines, and the integer level lines crowd together as the norm grows.]
The dual of a finite-dimensional space is a copy of itself
On \mathbb{R}^n there are no unbounded linear functionals — in
finite dimensions linearity forces continuity — so the continuous dual is the whole algebraic dual,
and it has a wonderfully clean description.
Every linear functional f : \mathbb{R}^n \to \mathbb{R} is
inner product against a fixed vector: there is a unique
a \in \mathbb{R}^n with
f(x) = \langle a, x\rangle = \sum_{i=1}^{n} a_i x_i, \qquad \text{and} \qquad \|f\| = \|a\|_2 = \sqrt{\textstyle\sum_i a_i^2}.
The vector is read off the standard basis: a_i = f(e_i), since
f(x) = f(\sum_i x_i e_i) = \sum_i x_i f(e_i) by linearity. That the
operator norm equals the Euclidean length of a is a two-line squeeze.
Cauchy–Schwarz gives the upper bound
|f(x)| = |\langle a, x\rangle| \le \|a\|_2\,\|x\|_2, so
\|f\| \le \|a\|_2; and for a \neq 0, testing the single unit vector
x = a/\|a\|_2 makes it an equality,
f(x) = \langle a, a\rangle / \|a\|_2 = \|a\|_2. So
\|f\| = \|a\|_2 exactly (the case a = 0 is trivial, since then f = 0).
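The recipe a_i = f(e_i) and the two-line squeeze translate directly into code. In the sketch below, `make_functional` is a hypothetical stand-in for a "black box" linear functional on \mathbb{R}^3; its hidden coefficients are made up, and the point is that they are recovered from the values on the standard basis alone.

```python
import math

def make_functional():
    # Hypothetical black-box linear functional on R^3 (made-up coefficients).
    hidden = (1.0, -2.0, 2.0)
    return lambda x: sum(h * xi for h, xi in zip(hidden, x))

f = make_functional()
n = 3

# Standard basis e_1, ..., e_n and the representing vector a_i = f(e_i).
basis = [tuple(1.0 if i == j else 0.0 for j in range(n)) for i in range(n)]
a = [f(e) for e in basis]
norm_a = math.sqrt(sum(ai * ai for ai in a))

# Upper bound (Cauchy-Schwarz) on a made-up test vector.
x = (0.5, 4.0, -1.0)
lhs = abs(f(x))
rhs = norm_a * math.sqrt(sum(xi * xi for xi in x))

# Equality at the unit vector a / ||a||.
unit = tuple(ai / norm_a for ai in a)

print(a)                 # [1.0, -2.0, 2.0]
print(lhs <= rhs)        # True
print(f(unit), norm_a)   # both approximately 3
```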
The map a \mapsto \langle a, \cdot\rangle is thus a norm-preserving
isomorphism (\mathbb{R}^n)^* \cong \mathbb{R}^n. In finite dimensions
the dual is just a mirror image of the space. The subtleties of duality — where
X^* can be genuinely bigger, smaller, or differently shaped than
X — are an infinite-dimensional phenomenon, and to those we now turn.
The duality of the sequence spaces \ell^p
The first infinite-dimensional example, and the one every course drills, is the family of
sequence spaces. For 1 \le p < \infty,
\ell^p = \Big\{\, x = (x_1, x_2, \dots) : \|x\|_p = \Big(\sum_{n=1}^\infty |x_n|^p\Big)^{1/p} < \infty \,\Big\},
and \ell^\infty is the bounded sequences with
\|x\|_\infty = \sup_n |x_n|. The pairing that generates functionals is
the natural dot product of two sequences, and the inequality that controls it is
Hölder's inequality.
Let 1 < p < \infty and let q be its
conjugate exponent, defined by
\frac{1}{p} + \frac{1}{q} = 1 \qquad\Longleftrightarrow\qquad q = \frac{p}{p-1}.
Then for x \in \ell^p and y \in \ell^q the
paired sum converges absolutely and
\Big|\sum_{n=1}^\infty x_n y_n\Big| \;\le\; \sum_{n=1}^\infty |x_n y_n| \;\le\; \|x\|_p\,\|y\|_q.
Read this as a statement about functionals. Fix
y \in \ell^q and define
f_y(x) = \sum_n x_n y_n. Hölder says exactly that
f_y is a bounded functional on
\ell^p with \|f_y\| \le \|y\|_q. The deep
theorem is that this accounts for all of them, and with equality:
For 1 \le p < \infty with conjugate exponent
q (so q = \infty when
p = 1), the map
y \mapsto f_y is an isometric isomorphism
(\ell^p)^* \;\cong\; \ell^q, \qquad \|f_y\| = \|y\|_q.
Every bounded functional on \ell^p is
"dot with a fixed \ell^q sequence", and its norm is the
\ell^q norm of that sequence.
So (\ell^2)^* \cong \ell^2 (self-dual, as
q = 2 when p = 2),
(\ell^1)^* \cong \ell^\infty, and
(\ell^{3/2})^* \cong \ell^3. The exponent flips to its conjugate; the
pairing is always the same honest sum. Note the asymmetry already lurking: the theorem covers
p < \infty, so it tells us (\ell^1)^* \cong \ell^\infty
but says nothing yet about (\ell^\infty)^* — and that gap is where
reflexivity will break.
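Hölder's inequality and the equality \|f_y\| = \|y\|_q can both be tested on finitely supported sequences. The sketch below takes p = 3/2 and made-up sequences; the extremal choice x^*_n = \operatorname{sign}(y_n)\,|y_n|^{q-1} is the standard one used to show the norm is attained. The helper names are hypothetical.

```python
import math

def conjugate(p):
    """Conjugate exponent q with 1/p + 1/q = 1, for 1 < p < infinity."""
    return p / (p - 1.0)

def p_norm(x, p):
    """The l^p norm of a finitely supported sequence given as a list."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def pair(x, y):
    """The pairing sum of x_n * y_n."""
    return sum(s * t for s, t in zip(x, y))

p = 1.5
q = conjugate(p)                   # 3.0
y = [1.0, -2.0, 0.5, 0.0, 3.0]     # made-up element of l^q
x = [0.3, 1.0, -4.0, 2.0, 0.1]     # made-up element of l^p

# Hoelder: |sum x_n y_n| <= ||x||_p * ||y||_q.
holder_ok = abs(pair(x, y)) <= p_norm(x, p) * p_norm(y, q)

# Extremal sequence: sign(y_n) * |y_n|^(q-1) turns the bound into an equality.
x_star = [math.copysign(abs(t) ** (q - 1.0), t) for t in y]
ratio = abs(pair(x_star, y)) / p_norm(x_star, p)

print(q, holder_ok)
print(ratio, p_norm(y, q))   # the two numbers agree up to rounding
```

This only illustrates that f_y attains the norm \|y\|_q. The harder half of the theorem, that every bounded functional on \ell^p has the form f_y, is not something a finite computation can show.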
The clean special case: Riesz representation in a Hilbert space
When the norm comes from an inner product — a Hilbert space
H — the finite-dimensional picture returns in full, verbatim, in infinite
dimensions. This is the Riesz representation theorem, one of the cornerstones you
will revisit in depth, stated here as the guiding flavour of duality.
Let H be a Hilbert space. For every bounded functional
f \in H^* there is a unique vector
y \in H such that
f(x) = \langle x, y\rangle \quad \text{for all } x \in H, \qquad \text{and} \qquad \|f\| = \|y\|.
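The map f \mapsto y identifies
H^* with H isometrically (linearly when \mathbb{K} = \mathbb{R},
conjugate-linearly when \mathbb{K} = \mathbb{C}): a Hilbert space is its
own dual.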
Since \ell^2 is a Hilbert space with
\langle x, y\rangle = \sum_n x_n \overline{y_n}, this recovers
(\ell^2)^* \cong \ell^2 as a special case — the self-duality above is
Riesz in disguise. The moral to carry forward: inner-product geometry makes measuring devices
indistinguishable from vectors. Every functional is "project onto a fixed direction", and its
norm is the length of that direction. Riesz is the reason Hilbert spaces are so much friendlier than
general Banach spaces, and it is the engine behind adjoints, orthogonal projections, and weak
formulations of differential equations.
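In finite dimensions the Riesz vector can be computed explicitly. The sketch below works in \mathbb{C}^2 with the inner product \langle x, y\rangle = \sum_n x_n \overline{y_n}; the functional `f` and its coefficients are made up. Note the conjugate in y_k = \overline{f(e_k)}, which comes from the inner product being conjugate-linear in its second slot.

```python
import math

def inner(x, y):
    """Inner product <x, y> = sum of x_n * conj(y_n), linear in x."""
    return sum(s * t.conjugate() for s, t in zip(x, y))

def f(x):
    # Hypothetical bounded functional on C^2 (made-up coefficients).
    return (1 + 2j) * x[0] + (3 - 1j) * x[1]

# Representing vector: y_k = conj(f(e_k)) on the standard basis.
basis = [(1 + 0j, 0j), (0j, 1 + 0j)]
y = [f(e).conjugate() for e in basis]

# f(x) agrees with <x, y> on a made-up test vector.
x = (2 - 1j, 0.5 + 4j)
gap = abs(f(x) - inner(x, y))

# The norm of f is attained at the unit vector y / ||y||.
norm_y = math.sqrt(inner(y, y).real)
unit_y = [t / norm_y for t in y]
attained = abs(f(unit_y))

print(y)                  # [(1-2j), (3+1j)]
print(gap)                # 0.0 up to rounding
print(attained, norm_y)   # both approximately sqrt(15)
```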
In finite dimensions every linear functional is automatically bounded, so it is tempting
to think "bounded" is a formality. It is not. On an infinite-dimensional normed space there really
are linear maps X \to \mathbb{K} with no finite
M: take a Hamel basis of unit vectors, pick a sequence
b_1, b_2, b_3, \dots of distinct basis vectors, send b_k to k and every other basis
vector to 0, and extend linearly. The result is perfectly linear and wildly unbounded. Such a
monster is discontinuous, and it is deliberately excluded from
X^*.
The finiteness of \|f\| is therefore a genuine hypothesis, and it is
precisely what makes X^* a normed space we can do analysis on. The
payoff is that the operator norm turns the bare set of continuous functionals into a
complete normed space — a Banach space — on which we can take limits, form series of functionals,
and prove theorems. Boundedness is not red tape; it is the price of admission to the geometry.
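A Hamel-basis construction cannot be run on a computer, but a closely related standard example can. The sketch below swaps in a different space from the one discussed above: the finitely supported sequences with the sup norm (a normed space that is not complete), with the functional f(x) = \sum_n n\,x_n. The names `f`, `sup_norm`, and `e` are hypothetical helpers.

```python
def f(x):
    """f(x) = sum over n >= 1 of n * x_n, for a finitely supported sequence (a list)."""
    return sum((n + 1) * t for n, t in enumerate(x))

def sup_norm(x):
    return max(abs(t) for t in x)

def e(n):
    """The n-th standard unit sequence (1-indexed), stored as a list of length n."""
    return [0.0] * (n - 1) + [1.0]

# Every e_n has norm 1, yet f(e_n) = n grows without bound: no constant M works.
ratios = [abs(f(e(n))) / sup_norm(e(n)) for n in (1, 10, 100, 1000)]
print(ratios)   # [1.0, 10.0, 100.0, 1000.0]
```

The functional is linear, since each finite sum is linear in x, but the ratios |f(x)|/\|x\| are unbounded, so f is not in the dual space.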
The double dual and reflexivity
Since X^* is itself a Banach space, it too has a dual — the
double dual X^{**} = (X^*)^*, the bounded functionals on
the functionals. This is less abstract than it sounds, because there is a completely canonical way to
turn a vector into a functional-of-functionals: let the vector evaluate the devices pointed at
it.
For x \in X define
\hat{x} : X^* \to \mathbb{K} by
\hat{x}(f) = f(x) \qquad (f \in X^*).
Reading it aloud: "\hat x hands each functional
f the number that f assigns to
x." This \hat x is linear in
f, and bounded, since
|\hat x(f)| = |f(x)| \le \|f\|\,\|x\| shows
\|\hat x\| \le \|x\|. In fact — this needs the Hahn–Banach theorem, which
supplies a norming functional — the inequality is an equality:
The map J : X \to X^{**}, J(x) = \hat{x} with
\hat{x}(f) = f(x), is a linear isometry:
\|\hat{x}\|_{X^{**}} = \|x\|_X \quad \text{for all } x \in X.
So X embeds, norm and all, as a subspace of
X^{**}: X \hookrightarrow X^{**}.
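The evaluation map is a one-liner once functionals are modelled as functions. The sketch below assumes X = \mathbb{R}^2 with the Euclidean norm, where a norming functional can be written down explicitly and Hahn–Banach is not needed; `hat` and `functional` are hypothetical helper names.

```python
import math

def hat(x):
    """Canonical embedding J(x): the functional-of-functionals f -> f(x)."""
    return lambda f: f(x)

def functional(a):
    """The functional <a, .> on R^n; with the Euclidean norm its norm is ||a||_2."""
    return lambda v: sum(ai * vi for ai, vi in zip(a, v))

x = (3.0, 4.0)
x_hat = hat(x)

f = functional((1.0, 2.0))
g = functional((-5.0, 0.5))
f_plus_g = lambda v: f(v) + g(v)

print(x_hat(f))                                # f(x) = 11.0
print(x_hat(f_plus_g) == x_hat(f) + x_hat(g))  # True: x_hat is linear in f

# A norm-one functional on which x_hat reaches ||x||.
norm_x = math.hypot(*x)
norming = functional((x[0] / norm_x, x[1] / norm_x))
print(x_hat(norming), norm_x)                  # both approximately 5
```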
The embedding J is always injective and always isometric — but it need
not be onto. When it is, when J(X) = X^{**} and the space "sees
all of its double dual", we call X reflexive.
- X is reflexive when the canonical embedding
J : X \to X^{**} is surjective (hence an isometric isomorphism).
- \ell^p is reflexive for
1 < p < \infty: applying duality twice gives
(\ell^p)^{**} \cong (\ell^q)^* \cong \ell^p, and one checks the
round trip is exactly J.
- \ell^1, \ell^\infty, and
C[a,b] are not reflexive.
- Every Hilbert space is reflexive (Riesz applied twice).
Reflexivity is one of the most useful good-behaviour hypotheses in the subject: in a reflexive space
bounded sequences have weakly convergent subsequences, which is exactly what you need to extract
minimizers in the calculus of variations and solutions of partial differential equations. The
non-reflexive spaces are precisely where those existence arguments have to work harder.
The relation (\ell^p)^* \cong \ell^q is not symmetric at the
endpoints, and this is the classic trap. It is true that
(\ell^1)^* \cong \ell^\infty. But you may
not conclude (\ell^\infty)^* \cong \ell^1. The dual
of \ell^\infty is strictly bigger than
\ell^1: it contains exotic functionals, such as Banach limits (built with Hahn–Banach)
and limits along a free ultrafilter, that are not given by dotting with any
summable sequence. Chase the double dual and you see reflexivity fail:
(\ell^1)^{**} \cong (\ell^\infty)^*, which is properly larger than
\ell^1 — so the canonical J for
\ell^1 is not onto.
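To see concretely why such a functional cannot come from \ell^1, here is a sketch of the standard argument. It assumes L is any bounded functional on \ell^\infty that agrees with the ordinary limit on convergent sequences (Hahn–Banach guarantees that one exists); the symbols L, e_k, and \mathbf{1} are notation introduced only for this sketch.

```latex
% Sketch: an extension L of the limit functional is not of the form f_y with y in l^1.
\begin{align*}
&\text{Suppose } L(x) = \sum_{n} x_n y_n \text{ for some } y \in \ell^1
   \text{ and all } x \in \ell^\infty. \\
&\text{Test on } e_k = (0, \dots, 0, 1, 0, \dots): \quad
   y_k = L(e_k) = \lim_{n \to \infty} (e_k)_n = 0 \quad \text{for every } k. \\
&\text{So } y = 0 \text{ and therefore } L = 0. \\
&\text{But } \mathbf{1} = (1, 1, 1, \dots) \text{ converges to } 1,
   \text{ so } L(\mathbf{1}) = 1 \neq 0. \\
&\text{Contradiction: } L \in (\ell^\infty)^* \text{ is not } f_y
   \text{ for any } y \in \ell^1.
\end{align*}
```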
A second, subtler confusion sits underneath: the algebraic dual (all linear
functionals) versus the continuous dual X^* (only the
bounded ones). In finite dimensions they coincide, which lulls you into treating them as the same
object. In infinite dimensions the algebraic dual is enormous — strictly larger than
X^* — and none of the norm theory applies to it. In functional
analysis "the dual" always means the continuous dual
X^* = B(X, \mathbb{K}). Whenever a "dual space" statement smells wrong,
check first that everyone is talking about bounded functionals.
The pattern, in one table
Duality is a machine with a small number of moving parts. Everything above is a special case of
"find all the bounded ways to read a vector":
- (\mathbb{R}^n)^* \cong \mathbb{R}^n — every functional is
\langle a, \cdot\rangle; self-dual, reflexive.
- (\ell^p)^* \cong \ell^q for 1 \le p < \infty,
\tfrac1p + \tfrac1q = 1; reflexive for
1 < p < \infty.
- (\ell^1)^* \cong \ell^\infty, but
(\ell^\infty)^* \supsetneq \ell^1 — not reflexive.
- H^* \cong H (Riesz) — every Hilbert space is self-dual and
reflexive.
- Always: X^* is Banach, and
X \hookrightarrow X^{**} isometrically via
\hat{x}(f) = f(x).
Keep the geometric picture underneath the algebra: a functional is a family of parallel hyperplanes,
its norm is how tightly they pack, and the dual space is the space of all such families. The rest of
functional analysis — Hahn–Banach, weak topologies, adjoints — is written in this language.