The Dual Space

Here is a shift in viewpoint that runs through all of modern analysis. Until now you have studied a vector space by looking at its vectors — adding them, scaling them, measuring their lengths. The dual viewpoint studies the space by looking at all the ways of measuring a vector: every linear "reading" you could take of it. Collect all those readings into a space of their own, and that new space — the dual — turns out to encode the original so faithfully that whole theorems are proved by moving back and forth between a space and its dual.

A concrete picture first. A shop sells n goods; a purchase is a vector x = (x_1, \dots, x_n) of quantities. Fix a price list p = (p_1, \dots, p_n). The total cost of the basket is

f(x) = p_1 x_1 + p_2 x_2 + \cdots + p_n x_n = \langle p, x\rangle.

This cost rule f is a machine that eats a vector and returns a single number. It is linear — doubling the basket doubles the bill, and the cost of two baskets combined is the sum of their costs — and it is the archetype of a functional. The price list is not itself a basket you can buy; it lives in a different world, the world of measuring devices. That world is the dual space, and this page is about what lives there.
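The two linearity properties just described can be checked directly on a small example. The sketch below is only an illustration: the price list and the two baskets are hypothetical numbers chosen so that the arithmetic is exact.

```python
# Hypothetical price list for n = 3 goods (illustrative values only).
prices = [2.0, 0.5, 4.0]

def cost(basket):
    """The cost functional f(x) = <p, x> for the fixed price list above."""
    return sum(p * x for p, x in zip(prices, basket))

# Two hypothetical baskets.
x = [3.0, 10.0, 1.0]
y = [1.0, 2.0, 0.0]

combined = [xi + yi for xi, yi in zip(x, y)]   # the two baskets together
doubled = [2.0 * xi for xi in x]               # the first basket, twice

print(cost(x))                                # 15.0
print(cost(combined) == cost(x) + cost(y))    # additivity: True
print(cost(doubled) == 2.0 * cost(x))         # homogeneity: True
```

Note that `cost` takes a basket and returns a number, while `prices` is never passed in as a basket: the price list parametrizes the measuring device rather than being measured by it.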

Linear functionals and the dual space

Let X be a normed vector space over the scalar field \mathbb{K} (read \mathbb{R} or \mathbb{C} throughout). A linear functional is a linear map f : X \to \mathbb{K}: the codomain is the scalar field itself rather than a general vector space. A bounded linear functional is thus a special case of a bounded linear operator, the case where the target is the humble one-dimensional space \mathbb{K}.

As with any operator, linearity alone is too weak in infinite dimensions: we insist on continuity, which for a linear map is the same thing as boundedness. A functional f is bounded if it does not blow up faster than the length of its input — there is a constant M with

|f(x)| \le M\,\|x\| \qquad \text{for every } x \in X.

Geometrically that says the numbers f reads off a bounded set stay bounded. The smallest such M is the norm of f, the operator norm inherited from the theory of bounded operators:

\|f\| \;=\; \sup_{x \neq 0} \frac{|f(x)|}{\|x\|} \;=\; \sup_{\|x\| = 1} |f(x)| \;=\; \sup_{\|x\| \le 1} |f(x)|.

The three suprema agree by homogeneity: scaling x scales numerator and denominator together, so the ratio only depends on the direction of x, and you may as well test unit vectors. The value \|f\| is the largest reading the device f can produce from a unit input, and it delivers the working inequality |f(x)| \le \|f\|\,\|x\| — the one estimate you use again and again.
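To make the supremum concrete, here is a minimal numerical sketch on X = \mathbb{R}^2 with the Euclidean norm. The functional f(x) = 3x_1 + 4x_2 and the sample count are assumptions chosen for illustration. Sampling the unit circle can only approach the supremum from below, so the result is an estimate, not a proof.

```python
import math

# Hypothetical coefficients: f(x) = 3*x1 + 4*x2 on R^2.
a = (3.0, 4.0)

def f(x):
    return a[0] * x[0] + a[1] * x[1]

def estimate_norm(functional, samples=100_000):
    """Approximate the sup of |functional(x)| over the Euclidean unit circle
    by evaluating it at evenly spaced angles (a lower bound for the sup)."""
    best = 0.0
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        best = max(best, abs(functional((math.cos(t), math.sin(t)))))
    return best

norm_f = estimate_norm(f)
print(norm_f)   # close to 5, the value f takes at the unit vector (3/5, 4/5)

# The working inequality |f(x)| <= ||f|| ||x|| on an arbitrary test vector.
x = (-7.0, 2.5)
print(abs(f(x)) <= norm_f * math.hypot(*x) + 1e-9)
```

The small tolerance in the last line is there because the sampled estimate may sit slightly below the true supremum.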

For a normed space X, the (continuous) dual space is the set of all bounded linear functionals on X,

X^* \;=\; B(X, \mathbb{K}) \;=\; \{\, f : X \to \mathbb{K} \mid f \text{ linear and bounded}\,\},

normed by the operator norm \|f\| defined above. With this norm X^* is always a Banach space.

That completeness deserves emphasis because it is a small miracle. The domain X can be full of holes (merely a normed space, not complete) and yet X^* never is. The reason is that completeness of X^* is inherited from the target, and the target \mathbb{K} is complete. A Cauchy sequence of functionals (f_n) is Cauchy at every point (because |f_n(x) - f_m(x)| \le \|f_n - f_m\|\,\|x\|), so it has a pointwise limit f(x) = \lim_n f_n(x); a short argument shows this f is linear, bounded, and the limit in norm. This is exactly the general fact that B(X, Y) is a Banach space whenever Y is, applied with Y = \mathbb{K}.
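The short argument can be sketched as follows, assuming only that (f_n) is Cauchy in the operator norm and that f is its pointwise limit as above. Linearity of f follows by passing to the limit in f_n(\alpha x + \beta y) = \alpha f_n(x) + \beta f_n(y); the estimate below handles boundedness and norm convergence in one step.

```latex
% Given \varepsilon > 0, choose N with \|f_n - f_m\| \le \varepsilon for all n, m \ge N.
% Fix x \in X and n \ge N, and let m \to \infty in |f_n(x) - f_m(x)| \le \varepsilon \|x\|:
|f_n(x) - f(x)| \;=\; \lim_{m \to \infty} |f_n(x) - f_m(x)| \;\le\; \varepsilon\,\|x\|
\qquad \text{for every } x \in X.
% Hence f - f_n is bounded with \|f - f_n\| \le \varepsilon for all n \ge N, so
f \;=\; f_n + (f - f_n) \in X^*
\qquad \text{and} \qquad
\|f_n - f\| \to 0.
```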

What a functional looks like: parallel hyperplanes

A functional is best seen, not just computed. Take X = \mathbb{R}^2 and a nonzero vector a = (a_1, a_2), and let f(x) = \langle a, x\rangle = a_1 x_1 + a_2 x_2. For each constant c, the level set

\{\, x : f(x) = c \,\} \;=\; \{\, x : \langle a, x\rangle = c\,\}

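is a straight line, and as c ranges over the reals these lines sweep out a family of parallel lines filling the plane. Two features carry over to every dimension (where the level sets are parallel hyperplanes) and to every functional:

- The vector a is perpendicular to every level line: moving along a line leaves f unchanged, so a points in the direction in which f increases fastest.
- The distance between the consecutive level lines \{f = c\} and \{f = c + 1\} is 1/\|a\| = 1/\|f\|: the larger the norm, the tighter the packing.

Picture the family as the vector a is varied. Changing its angle \theta rotates a (and with it the whole family of lines), and changing its length \|a\|, which equals \|f\|, stretches a, tightening or loosening the packing. The arrow stays square to the lines, and the integer level lines crowd together as the norm grows.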
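The geometry of the level lines can be verified numerically. The sketch below is an illustration under assumed values: the vector a = (3, 4) is hypothetical, and the helper `closest_point_on_level` is a name introduced here, not part of any library.

```python
import math

# Hypothetical gradient vector; f(x) = <a, x> on R^2.
a = (3.0, 4.0)
norm_a = math.hypot(*a)          # Euclidean length of a, here 5.0

def f(x):
    return a[0] * x[0] + a[1] * x[1]

def closest_point_on_level(c):
    """Point of the line {f = c} nearest the origin: c * a / ||a||^2."""
    return (c * a[0] / norm_a**2, c * a[1] / norm_a**2)

# Direction along a level line: a rotated by 90 degrees.
tangent = (-a[1], a[0])
print(f(tangent))                # 0.0, so f does not change along the line

# Distance between the level lines f = 0 and f = 1.
spacing = math.dist(closest_point_on_level(0.0), closest_point_on_level(1.0))
print(spacing, 1.0 / norm_a)     # both approximately 0.2
```

Doubling a doubles `norm_a` and halves `spacing`, which is the "crowding together" of the integer level lines as the norm grows.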

The dual of a finite-dimensional space is a copy of itself

On \mathbb{R}^n there are no unbounded linear functionals — in finite dimensions linearity forces continuity — so the continuous dual is the whole algebraic dual, and it has a wonderfully clean description.

On \mathbb{R}^n with the Euclidean norm, every linear functional f : \mathbb{R}^n \to \mathbb{R} is the inner product against a fixed vector: there is a unique a \in \mathbb{R}^n with

f(x) = \langle a, x\rangle = \sum_{i=1}^{n} a_i x_i, \qquad \text{and} \qquad \|f\| = \|a\|_2 = \sqrt{\textstyle\sum_i a_i^2}.

The vector is read off the standard basis: a_i = f(e_i), since f(x) = f(\sum_i x_i e_i) = \sum_i x_i f(e_i) by linearity. That the operator norm equals the Euclidean length of a is a two-line squeeze. Cauchy–Schwarz gives the upper bound |f(x)| = |\langle a, x\rangle| \le \|a\|_2\,\|x\|_2, so \|f\| \le \|a\|_2; and, for a \neq 0, testing the single unit vector x = a/\|a\|_2 gives equality, f(x) = \langle a, a\rangle / \|a\|_2 = \|a\|_2 (the case a = 0 is trivial). So \|f\| = \|a\|_2 exactly.
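Both halves of this argument are easy to replay in code. The following sketch treats a functional on \mathbb{R}^3 as a black box, recovers its representing vector from the standard basis, and tests the unit vector a/\|a\|_2. The particular functional is a hypothetical example.

```python
import math

def f(x):
    # A hypothetical linear functional on R^3, used only as a black box below.
    return 2.0 * x[0] - 1.0 * x[1] + 2.0 * x[2]

n = 3
# Standard basis vectors e_1, ..., e_n.
basis = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Read the representing vector off the basis: a_i = f(e_i).
a = [f(e) for e in basis]
norm_a = math.sqrt(sum(ai * ai for ai in a))
print(a, norm_a)                 # [2.0, -1.0, 2.0] 3.0

# The unit vector a / ||a||_2 attains the norm, up to rounding.
x_star = [ai / norm_a for ai in a]
print(f(x_star))                 # approximately 3.0
```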

The map a \mapsto \langle a, \cdot\rangle is thus a norm-preserving isomorphism (\mathbb{R}^n)^* \cong \mathbb{R}^n. In finite dimensions the dual is just a mirror image of the space. The subtleties of duality — where X^* can be genuinely bigger, smaller, or differently shaped than X — are an infinite-dimensional phenomenon, and to those we now turn.

The duality of the sequence spaces \ell^p

The first infinite-dimensional example, and the one every course drills, is the family of sequence spaces. For 1 \le p < \infty,

\ell^p = \Big\{\, x = (x_1, x_2, \dots) : \|x\|_p = \Big(\sum_{n=1}^\infty |x_n|^p\Big)^{1/p} < \infty \,\Big\},

and \ell^\infty is the bounded sequences with \|x\|_\infty = \sup_n |x_n|. The pairing that generates functionals is the natural dot product of two sequences, and the inequality that controls it is Hölder's inequality.

Let 1 < p < \infty and let q be its conjugate exponent, defined by

\frac{1}{p} + \frac{1}{q} = 1 \qquad\Longleftrightarrow\qquad q = \frac{p}{p-1}.

Then for x \in \ell^p and y \in \ell^q the paired sum converges absolutely and

\Big|\sum_{n=1}^\infty x_n y_n\Big| \;\le\; \sum_{n=1}^\infty |x_n y_n| \;\le\; \|x\|_p\,\|y\|_q.
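Hölder's inequality can be sanity-checked on finite truncations, which lie in every \ell^p. The sketch below assumes p = 3 and two hypothetical sequences cut off after 1000 terms; it illustrates the chain of inequalities and proves nothing about the infinite sums.

```python
def p_norm(v, p):
    """The l^p norm of a finite sequence, for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

p = 3.0
q = p / (p - 1.0)                # conjugate exponent, here 1.5

N = 1000                         # truncation length (an arbitrary choice)
x = [1.0 / n for n in range(1, N + 1)]
y = [(-1.0) ** n / n for n in range(1, N + 1)]

lhs = abs(sum(xi * yi for xi, yi in zip(x, y)))      # |sum x_n y_n|
total_abs = sum(abs(xi * yi) for xi, yi in zip(x, y))  # sum |x_n y_n|
rhs = p_norm(x, p) * p_norm(y, q)                    # ||x||_p ||y||_q

print(lhs <= total_abs <= rhs)   # True
```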

Read this as a statement about functionals. Fix y \in \ell^q and define f_y(x) = \sum_n x_n y_n. Hölder says exactly that f_y is a bounded functional on \ell^p with \|f_y\| \le \|y\|_q. The deep theorem is that this accounts for all of them, and with equality:

For 1 \le p < \infty with conjugate exponent q (so q = \infty when p = 1), the map y \mapsto f_y is an isometric isomorphism

(\ell^p)^* \;\cong\; \ell^q, \qquad \|f_y\| = \|y\|_q.

Every bounded functional on \ell^p is "dot with a fixed \ell^q sequence", and its norm is the \ell^q norm of that sequence.

So (\ell^2)^* \cong \ell^2 (self-dual, as q = 2 when p = 2), (\ell^1)^* \cong \ell^\infty, and (\ell^{3/2})^* \cong \ell^3. The exponent flips to its conjugate; the pairing is always the same honest sum. Note the asymmetry already lurking: the theorem covers p < \infty, so it tells us (\ell^1)^* \cong \ell^\infty but says nothing yet about (\ell^\infty)^* — and that gap is where reflexivity will break.
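The equality \|f_y\| = \|y\|_q for 1 < p < \infty rests on a standard extremal choice: the sequence x_n = \operatorname{sign}(y_n)\,|y_n|^{q-1} lies in \ell^p and turns Hölder's inequality into an equality. The sketch below checks this on a short, hypothetical real sequence y with p = 3. It covers only the "norm is attained" half, not the harder statement that every functional has the form f_y.

```python
def p_norm(v, p):
    """The l^p norm of a finite sequence, for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def sign(t):
    return (t > 0) - (t < 0)

p = 3.0
q = p / (p - 1.0)                        # conjugate exponent, here 1.5

y = [2.0, -1.0, 0.0, 0.5]                # hypothetical element of l^q

# Extremal vector: x_n = sign(y_n) * |y_n|^(q-1).
x = [sign(t) * abs(t) ** (q - 1.0) for t in y]

pairing = sum(xi * yi for xi, yi in zip(x, y))   # f_y(x) = sum |y_n|^q
ratio = pairing / p_norm(x, p)                   # |f_y(x)| / ||x||_p

print(ratio, p_norm(y, q))               # the two values agree up to rounding
```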

The clean special case: Riesz representation in a Hilbert space

When the norm comes from an inner product — a Hilbert space H — the finite-dimensional picture returns in full, verbatim, in infinite dimensions. This is the Riesz representation theorem, one of the cornerstones you will revisit in depth, stated here as the guiding flavour of duality.

Let H be a Hilbert space. For every bounded functional f \in H^* there is a unique vector y \in H such that

f(x) = \langle x, y\rangle \quad \text{for all } x \in H, \qquad \text{and} \qquad \|f\| = \|y\|.

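The map f \mapsto y identifies H^* with H: a Hilbert space is its own dual. The identification is bijective and norm-preserving; over \mathbb{C} it is conjugate-linear rather than linear, because \langle x, \lambda y\rangle = \overline{\lambda}\,\langle x, y\rangle.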

Since \ell^2 is a Hilbert space with \langle x, y\rangle = \sum_n x_n \overline{y_n}, this recovers (\ell^2)^* \cong \ell^2 as a special case (over \mathbb{C}, up to conjugating the representing sequence): the self-duality above is Riesz in disguise. The moral to carry forward: inner-product geometry makes measuring devices indistinguishable from vectors. Every functional is "take the component along a fixed direction, scaled by the length of that direction", and its norm is that length. Riesz is the reason Hilbert spaces are so much friendlier than general Banach spaces, and it is the engine behind adjoints, orthogonal projections, and weak formulations of differential equations.
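In finite dimensions the Riesz vector can be computed explicitly, which also shows where the complex conjugate enters. The sketch below works in \mathbb{C}^2 with the inner product \langle x, y\rangle = \sum_i x_i \overline{y_i}; the functional f is a hypothetical example, and the recipe y_i = \overline{f(e_i)} is the finite-dimensional case only.

```python
import math

def inner(x, y):
    """<x, y> = sum x_i * conj(y_i): linear in x, conjugate-linear in y."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))

def f(x):
    # A hypothetical bounded functional on C^2.
    return (1 + 2j) * x[0] + 3j * x[1]

basis = [[1 + 0j, 0j], [0j, 1 + 0j]]

# Riesz vector: y_i = conj(f(e_i)), so that f(x) = <x, y> for every x.
y = [f(e).conjugate() for e in basis]

x = [2 - 1j, 0.5 + 4j]                   # arbitrary test vector
print(f(x), inner(x, y))                 # the two values coincide

# The norm of f equals the length of y, here sqrt(14).
norm_y = math.sqrt(inner(y, y).real)
print(norm_y)
```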

In finite dimensions every linear functional is automatically bounded, so it is tempting to think "bounded" is a formality. It is not. On an infinite-dimensional normed space there really are linear maps X \to \mathbb{K} with no finite M: take a Hamel basis of normalized vectors, pick a countably infinite subfamily e_1, e_2, e_3, \dots, set f(e_n) = n and f = 0 on the remaining basis vectors, and extend linearly. The result is perfectly linear and wildly unbounded, since |f(e_n)| / \|e_n\| = n. Such a monster is discontinuous, and it is deliberately excluded from X^*.

The finiteness of \|f\| is therefore a genuine hypothesis, and it is precisely what makes X^* a normed space we can do analysis on. The payoff is that the operator norm turns the bare set of continuous functionals into a complete normed space — a Banach space — on which we can take limits, form series of functionals, and prove theorems. Boundedness is not red tape; it is the price of admission to the geometry.
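A concrete unbounded functional is easiest to exhibit on a space where a Hamel basis is explicit. The sketch below assumes the space of finitely supported sequences with the sup norm (a normed space that is not complete), where the standard sequences e_n form a Hamel basis, and takes f(x) = \sum_n n\,x_n. This is one illustrative instance of the construction above, represented with plain Python lists.

```python
def sup_norm(x):
    """Sup norm of a finitely supported sequence given as a list."""
    return max((abs(t) for t in x), default=0.0)

def f(x):
    """f(x) = sum_n n * x_n, with n counted from 1; a finite sum here."""
    return sum(n * t for n, t in enumerate(x, start=1))

def e(n):
    """The n-th standard basis sequence, stored as a list of length n."""
    return [0.0] * (n - 1) + [1.0]

# Each e_n has norm 1, yet the readings f(e_n) = n are unbounded.
ratios = [abs(f(e(n))) / sup_norm(e(n)) for n in (1, 10, 100, 1000)]
print(ratios)   # [1.0, 10.0, 100.0, 1000.0]
```

No constant M can satisfy |f(x)| \le M\,\|x\| for all of these inputs at once, which is exactly the failure of boundedness.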

The double dual and reflexivity

Since X^* is itself a Banach space, it too has a dual — the double dual X^{**} = (X^*)^*, the bounded functionals on the functionals. This is less abstract than it sounds, because there is a completely canonical way to turn a vector into a functional-of-functionals: let the vector evaluate the devices pointed at it.

For x \in X define \hat{x} : X^* \to \mathbb{K} by

\hat{x}(f) = f(x) \qquad (f \in X^*).

Reading it aloud: "\hat x hands each functional f the number that f assigns to x." This \hat x is linear in f, and bounded, since |\hat x(f)| = |f(x)| \le \|f\|\,\|x\| shows \|\hat x\| \le \|x\|. In fact — this needs the Hahn–Banach theorem, which supplies a norming functional — the inequality is an equality:

The map J : X \to X^{**}, J(x) = \hat{x} with \hat{x}(f) = f(x), is a linear isometry:

\|\hat{x}\|_{X^{**}} = \|x\|_X \quad \text{for all } x \in X.

So X embeds, norm and all, as a subspace of X^{**}: X \hookrightarrow X^{**}.

The embedding J is always injective and always isometric — but it need not be onto. When it is, when J(X) = X^{**} and the space "sees all of its double dual", we call X reflexive.
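The evaluation map is short enough to write down literally. In the sketch below functionals are modelled as Python functions on \mathbb{R}^2 and `hat` plays the role of J; the names `hat`, `f`, `g`, and `combo` are hypothetical, and the example shows only the definition and linearity in f, not the isometry (which needs Hahn–Banach in general).

```python
def hat(x):
    """Canonical image of x in the double dual: the map f -> f(x)."""
    return lambda functional: functional(x)

# Two hypothetical functionals on R^2.
def f(v):
    return 3.0 * v[0] + 4.0 * v[1]

def g(v):
    return v[0] - 2.0 * v[1]

x = (1.0, 2.0)
x_hat = hat(x)

print(x_hat(f), f(x))            # 11.0 11.0

# Linearity in the functional: x_hat(2f + g) = 2 * x_hat(f) + x_hat(g).
def combo(v):
    return 2.0 * f(v) + g(v)

print(x_hat(combo) == 2.0 * x_hat(f) + x_hat(g))   # True
```

The roles are swapped relative to ordinary evaluation: the vector is held fixed and the functional is the variable.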

Reflexivity is one of the most useful good-behaviour hypotheses in the subject: in a reflexive space bounded sequences have weakly convergent subsequences, which is exactly what you need to extract minimizers in the calculus of variations and solutions of partial differential equations. The non-reflexive spaces are precisely where those existence arguments have to work harder.

The relation (\ell^p)^* \cong \ell^q is not symmetric at the endpoints, and this is the classic trap. It is true that (\ell^1)^* \cong \ell^\infty. But you may not conclude (\ell^\infty)^* \cong \ell^1. The dual of \ell^\infty is strictly bigger than \ell^1: it contains exotic functionals (the Banach limits, built with Hahn–Banach, or limits along a free ultrafilter) that are not given by dotting with any summable sequence. Such a functional vanishes on every sequence converging to 0 yet takes the value 1 on the constant sequence (1, 1, 1, \dots), whereas a summable y whose pairing kills every e_n must be y = 0. Chase the double dual and you see reflexivity fail: (\ell^1)^{**} \cong (\ell^\infty)^*, which is properly larger than \ell^1, so the canonical J for \ell^1 is not onto.

A second, subtler confusion sits underneath: the algebraic dual (all linear functionals) versus the continuous dual X^* (only the bounded ones). In finite dimensions they coincide, which lulls you into treating them as the same object. In infinite dimensions the algebraic dual is enormous — strictly larger than X^* — and none of the norm theory applies to it. In functional analysis "the dual" always means the continuous dual X^* = B(X, \mathbb{K}). Whenever a "dual space" statement smells wrong, check first that everyone is talking about bounded functionals.

The pattern, in one table

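Duality is a machine with a small number of moving parts. Everything above is a special case of "find all the bounded ways to read a vector". Row by row (space; dual; form of a functional; norm):

- \mathbb{R}^n with the Euclidean norm; dual \mathbb{R}^n; f(x) = \langle a, x\rangle; \|f\| = \|a\|_2.
- \ell^p with 1 < p < \infty; dual \ell^q with 1/p + 1/q = 1; f_y(x) = \sum_n x_n y_n; \|f_y\| = \|y\|_q.
- \ell^1; dual \ell^\infty; f_y(x) = \sum_n x_n y_n; \|f_y\| = \|y\|_\infty.
- Hilbert space H; dual H; f(x) = \langle x, y\rangle; \|f\| = \|y\|.
- \ell^\infty; dual strictly bigger than \ell^1; not every functional is a sum against a sequence.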

Keep the geometric picture underneath the algebra: a functional is a family of parallel hyperplanes, its norm is how tightly they pack, and the dual space is the space of all such families. The rest of functional analysis — Hahn–Banach, weak topologies, adjoints — is written in this language.