Bounded Linear Operators
Turn up the gain on a guitar amplifier and the sound gets louder — but only up to a point. Every
honest amplifier has a maximum gain: feed it a signal of size 1
and the output can never exceed some fixed size M, no matter what the
signal looks like. That single number — the largest factor by which the box can ever magnify its
input — is exactly what a functional analyst calls the operator norm, and a device
that has such a finite ceiling is called bounded. An amplifier with no
ceiling would screech into runaway feedback at the faintest whisper; in mathematics we call that an
unbounded operator, and, surprisingly, such operators are everywhere, from the derivative
to the observables of quantum mechanics.
A linear operator T : X \to Y between normed spaces is a linear map,
T(\alpha x + \beta y) = \alpha Tx + \beta Ty, and we care about one
question above all: by how much can T stretch a vector?
The ratio \|Tx\|_Y / \|x\|_X measures the amplification
T applies to x. If that ratio has a finite
supremum over all non-zero x, the operator is tame; if it can be pushed
arbitrarily large, the operator is wild. This one distinction, tame versus wild, bounded versus
unbounded, turns out to coincide exactly with the distinction between continuous and
discontinuous linear maps, and organising it is the whole business of this page.
The definition: a ceiling on amplification
Let X and Y be normed spaces over the same
field (\mathbb{R} or \mathbb{C}), with norms
\|\cdot\|_X and \|\cdot\|_Y. A linear operator
T : X \to Y is bounded if there is a constant
M \ge 0 such that
\|Tx\|_Y \;\le\; M\,\|x\|_X \qquad \text{for every } x \in X.
The inequality is uniform: one single M must work for all
x at once. There are many such M (if
M works, so does anything larger); the operator norm
of T is the smallest of them —
\|T\| \;=\; \sup_{x \ne 0} \frac{\|Tx\|_Y}{\|x\|_X} \;=\; \sup_{\|x\|_X = 1} \|Tx\|_Y \;=\; \sup_{\|x\|_X \le 1} \|Tx\|_Y.
The three suprema agree, and it is worth seeing why, because the argument is the little
engine that drives every operator-norm computation. By homogeneity,
\|T(x/\|x\|)\| = \|Tx\|/\|x\|, so scanning the ratio over all
x \ne 0 is the same as scanning \|Tx\| over the
unit sphere \|x\| = 1; and since any x with
0 < \|x\| \le 1 satisfies
\|Tx\| = \|x\|\,\|T(x/\|x\|)\| \le \|T(x/\|x\|)\|, enlarging
the search to the whole closed unit ball \|x\| \le 1
adds nothing. Geometrically, \|T\| is the radius of the smallest closed ball
centred at the origin of Y that contains the image of the unit ball of
X. Two consequences we will use constantly:
\|Tx\| \le \|T\|\,\|x\| \quad (\text{the defining bound, for every } x), \qquad \|T\| = 0 \iff T = 0.
The first is just the definition of \|T\| as a supremum, read backwards
— it is the workhorse estimate. The second says the operator norm really does behave like a norm: it
vanishes only for the zero operator. (We will see below that \|\cdot\| is
a genuine norm on the whole space of bounded operators.)
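The agreement of the suprema and the workhorse bound can be checked numerically in the simplest possible setting. The following is only a sketch under stated assumptions, not a proof: it takes a hypothetical 2×2 matrix T acting on (\mathbb{R}^2, \|\cdot\|_2), approximates the supremum over the unit circle by dense sampling, and then confirms that the ratio \|Tx\|/\|x\| on random non-zero vectors never beats that estimate. The matrix and the helper names (`apply`, `norm2`, `estimate_norm`) are invented for illustration.

```python
import math
import random

# Hypothetical example matrix; any 2x2 real matrix would do.
T = [[3.0, 1.0],
     [0.0, 2.0]]

def apply(T, x):
    """Return the vector T x for a 2x2 matrix T and a 2-vector x."""
    return [T[0][0] * x[0] + T[0][1] * x[1],
            T[1][0] * x[0] + T[1][1] * x[1]]

def norm2(x):
    """Euclidean norm of a 2-vector."""
    return math.hypot(x[0], x[1])

def estimate_norm(T, samples=100_000):
    """Approximate the sup of ||T x|| over the unit circle by dense sampling."""
    best = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        best = max(best, norm2(apply(T, [math.cos(theta), math.sin(theta)])))
    return best

sphere_sup = estimate_norm(T)

# By homogeneity, the ratio ||T x|| / ||x|| over arbitrary non-zero x
# explores the same values as ||T x|| over the unit circle.
random.seed(0)
ratio_sup = 0.0
for _ in range(10_000):
    x = [random.uniform(-5, 5), random.uniform(-5, 5)]
    if norm2(x) > 0:
        ratio_sup = max(ratio_sup, norm2(apply(T, x)) / norm2(x))

print(f"sup over the unit circle (sampled): {sphere_sup:.6f}")
print(f"sup of the ratio (random vectors):  {ratio_sup:.6f}")
```

The random-vector figure approaches the unit-circle figure from below, which is the defining bound \|Tx\| \le \|T\|\,\|x\| seen in numbers.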
The central theorem: bounded = continuous
Here is the pivot of the entire subject. For an arbitrary function between metric spaces,
continuity and any kind of "size bound" are unrelated. But a linear map is so rigid — its behaviour
near 0 is copied, by scaling, to every point — that the two notions
collapse into one. This is the single most important fact about linear operators.
Let T : X \to Y be a linear map between normed
spaces. The following are equivalent:
- (i) T is bounded:
\|Tx\| \le M\|x\| for some M and all
x.
- (ii) T is continuous at every
point of X.
- (iii) T is continuous at the single
point 0.
- (iv) T is Lipschitz (hence
uniformly continuous): \|Tx - Ty\| \le M\|x - y\|.
Proof. We show
\text{(i)} \Rightarrow \text{(iv)} \Rightarrow \text{(ii)} \Rightarrow \text{(iii)} \Rightarrow \text{(i)},
closing the loop.
(i) \Rightarrow (iv). Linearity turns the bound at a
point into a bound on differences. For any x, y,
\|Tx - Ty\| = \|T(x - y)\| \le M\,\|x - y\|.
So T is Lipschitz with constant M — the same
gain that bounds sizes also bounds separations.
(iv) \Rightarrow (ii). Lipschitz maps are continuous:
given \varepsilon > 0, take \delta = \varepsilon / M
(any \delta if M = 0); then
\|x - y\| < \delta forces
\|Tx - Ty\| \le M\|x-y\| < \varepsilon, at every point
y.
(ii) \Rightarrow (iii). Continuity everywhere includes
continuity at 0 — nothing to prove.
(iii) \Rightarrow (i). This is the surprising direction:
continuity at the single point 0 forces a global size bound.
Since T0 = 0, apply the definition of continuity at
0 with \varepsilon = 1: there is a
\delta > 0 such that
\|z\| < \delta \;\;\Longrightarrow\;\; \|Tz\| < 1.
Now take any x \ne 0 and rescale it to land just inside that
ball. Put z = \dfrac{\delta}{2\|x\|}\,x, so that
\|z\| = \delta/2 < \delta, and therefore
\|Tz\| < 1. But by linearity
\|Tz\| = \dfrac{\delta}{2\|x\|}\,\|Tx\|, and rearranging gives
\|Tx\| \;<\; \frac{2}{\delta}\,\|x\|.
This holds for every x \ne 0, and the non-strict inequality
\|Tx\| \le (2/\delta)\|x\| holds trivially for
x = 0 as well, so T is bounded with
M = 2/\delta (indeed \|T\| \le 2/\delta).
\blacksquare
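The rescaling step in (iii) \Rightarrow (i) can be traced on a concrete map. The sketch below assumes a hypothetical 2×2 matrix T and picks a \delta that is known to work for \varepsilon = 1 (via the crude Frobenius bound, an assumption made here only so that \delta can be written down without knowing \|T\|). It then rescales random vectors to length \delta/2 as in the proof and checks that the resulting constant M = 2/\delta really bounds the amplification.

```python
import math
import random

# Hypothetical 2x2 matrix standing in for a continuous linear map T.
T = [[3.0, 1.0],
     [0.0, 2.0]]

def apply(T, x):
    """Return T x for a 2x2 matrix T and a 2-vector x."""
    return [T[0][0] * x[0] + T[0][1] * x[1],
            T[1][0] * x[0] + T[1][1] * x[1]]

def norm2(x):
    """Euclidean norm of a 2-vector."""
    return math.hypot(x[0], x[1])

# Assumption: a delta that works for epsilon = 1. By Cauchy-Schwarz,
# ||T z|| <= frob * ||z||, so ||z|| < 1/frob forces ||T z|| < 1.
frob = math.sqrt(sum(entry * entry for row in T for entry in row))
delta = 1.0 / frob
M = 2.0 / delta  # the constant produced by the proof

random.seed(1)
worst_ratio = 0.0
for _ in range(10_000):
    x = [random.uniform(-10, 10), random.uniform(-10, 10)]
    if norm2(x) == 0.0:
        continue
    # Rescale x to z with ||z|| = delta / 2, exactly as in the proof.
    scale = delta / (2.0 * norm2(x))
    z = [scale * x[0], scale * x[1]]
    assert norm2(apply(T, z)) < 1.0  # continuity at 0 with epsilon = 1
    worst_ratio = max(worst_ratio, norm2(apply(T, x)) / norm2(x))

print(f"delta = {delta:.4f}, proof constant M = 2/delta = {M:.4f}")
print(f"largest sampled ratio ||Tx||/||x|| = {worst_ratio:.4f}")
```

The sampled ratio stays well below M: the proof delivers a valid bound, not the sharp one, which is all that boundedness requires.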
The moral is worth carrying: for linear maps, "bounded" is simply the name analysts give to
"continuous". The word "bounded" is used because the condition is a uniform bound on the
amplification — not, as the next box warns, because the range is a bounded set.
The name is a classic trap. "Bounded" here modifies the amplification, not the image.
The identity operator I : X \to X is the tamest map
imaginable, with \|I\| = 1 — yet its range is all of
X, which (in any non-trivial space) is an unbounded set. In
fact the only operator whose range is a bounded subset of Y is
the zero operator: if Tx_0 \ne 0 then
T(n x_0) = n\,Tx_0 marches off to infinity.
So what is bounded? The image of the unit ball. A bounded operator maps
bounded sets to bounded sets — squeeze your inputs into a ball of radius
R and the outputs stay inside a ball of radius
\|T\|\,R. That is the honest picture behind the word: bounded
input gives bounded output, with a fixed exchange rate. Do not read "bounded
operator" as "operator with bounded range": only the zero operator has that property.
A gallery of operators
Definitions come alive on examples. In each case the game is the same: find the smallest
M that works, and if none does, exhibit inputs whose amplification runs
away.
- The identity and scalings. \|I\| = 1; the scaling
x \mapsto \lambda x has norm |\lambda|. A
rotation of the plane is an isometry, so it too has operator norm 1 — it
moves vectors around without ever stretching them.
- Diagonal operators. On (\mathbb{R}^n, \|\cdot\|_2),
the diagonal matrix \operatorname{diag}(\lambda_1, \dots, \lambda_n)
scales the i-th axis by \lambda_i. The
largest stretch of a unit vector is achieved by pointing entirely along the axis with the largest |\lambda_i|, so
\|T\| = \max_i |\lambda_i|. The same formula, now a supremum,
gives the norm of a diagonal (multiplier) operator on the sequence space
\ell^2: \|T\| = \sup_i |\lambda_i|, which is
finite exactly when the multipliers are bounded.
- Matrices in general (the operator norm is the largest singular value). Any
matrix A : (\mathbb{R}^n, \|\cdot\|_2) \to (\mathbb{R}^m, \|\cdot\|_2)
is bounded (finite dimensions leave no room for runaway). Its operator norm is
\|A\| = \sigma_{\max}(A), the largest singular value —
equivalently \sqrt{\lambda_{\max}(A^{\mathsf T}A)}. The unit sphere maps
to an ellipsoid, and \|A\| is the length of its longest semi-axis. This
is exactly the picture the interactive figure below lets you play with.
- The shift operator on \ell^2. The right shift
S(x_1, x_2, x_3, \dots) = (0, x_1, x_2, \dots) just relabels the
coordinates, so \|Sx\|_2 = \|x\|_2 for every
x: it is an isometry, hence \|S\| = 1.
(It is injective but not surjective — a purely infinite-dimensional phenomenon; there is
no room for such a thing among square matrices.)
- Integral (Fredholm) operators. Given a continuous kernel
K(x,t) on [a,b]^2, define on
(C[a,b], \|\cdot\|_\infty)
(Tf)(x) = \int_a^b K(x,t)\,f(t)\,dt.
Then |(Tf)(x)| \le \big(\int_a^b |K(x,t)|\,dt\big)\|f\|_\infty, and
taking the sup over x shows
\|T\| \le \max_x \int_a^b |K(x,t)|\,dt — a finite bound, so
T is bounded. Such smoothing operators are the workhorses of
differential equations, imaging, and signal processing (blurring a photo convolves it against a
kernel).
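The integral-operator estimate is easy to try out on a discretised example. The sketch below assumes a hypothetical kernel K(x,t) = x - t on [0,1]^2 and approximates the integrals with the midpoint rule; the grid sizes and helper names are illustrative choices, not part of the theory. It computes the bound \max_x \int_0^1 |K(x,t)|\,dt and compares it with \|Tf\|_\infty for the test function f \equiv 1, which has sup norm 1.

```python
def kernel(x, t):
    """Hypothetical continuous kernel K(x, t) = x - t on [0, 1]^2."""
    return x - t

def midpoint_nodes(a, b, n):
    """Midpoints of n equal subintervals of [a, b], plus the step width."""
    h = (b - a) / n
    return [a + (j + 0.5) * h for j in range(n)], h

def apply_T(f, x, nodes, h):
    """Approximate (Tf)(x) = integral of K(x, t) f(t) dt by the midpoint rule."""
    return sum(kernel(x, t) * f(t) for t in nodes) * h

def row_bound(xs, nodes, h):
    """Approximate the max over x of the integral of |K(x, t)| dt."""
    return max(sum(abs(kernel(x, t)) for t in nodes) * h for x in xs)

a, b = 0.0, 1.0
nodes, h = midpoint_nodes(a, b, 1000)
xs = [a + (b - a) * i / 200 for i in range(201)]

bound = row_bound(xs, nodes, h)

# Test function with sup norm 1; for this particular kernel it attains the bound,
# because K(1, t) = 1 - t is non-negative on [0, 1].
f = lambda t: 1.0
sup_Tf = max(abs(apply_T(f, x, nodes, h)) for x in xs)

print(f"bound on ||T||:        {bound:.6f}")
print(f"||Tf||_inf for f = 1:  {sup_Tf:.6f}")
```

For this kernel the two numbers coincide, so the upper bound is attained by f \equiv 1; for a kernel that changes sign along the worst row, a constant test function would fall short of the bound.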
The star counterexample: differentiation is unbounded
Every operator above was bounded, which might tempt you to think boundedness is automatic. It is
not — and the most important operator in all of analysis, the derivative, is the
canonical villain. Work on C^1[0, 2\pi] \subset (C[0, 2\pi], \|\cdot\|_\infty)
with the differentiation operator D f = f', and test it on the family
f_n(x) = \sin(n x), \qquad n = 1, 2, 3, \dots
Each f_n is small in the sup norm — its graph oscillates between
-1 and 1, so
\|f_n\|_\infty = 1 for all n. But its
derivative f_n'(x) = n\cos(nx) oscillates between
-n and n, so
\|D f_n\|_\infty = n \quad\text{while}\quad \|f_n\|_\infty = 1, \qquad\text{hence}\qquad \frac{\|D f_n\|_\infty}{\|f_n\|_\infty} = n \longrightarrow \infty.
No single M can bound the ratio, so no
M exists and D is unbounded. The
intuition is physical: a wiggle that is tiny in amplitude but very rapid has a huge slope.
Differentiation is exquisitely sensitive to high-frequency detail, and there is no ceiling on how
much frequency you can pack into a function of height 1. By the central
theorem, "unbounded" is the same as "discontinuous": the rescaled functions
g_n = f_n / n satisfy \|g_n\|_\infty = 1/n \to 0, so
g_n \to 0 uniformly, while g_n'(x) = \cos(nx) keeps sup norm
1 and does not converge uniformly to anything, let alone to D0 = 0.
Differentiation does not respect uniform limits. (This is precisely
why analysis is so fussy about differentiating a series term by term, yet relaxed about integrating
one: integration, the inverse operation, is a bounded, smoothing operator.)
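The runaway ratio can be tabulated directly. The sketch below approximates sup norms on [0, 2\pi] by sampling a uniform grid (an assumption: the grid is fine enough for the frequencies used) and uses the analytic derivative n\cos(nx) rather than a finite difference.

```python
import math

def sup_norm(g, a, b, samples=20_001):
    """Approximate the sup norm of g on [a, b] by sampling a uniform grid."""
    return max(abs(g(a + (b - a) * k / (samples - 1))) for k in range(samples))

a, b = 0.0, 2.0 * math.pi

ratios = []
for n in (1, 2, 5, 10, 50):
    f = lambda x, n=n: math.sin(n * x)        # f_n
    df = lambda x, n=n: n * math.cos(n * x)   # D f_n, computed analytically
    ratio = sup_norm(df, a, b) / sup_norm(f, a, b)
    ratios.append((n, ratio))
    print(f"n = {n:3d}   ||D f_n|| / ||f_n|| ~ {ratio:.4f}")
```

Each printed ratio sits close to n itself, so no single constant M can dominate the whole column.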
Bounded operators are the well-behaved ones, but nature seems to prefer the wild kind. In quantum
mechanics the position operator (\hat{x}\psi)(x) = x\,\psi(x) and the
momentum operator \hat{p} = -i\hbar\,\frac{d}{dx} are both
unbounded — \hat{p} because it is essentially
differentiation, \hat{x} because multiplying by
x can amplify a normalised wavefunction without limit. So are almost all
the Hamiltonians (energy operators) that govern real systems.
This is not a technical nuisance you can wish away. The Hellinger–Toeplitz theorem
says that a symmetric operator defined on all of a Hilbert space is automatically bounded.
Contrapositive: an unbounded observable like energy or momentum simply cannot be defined
on the whole space — it lives only on a dense domain of sufficiently nice states. The
entire delicate theory of self-adjoint operators, spectra, and domains that underpins quantum
mechanics exists precisely because the operators that matter are unbounded. Boundedness is the
exception, not the rule.
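The claim about the position operator can be made concrete with a family of wave packets. The sketch below assumes normalised Gaussians \psi_a(x) = \pi^{-1/4} e^{-(x-a)^2/2} centred at a (a choice made here for illustration) and approximates L^2 norms with the midpoint rule on a window around the centre. Every \psi_a has norm 1, yet \|\hat{x}\psi_a\| = \sqrt{a^2 + 1/2} grows without limit as the packet slides outwards.

```python
import math

def l2_norm(g, lo, hi, n=20_000):
    """Approximate the L^2 norm of a real function g on [lo, hi] (midpoint rule)."""
    h = (hi - lo) / n
    return math.sqrt(sum(g(lo + (j + 0.5) * h) ** 2 for j in range(n)) * h)

def gaussian(a):
    """Normalised Gaussian wave packet centred at a (unit L^2 norm)."""
    return lambda x: math.pi ** -0.25 * math.exp(-0.5 * (x - a) ** 2)

results = []
for a in (0.0, 10.0, 100.0, 1000.0):
    psi = gaussian(a)
    x_psi = lambda x, psi=psi: x * psi(x)   # the position operator applied to psi
    lo, hi = a - 12.0, a + 12.0             # the packet is negligible outside this window
    results.append((a, l2_norm(psi, lo, hi), l2_norm(x_psi, lo, hi)))

for a, norm_psi, norm_x_psi in results:
    print(f"a = {a:7.1f}   ||psi|| ~ {norm_psi:.6f}   ||x psi|| ~ {norm_x_psi:.4f}")
```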
See it: the unit circle becomes an ellipse
For a 2\times 2 matrix
T = \begin{pmatrix} a & b \\ c & d \end{pmatrix} acting on
(\mathbb{R}^2, \|\cdot\|_2), the operator norm has a beautifully visual
meaning. The unit circle \|x\| = 1 (dashed) is carried by
T to an ellipse (solid). The operator norm
\|T\| = \sigma_{\max} is the length of the ellipse's longest
semi-axis — the direction in which T stretches the hardest — while the
shortest semi-axis is the smallest singular value \sigma_{\min}. As the
four matrix entries vary, \|T\| tracks the maximum stretch. Notice the
special cases: a pure rotation keeps the circle a circle
(\sigma_{\max} = \sigma_{\min} = 1); make the matrix singular
(ad = bc) and the ellipse collapses to a line segment, with
\sigma_{\min} = 0.
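For a 2\times 2 matrix the two semi-axes can be computed in closed form, which makes the special cases easy to verify by hand or by machine. The sketch below uses the fact that the eigenvalues of T^{\mathsf T}T have sum a^2 + b^2 + c^2 + d^2 and product (ad - bc)^2; the function name and the three sample matrices are illustrative choices.

```python
import math

def singular_values_2x2(a, b, c, d):
    """Return (sigma_max, sigma_min) of the matrix [[a, b], [c, d]].

    The eigenvalues of T^T T have sum s = a^2 + b^2 + c^2 + d^2 and
    product det(T)^2, so they solve lambda^2 - s*lambda + det^2 = 0.
    """
    s = a * a + b * b + c * c + d * d
    det = a * d - b * c
    disc = math.sqrt(max(s * s - 4.0 * det * det, 0.0))  # clamp rounding noise
    sigma_max = math.sqrt((s + disc) / 2.0)
    sigma_min = math.sqrt(max((s - disc) / 2.0, 0.0))
    return sigma_max, sigma_min

theta = 0.7  # arbitrary rotation angle
rotation = (math.cos(theta), -math.sin(theta), math.sin(theta), math.cos(theta))
singular = (1.0, 2.0, 2.0, 4.0)   # ad = bc, so the ellipse degenerates
stretch = (2.0, 0.0, 0.0, 0.5)    # axis-aligned ellipse with semi-axes 2 and 0.5

for name, entries in [("rotation", rotation), ("singular", singular), ("stretch", stretch)]:
    s_max, s_min = singular_values_2x2(*entries)
    print(f"{name:9s} sigma_max = {s_max:.4f}   sigma_min = {s_min:.4f}")
```

The rotation gives two equal semi-axes, the singular matrix gives a vanishing short axis, and the diagonal stretch simply reads off its entries.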
The space B(X, Y) of bounded operators
Collect all bounded linear operators from X to
Y into one set, written B(X, Y) (also
\mathcal{L}(X, Y)). Adding operators and scaling them pointwise —
(S + T)x = Sx + Tx, (\alpha T)x = \alpha(Tx) —
keeps you inside the set, so B(X, Y) is a vector space. Remarkably, the
operator norm makes it a normed space in its own right: a space whose points are
operators.
- \|\cdot\| is a norm on B(X, Y):
it is non-negative and vanishes only for T = 0; it is homogeneous,
\|\alpha T\| = |\alpha|\,\|T\|; and it satisfies the triangle
inequality \|S + T\| \le \|S\| + \|T\|.
- It is submultiplicative under composition:
\|ST\| \le \|S\|\,\|T\| whenever the composite makes sense.
- If Y is complete (a Banach space), then so is
B(X, Y) — regardless of whether X is
complete. In particular the dual space
X^{*} = B(X, \mathbb{F}) of bounded linear functionals is
always a Banach space.
The triangle inequality is a one-liner from the defining bound: for a unit vector
x,
\|(S+T)x\| \le \|Sx\| + \|Tx\| \le \|S\| + \|T\|, and taking the sup over
\|x\| = 1 gives \|S + T\| \le \|S\| + \|T\|.
Submultiplicativity is the same trick chained:
\|STx\| \le \|S\|\,\|Tx\| \le \|S\|\,\|T\|\,\|x\|.
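Both inequalities can be stress-tested on random matrices. The sketch below is a sanity check under stated assumptions, not a proof: it works with 2×2 real matrices under the Euclidean operator norm (computed from the closed form for the largest singular value), draws random pairs, and counts any violation of the triangle inequality or of submultiplicativity beyond a small floating-point tolerance. The helper names are invented for illustration.

```python
import math
import random

def op_norm(A):
    """Operator 2-norm (largest singular value) of a 2x2 matrix A."""
    (a, b), (c, d) = A
    s = a * a + b * b + c * c + d * d
    det = a * d - b * c
    return math.sqrt((s + math.sqrt(max(s * s - 4.0 * det * det, 0.0))) / 2.0)

def add(A, B):
    """Entrywise sum of two 2x2 matrices."""
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def matmul(A, B):
    """Product of two 2x2 matrices, i.e. the composite operator A after B."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def random_matrix(rng):
    """A 2x2 matrix with entries drawn uniformly from [-3, 3]."""
    return [[rng.uniform(-3, 3) for _ in range(2)] for _ in range(2)]

rng = random.Random(0)
tol = 1e-9  # slack for floating-point rounding
violations = 0
for _ in range(10_000):
    S, T = random_matrix(rng), random_matrix(rng)
    if op_norm(add(S, T)) > op_norm(S) + op_norm(T) + tol:
        violations += 1
    if op_norm(matmul(S, T)) > op_norm(S) * op_norm(T) + tol:
        violations += 1

print("violations found:", violations)
```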
Why completeness of Y transfers. Suppose
(T_n) is Cauchy in B(X, Y). For each fixed
x, the defining bound
\|T_n x - T_m x\| \le \|T_n - T_m\|\,\|x\| shows
(T_n x) is Cauchy in Y; because
Y is complete, it converges to some limit we define to be
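Tx. Linearity of T passes to the limit. For
boundedness, fix \varepsilon > 0 and choose N with
\|T_n - T_m\| \le \varepsilon for all n, m \ge N; then
\|T_n x - T_m x\| \le \varepsilon\|x\|, and letting m \to \infty
gives \|T_n x - Tx\| \le \varepsilon\|x\| for every x and every
n \ge N. So T_n - T is bounded with norm at most \varepsilon,
hence T = T_n - (T_n - T) is bounded and \|T_n - T\| \to 0. The limit lives back inside
B(X, Y): the space is complete. This is the reason the dual space, and
the whole edifice of Banach-space duality, gets off the ground: whatever X
is, its space of continuous linear functionals is always complete, because the scalar field is.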
When X = Y, you can not only add and scale operators but also
compose them, and composition acts as a multiplication. So
B(X) = B(X, X) is a vector space with a product — an
algebra — and submultiplicativity
\|ST\| \le \|S\|\,\|T\| plus completeness (when
X is Banach) makes it a Banach algebra. This is the
gateway to C^{*}-algebras and the operator-algebraic formulation of
quantum theory. Note the product is generally non-commutative:
ST \ne TS in general, exactly as matrices fail to commute, and in
quantum mechanics that non-commutativity is what underlies the uncertainty principle.
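The failure of commutativity already shows up for 2×2 matrices. In the sketch below, S and T are two hypothetical matrices chosen for illustration (they are not the shift operator from the gallery); the code forms both products and their difference.

```python
def matmul(A, B):
    """Product of two 2x2 matrices, i.e. the composite operator A after B."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Hypothetical pair of operators on R^2.
S = [[0, 1],
     [0, 0]]
T = [[0, 0],
     [1, 0]]

ST = matmul(S, T)
TS = matmul(T, S)
commutator = [[ST[i][j] - TS[i][j] for j in range(2)] for i in range(2)]

print("ST         =", ST)
print("TS         =", TS)
print("ST - TS    =", commutator)
```

The two products project onto different coordinate axes, so their difference is a non-zero operator.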