Bounded Linear Operators

Turn up the gain on a guitar amplifier and the sound gets louder, but only up to a point. Every honest amplifier has a maximum gain: feed it a signal of size 1 and the output can never exceed some fixed size M, no matter what the signal looks like. That single number, the largest factor by which the box can ever magnify its input, is exactly what a functional analyst calls the operator norm, and a device that has such a finite ceiling is called bounded. An amplifier with no ceiling would screech into runaway feedback at the faintest whisper; in mathematics we call that an unbounded operator, and, surprisingly, such operators are everywhere, from the derivative to the observables of quantum mechanics.

A linear operator T : X \to Y between normed spaces is a linear map — T(\alpha x + \beta y) = \alpha Tx + \beta Ty — and we care about one question above all: by how much can T stretch a vector? The ratio \|Tx\|_Y / \|x\|_X measures the amplification T applies to x. If that ratio has a finite supremum over all non-zero x, the operator is tame; if it can be pushed arbitrarily large, the operator is wild. This one distinction — tame versus wild, bounded versus unbounded — turns out to be the right generalisation of continuity to infinite dimensions, and organising it is the whole business of this page.

The definition: a ceiling on amplification

Let X and Y be normed spaces over the same field (\mathbb{R} or \mathbb{C}), with norms \|\cdot\|_X and \|\cdot\|_Y. A linear operator T : X \to Y is bounded if there is a constant M \ge 0 such that

\|Tx\|_Y \;\le\; M\,\|x\|_X \qquad \text{for every } x \in X.

The inequality is uniform: one single M must work for all x at once. There are many such M (if M works, so does anything larger); the operator norm of T is the smallest of them —

\|T\| \;=\; \sup_{x \ne 0} \frac{\|Tx\|_Y}{\|x\|_X} \;=\; \sup_{\|x\|_X = 1} \|Tx\|_Y \;=\; \sup_{\|x\|_X \le 1} \|Tx\|_Y.

The three suprema agree (for X \ne \{0\}), and it is worth seeing why, because the argument is the little engine that drives every operator-norm computation. By homogeneity, \|T(x/\|x\|)\| = \|Tx\|/\|x\|, so scanning the ratio over all x \ne 0 is the same as scanning \|Tx\| over the unit sphere \|x\| = 1. And for 0 < \|x\| \le 1 we have \|Tx\| = \|x\|\,\|T(x/\|x\|)\| \le \|T(x/\|x\|)\|, so every value attained inside the closed unit ball is already dominated by a value on the sphere: enlarging the search to \|x\| \le 1 adds nothing. Geometrically, \|T\| is the radius of the smallest closed ball centred at the origin of Y that contains the image of the unit ball of X. Two consequences we will use constantly:

\|Tx\| \le \|T\|\,\|x\| \quad (\text{the defining bound, for every } x), \qquad \|T\| = 0 \iff T = 0.

The first is just the definition of \|T\| as a supremum, read backwards — it is the workhorse estimate. The second says the operator norm really does behave like a norm: it vanishes only for the zero operator. (We will see below that \|\cdot\| is a genuine norm on the whole space of bounded operators.)
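As a numerical sanity check on the agreement of the suprema, here is a minimal Python sketch for one 2×2 matrix under the Euclidean norm. The matrix T, the helper names and the sample counts are illustrative assumptions, and sampling can only approximate a supremum from below.

```python
import math
import random

def apply(T, x):
    """Apply a 2x2 matrix T (nested tuples) to a vector x in R^2."""
    return (T[0][0] * x[0] + T[0][1] * x[1],
            T[1][0] * x[0] + T[1][1] * x[1])

def norm2(x):
    """Euclidean norm on R^2."""
    return math.hypot(x[0], x[1])

def sup_on_sphere(T, samples=20000):
    """Approximate sup of ||Tx|| over the unit circle by sampling angles."""
    angles = (2 * math.pi * k / samples for k in range(samples))
    return max(norm2(apply(T, (math.cos(t), math.sin(t)))) for t in angles)

def sup_of_ratio(T, trials=20000, seed=0):
    """Approximate sup of ||Tx|| / ||x|| over random non-zero vectors."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(trials):
        x = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        if norm2(x) > 0:
            best = max(best, norm2(apply(T, x)) / norm2(x))
    return best

T = ((2.0, 1.0), (0.0, 1.0))        # hypothetical example matrix
sphere_estimate = sup_on_sphere(T)  # sup over ||x|| = 1
ratio_estimate = sup_of_ratio(T)    # sup of the ratio over x != 0
```

For this particular T the exact value is \sqrt{3 + \sqrt{5}}, the square root of the largest eigenvalue of T^{\mathsf T}T, and both estimates should land close to it.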

The central theorem: bounded = continuous

Here is the pivot of the entire subject. For an arbitrary function between metric spaces, continuity and any kind of "size bound" are unrelated. But a linear map is so rigid — its behaviour near 0 is copied, by scaling, to every point — that the two notions collapse into one. This is the single most important fact about linear operators.

Let T : X \to Y be a linear map between normed spaces. The following are equivalent:

(i) T is bounded: \|Tx\| \le M\|x\| for some M and all x.
(ii) T is continuous at every point of X.
(iii) T is continuous at the single point 0.
(iv) T is Lipschitz (hence uniformly continuous): \|Tx - Ty\| \le M\|x - y\|.

Proof. We show \text{(i)} \Rightarrow \text{(iv)} \Rightarrow \text{(ii)} \Rightarrow \text{(iii)} \Rightarrow \text{(i)}, closing the loop.

(i) \Rightarrow (iv). Linearity turns the bound at a point into a bound on differences. For any x, y,

\|Tx - Ty\| = \|T(x - y)\| \le M\,\|x - y\|.

So T is Lipschitz with constant M — the same gain that bounds sizes also bounds separations.

(iv) \Rightarrow (ii). Lipschitz maps are continuous: given \varepsilon > 0, take \delta = \varepsilon / M (any \delta if M = 0); then \|x - y\| < \delta forces \|Tx - Ty\| \le M\|x-y\| < \varepsilon, at every point y.

(ii) \Rightarrow (iii). Continuity everywhere includes continuity at 0 — nothing to prove.

(iii) \Rightarrow (i). This is the surprising direction: continuity at the single point 0 forces a global size bound. Since T0 = 0, apply the definition of continuity at 0 with \varepsilon = 1: there is a \delta > 0 such that

\|z\| < \delta \;\;\Longrightarrow\;\; \|Tz\| < 1.

Now take any x \ne 0 and rescale it to land just inside that ball. Put z = \dfrac{\delta}{2\|x\|}\,x, so that \|z\| = \delta/2 < \delta, and therefore \|Tz\| < 1. But by linearity \|Tz\| = \dfrac{\delta}{2\|x\|}\,\|Tx\|, and rearranging gives

\|Tx\| \;<\; \frac{2}{\delta}\,\|x\|.

This holds for every x \ne 0, and trivially for x = 0, so T is bounded with M = 2/\delta (indeed \|T\| \le 2/\delta). \blacksquare
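The rescaling step in (iii) \Rightarrow (i) can be traced numerically. In the sketch below the linear map, the value of \delta and the helper names are assumptions chosen for illustration: \delta is picked by hand so that \|z\| < \delta forces \|Tz\| < 1 for this particular map, and the code then checks that M = 2/\delta bounds \|Tx\| on random vectors.

```python
import math
import random

def T(x):
    """Hypothetical linear map on R^2: stretch by 3 along e1, by 0.5 along e2."""
    return (3.0 * x[0], 0.5 * x[1])

def norm(x):
    """Euclidean norm on R^2."""
    return math.hypot(x[0], x[1])

delta = 1.0 / 3.0   # assumed: ||z|| < delta implies ||Tz|| < 1 for this T
M = 2.0 / delta     # the constant produced by the proof

def rescale(x):
    """Shrink a non-zero x to z = (delta / (2||x||)) x, so that ||z|| = delta / 2."""
    s = delta / (2.0 * norm(x))
    return (s * x[0], s * x[1])

rng = random.Random(1)
checks = []
for _ in range(1000):
    x = (rng.uniform(-100, 100), rng.uniform(-100, 100))
    if norm(x) == 0:
        continue
    z = rescale(x)
    # z lies inside the delta-ball, so ||Tz|| < 1; undoing the scaling
    # gives ||Tx|| < (2 / delta) ||x||.
    checks.append(norm(T(z)) < 1.0 and norm(T(x)) < M * norm(x))

all_hold = all(checks)
```

Here M = 6 while the true norm of this map is 3, which illustrates why the proof only delivers the estimate \|T\| \le 2/\delta rather than the norm itself.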

The moral is worth carrying: for linear maps, "bounded" is simply the name analysts give to "continuous". The word "bounded" is used because the condition is a uniform bound on the amplification, not (as the next two paragraphs warn) because the range is a bounded set.

The name is a classic trap. "Bounded" here modifies the amplification, not the image. The identity operator I : X \to X is the tamest map imaginable, with \|I\| = 1, yet its range is all of X, which (in any non-trivial space) is an unbounded set. In fact the only linear operator whose range is a bounded subset of Y is the zero operator: if Tx_0 \ne 0 then \|T(n x_0)\| = n\,\|Tx_0\| \to \infty as n \to \infty.

So what is bounded? The image of the unit ball. A bounded operator maps bounded sets to bounded sets: squeeze your inputs into a ball of radius R and the outputs stay inside a ball of radius \|T\|\,R. That is the honest picture behind the word: bounded input gives bounded output, with a fixed exchange rate. Do not read "bounded operator" as "operator with bounded range"; only the zero operator has that property.

A gallery of operators

Definitions come alive on examples. In each case the game is the same: find the smallest M that works, and if none does, exhibit inputs whose amplification runs away.

The identity and scalings. \|I\| = 1; the scaling x \mapsto \lambda x has norm |\lambda|. A rotation of the plane is an isometry, so it too has operator norm 1 — it moves vectors around without ever stretching them.
Diagonal operators. On (\mathbb{R}^n, \|\cdot\|_2), the diagonal matrix \operatorname{diag}(\lambda_1, \dots, \lambda_n) scales the i-th axis by \lambda_i. The largest stretch of a unit vector is achieved by pointing it entirely along the axis with the largest |\lambda_i|, so \|T\| = \max_i |\lambda_i|. The same formula, now a supremum, gives the norm of a diagonal (multiplier) operator on the sequence space \ell^2: \|T\| = \sup_i |\lambda_i|, which is finite exactly when the multipliers are bounded.
Matrices in general (the operator norm is the largest singular value). Any matrix A : (\mathbb{R}^n, \|\cdot\|_2) \to (\mathbb{R}^m, \|\cdot\|_2) is bounded (finite dimensions leave no room for runaway). Its operator norm is \|A\| = \sigma_{\max}(A), the largest singular value — equivalently \sqrt{\lambda_{\max}(A^{\mathsf T}A)}. The unit sphere maps to an ellipsoid, and \|A\| is the length of its longest semi-axis. This is exactly the picture the interactive figure below lets you play with.
The shift operator on \ell^2. The right shift S(x_1, x_2, x_3, \dots) = (0, x_1, x_2, \dots) just relabels the coordinates, so \|Sx\|_2 = \|x\|_2 for every x: it is an isometry, hence \|S\| = 1. (It is injective but not surjective — a purely infinite-dimensional phenomenon; there is no room for such a thing among square matrices.)
Integral (Fredholm) operators. Given a continuous kernel K(x,t) on [a,b]^2, define on (C[a,b], \|\cdot\|_\infty) (Tf)(x) = \int_a^b K(x,t)\,f(t)\,dt. Then |(Tf)(x)| \le \big(\int_a^b |K(x,t)|\,dt\big)\|f\|_\infty, and taking the sup over x shows \|T\| \le \max_x \int_a^b |K(x,t)|\,dt — a finite bound, so T is bounded. Such smoothing operators are the workhorses of differential equations, imaging, and signal processing (blurring a photo convolves it against a kernel).
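Three of the gallery entries can be checked with a few lines of Python. This is a rough sketch under stated assumptions: sequences are truncated to finite lists, the integral is replaced by a midpoint rule, and the kernel K(s, t) = s t on [0, 1]^2, the sample sequence and all function names are hypothetical choices made for illustration.

```python
import math

def diagonal_norm(multipliers):
    """Operator norm of diag(lambda_1, ..., lambda_n) on (R^n, ||.||_2)."""
    return max(abs(lam) for lam in multipliers)

def right_shift(x):
    """Right shift on a finitely supported sequence: (x1, x2, ...) -> (0, x1, x2, ...)."""
    return [0.0] + list(x)

def l2_norm(x):
    """The l^2 norm of a finite list of coordinates."""
    return math.sqrt(sum(v * v for v in x))

def kernel_bound(K, a, b, n=400):
    """Midpoint-rule estimate of max over x of the integral of |K(x, t)| dt on [a, b]."""
    h = (b - a) / n
    pts = [a + (i + 0.5) * h for i in range(n)]
    return max(sum(abs(K(x, t)) for t in pts) * h for x in pts)

# Hypothetical inputs chosen for illustration.
seq = [3.0, -4.0, 12.0]
shift_is_isometry = math.isclose(l2_norm(right_shift(seq)), l2_norm(seq))
bound = kernel_bound(lambda s, t: s * t, 0.0, 1.0)   # K(s, t) = s t on [0, 1]^2
```

For the sample kernel the exact value of \max_x \int_0^1 |K(x,t)|\,dt is 1/2, attained at x = 1; the grid estimate comes out slightly smaller because the midpoint grid never samples x = 1 itself.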

The star counterexample: differentiation is unbounded

Every operator above was bounded, which might tempt you to think boundedness is automatic. It is not — and the most important operator in all of analysis, the derivative, is the canonical villain. Work on C^1[0, 2\pi] \subset (C[0, 2\pi], \|\cdot\|_\infty) with the differentiation operator D f = f', and test it on the family

f_n(x) = \sin(n x), \qquad n = 1, 2, 3, \dots

Each f_n is small in the sup norm — its graph oscillates between -1 and 1, so \|f_n\|_\infty = 1 for all n. But its derivative f_n'(x) = n\cos(nx) oscillates between -n and n, so

\|D f_n\|_\infty = n \quad\text{while}\quad \|f_n\|_\infty = 1, \qquad\text{hence}\qquad \frac{\|D f_n\|_\infty}{\|f_n\|_\infty} = n \longrightarrow \infty.

No single M can bound the ratio, so D is unbounded. The intuition is physical: a wiggle that is tiny in amplitude but very rapid has a huge slope. Differentiation is exquisitely sensitive to high-frequency detail, and there is no ceiling on how much frequency you can pack into a function of height 1. By the central theorem, "unbounded" is the same as "discontinuous". Concretely, the rescaled functions g_n(x) = \sin(nx)/\sqrt{n} satisfy \|g_n\|_\infty = 1/\sqrt{n} \to 0, so g_n \to 0 uniformly, while \|g_n'\|_\infty = \sqrt{n} \to \infty, so the derivatives do not converge at all: differentiation does not respect uniform limits. (This is precisely why analysis is so fussy about differentiating a series term by term, yet relaxed about integrating one: integration, the inverse operation, is a bounded, smoothing operator.)
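The runaway ratio is easy to observe numerically. The following Python sketch approximates the sup norms of \sin(nx) and of its derivative n\cos(nx) on a uniform grid over [0, 2\pi]; the grid size and function names are assumptions, and a grid maximum only approximates a true supremum.

```python
import math

def sup_norm(f, a=0.0, b=2 * math.pi, samples=20001):
    """Approximate the sup norm of f on [a, b] by sampling a uniform grid."""
    step = (b - a) / (samples - 1)
    return max(abs(f(a + k * step)) for k in range(samples))

def amplification(n):
    """Ratio ||D f_n|| / ||f_n|| for f_n(x) = sin(n x), using f_n'(x) = n cos(n x)."""
    f = lambda x: math.sin(n * x)
    df = lambda x: n * math.cos(n * x)
    return sup_norm(df) / sup_norm(f)

# The ratio should track n, so no single constant M can dominate all of them.
ratios = {n: amplification(n) for n in (1, 2, 4, 8, 16)}
```

Each entry of ratios should be close to its key n, matching the computation \|D f_n\|_\infty / \|f_n\|_\infty = n.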

Bounded operators are the well-behaved ones, but nature seems to prefer the wild kind. In quantum mechanics the position operator (\hat{x}\psi)(x) = x\,\psi(x) and the momentum operator \hat{p} = -i\hbar\,\frac{d}{dx} are both unbounded — \hat{p} because it is essentially differentiation, \hat{x} because multiplying by x can amplify a normalised wavefunction without limit. So are almost all the Hamiltonians (energy operators) that govern real systems.

This is not a technical nuisance you can wish away. The Hellinger–Toeplitz theorem says that a symmetric operator defined on all of a Hilbert space is automatically bounded. Contrapositive: an unbounded observable like energy or momentum simply cannot be defined on the whole space — it lives only on a dense domain of sufficiently nice states. The entire delicate theory of self-adjoint operators, spectra, and domains that underpins quantum mechanics exists precisely because the operators that matter are unbounded. Boundedness is the exception, not the rule.

See it: the unit circle becomes an ellipse

For a 2\times 2 matrix T = \begin{pmatrix} a & b \\ c & d \end{pmatrix} acting on (\mathbb{R}^2, \|\cdot\|_2), the operator norm has a beautifully visual meaning. The unit circle \|x\| = 1 (dashed) is carried by T to an ellipse (solid). The operator norm \|T\| = \sigma_{\max} is the length of the ellipse's longest semi-axis — the direction in which T stretches the hardest — while the shortest semi-axis is the smallest singular value \sigma_{\min}. Drag the four matrix entries and watch \|T\| track the maximum stretch. Notice the special cases: a pure rotation keeps the circle a circle (\sigma_{\max} = \sigma_{\min} = 1); make the matrix singular (ad = bc) and the ellipse collapses to a line segment, with \sigma_{\min} = 0.
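For readers without the figure to hand, the two semi-axes can be computed directly. The sketch below uses the standard closed form for the eigenvalues of T^{\mathsf T}T in the 2×2 case, whose trace is a^2 + b^2 + c^2 + d^2 and whose determinant is (ad - bc)^2; the function name and the sample matrices are assumptions for illustration.

```python
import math

def singular_values_2x2(a, b, c, d):
    """Return (sigma_max, sigma_min) of [[a, b], [c, d]] for the Euclidean norm.

    The squared singular values are the eigenvalues of T^T T, i.e. the roots
    of lambda^2 - s*lambda + det^2 with s = a^2 + b^2 + c^2 + d^2, det = ad - bc.
    """
    s = a * a + b * b + c * c + d * d
    det = a * d - b * c
    disc = math.sqrt(max(s * s - 4.0 * det * det, 0.0))  # clamp rounding noise
    sigma_max = math.sqrt((s + disc) / 2.0)
    sigma_min = math.sqrt(max((s - disc) / 2.0, 0.0))
    return sigma_max, sigma_min

theta = 0.7  # arbitrary rotation angle, chosen for illustration
rotation = singular_values_2x2(math.cos(theta), -math.sin(theta),
                               math.sin(theta), math.cos(theta))
singular = singular_values_2x2(1.0, 2.0, 2.0, 4.0)   # ad = bc, gives (5.0, 0.0)
```

The rotation returns two values equal to 1 up to rounding (the circle stays a circle), and the singular matrix returns \sigma_{\min} = 0 (the ellipse collapses to a segment).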

The space B(X, Y) of bounded operators

Collect all bounded linear operators from X to Y into one set, written B(X, Y) (also \mathcal{L}(X, Y)). Adding operators and scaling them pointwise — (S + T)x = Sx + Tx, (\alpha T)x = \alpha(Tx) — keeps you inside the set, so B(X, Y) is a vector space. Remarkably, the operator norm makes it a normed space in its own right: a space whose points are operators.

\|\cdot\| is a norm on B(X, Y): it is non-negative and vanishes only for T = 0; it is homogeneous, \|\alpha T\| = |\alpha|\,\|T\|; and it satisfies the triangle inequality \|S + T\| \le \|S\| + \|T\|.
It is submultiplicative under composition: \|ST\| \le \|S\|\,\|T\| whenever the composite makes sense.
If Y is complete (a Banach space), then so is B(X, Y) — regardless of whether X is complete. In particular the dual space X^{*} = B(X, \mathbb{F}) of bounded linear functionals is always a Banach space.

The triangle inequality is a one-liner from the defining bound: for a unit vector x, \|(S+T)x\| \le \|Sx\| + \|Tx\| \le \|S\| + \|T\|, and taking the sup over \|x\| = 1 gives \|S + T\| \le \|S\| + \|T\|. Submultiplicativity is the same trick chained: \|STx\| \le \|S\|\,\|Tx\| \le \|S\|\,\|T\|\,\|x\|.
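Both inequalities can be spot-checked on concrete matrices. This Python sketch assumes the Euclidean norm on \mathbb{R}^2 and computes the operator norm as the largest singular value via the 2×2 closed form; the matrices S and T and the helper names are hypothetical examples, and a check on one pair is of course no proof.

```python
import math

def op_norm(A):
    """Largest singular value of a 2x2 matrix A = ((a, b), (c, d))."""
    (a, b), (c, d) = A
    s = a * a + b * b + c * c + d * d
    det = a * d - b * c
    return math.sqrt((s + math.sqrt(max(s * s - 4.0 * det * det, 0.0))) / 2.0)

def add(A, B):
    """Entrywise sum, the matrix of x -> Ax + Bx."""
    return tuple(tuple(A[i][j] + B[i][j] for j in range(2)) for i in range(2))

def compose(A, B):
    """Matrix product AB, the matrix of the composite x -> A(Bx)."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

S = ((1.0, 2.0), (0.0, 1.0))    # hypothetical shear
T = ((0.0, -1.0), (3.0, 0.0))   # hypothetical rotation combined with a stretch

triangle_ok = op_norm(add(S, T)) <= op_norm(S) + op_norm(T)
submult_ok = op_norm(compose(S, T)) <= op_norm(S) * op_norm(T)
```

Both inequalities are strict for this pair, a reminder that they are upper bounds and not identities.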

Why completeness of Y transfers. Suppose (T_n) is Cauchy in B(X, Y). For each fixed x, the defining bound \|T_n x - T_m x\| \le \|T_n - T_m\|\,\|x\| shows (T_n x) is Cauchy in Y; because Y is complete, it converges to some limit we define to be Tx. Linearity of T passes to the limit; a short \varepsilon-argument then shows T is bounded and that \|T_n - T\| \to 0. So the limit lives back inside B(X, Y) — the space is complete. This is the reason the dual space, and the whole edifice of Banach-space duality, gets off the ground: whatever X is, its space of continuous linear functionals is always as well-behaved as the scalar field.
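The short \varepsilon-argument can be spelled out as follows; this is one standard way to fill in the step, offered as a sketch, not the only route. Given \varepsilon > 0, choose N with \|T_n - T_m\| \le \varepsilon for all n, m \ge N. For fixed x and n \ge N, let m \to \infty in \|T_n x - T_m x\| \le \varepsilon\,\|x\| and use the continuity of the norm on Y to obtain

\|T_n x - Tx\| \;\le\; \varepsilon\,\|x\| \qquad \text{for every } x \in X \text{ and every } n \ge N.

Hence T_n - T is bounded with \|T_n - T\| \le \varepsilon for n \ge N. Writing T = T_N - (T_N - T) shows that T is bounded with \|T\| \le \|T_N\| + \varepsilon, and since \varepsilon was arbitrary, \|T_n - T\| \to 0.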

When X = Y, you can not only add and scale operators but also compose them, and composition acts as a multiplication. So B(X) = B(X, X) is a vector space with a product (an algebra), and submultiplicativity \|ST\| \le \|S\|\,\|T\| plus completeness (when X is Banach) makes it a Banach algebra. This is the gateway to C^{*}-algebras and the operator-algebraic formulation of quantum theory. Note the product is generally non-commutative: ST \ne TS in general, exactly as matrices fail to commute, and in quantum mechanics that non-commutativity is the source of the uncertainty principle.
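A two-line example makes the failure of commutativity concrete. The matrices below are hypothetical choices on \mathbb{R}^2, and the helper name is an assumption; any pair of non-commuting matrices would serve equally well.

```python
def compose(A, B):
    """Matrix product AB for 2x2 nested tuples, the matrix of x -> A(Bx)."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

S = ((0, 1), (0, 0))   # hypothetical example: sends e2 to e1 and kills e1
T = ((0, 0), (1, 0))   # hypothetical example: sends e1 to e2 and kills e2

ST = compose(S, T)     # ((1, 0), (0, 0)): projection onto the first axis
TS = compose(T, S)     # ((0, 0), (0, 1)): projection onto the second axis
commute = ST == TS     # False: the order of composition matters
```

The two composites are projections onto different axes, so ST - TS is non-zero even though S and T are each as simple as a bounded operator can be.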