The Contraction Mapping Theorem
Grab a calculator, put it in radians, type any number, and press cos over and over.
Watch what happens: the display bounces around at first, then settles, and after a dozen presses it
parks itself at 0.739085\ldots and refuses to budge. Every starting number
on Earth ends up there. You have just found the unique solution of
\cos x = x — a fixed point — not by algebra (there is no closed form) but
by repetition.
Why does hammering one button converge so reliably? Because \cos is a
contraction: it drags points closer together. Two nearby inputs come out even
nearer, and iterating a shrinking map squeezes any starting point inexorably toward one special place
that the map leaves alone. This is the Contraction Mapping Theorem — also called the
Banach fixed point theorem — and it is the beating heart of existence-and-uniqueness
proofs across analysis: differential equations, integral equations, the
inverse function theorem, even
the JPEG-style fractal image codecs that reconstruct a picture as the fixed point of a contraction.
The theorem's genius is that it does not just prove a solution exists — it hands you an
algorithm to compute it, plus a guarantee of how fast. Existence, uniqueness, and a numerical
method, all from one hypothesis.
The setting and the one hypothesis
Work in a
metric space
(X, d) — a set with a notion of distance. A map
T : X \to X is a contraction if there is a constant
k with 0 \le k < 1 such that
d\big(T(x), T(y)\big) \le k\, d(x, y) \qquad \text{for all } x, y \in X.
Read it plainly: applying T multiplies distances down by at least the
factor k. The strictness k < 1
is everything — a map that merely does not increase distances (k = 1) is
not enough, as we will see it fail. A fixed point is a point
x^\ast that the map pins in place:
T(x^\ast) = x^\ast.
Every contraction is automatically continuous (indeed Lipschitz): points close in input are
forced close in output. And it needs its home X to be
complete — every
Cauchy sequence
converges inside X, with no missing limit points. That
completeness is where the fixed point will actually come from.
Let (X, d) be a complete metric space and
T : X \to X a contraction with constant k < 1.
Then:
-
Existence & uniqueness: T has exactly one fixed
point x^\ast.
-
Convergence: for any start x_0, the iterates
x_{n+1} = T(x_n) converge to x^\ast.
-
Error bound:
d(x_n, x^\ast) \le \dfrac{k^n}{1 - k}\, d(x_0, x_1) — the error decays
geometrically, and you know the rate in advance.
Why it works — the proof in four moves
The argument is short and every hypothesis earns its place. Fix a start
x_0 and iterate x_{n+1} = T(x_n).
Move 1 — consecutive steps shrink geometrically. Applying the contraction bound to
one step,
d(x_{n+1}, x_n) = d\big(T(x_n), T(x_{n-1})\big) \le k\, d(x_n, x_{n-1}) \le \cdots \le k^n\, d(x_1, x_0).
Move 2 — the iterates are Cauchy. For m > n, chain the
triangle inequality and sum the geometric series:
d(x_m, x_n) \le \sum_{j=n}^{m-1} d(x_{j+1}, x_j) \le \big(k^n + k^{n+1} + \cdots\big)\,d(x_1, x_0) = \frac{k^n}{1 - k}\, d(x_1, x_0).
Since k < 1, the factor k^n \to 0, so the tail
is as small as we like: the sequence is
Cauchy. (Note where
k < 1 was spent — a bare k = 1 makes this series
diverge.)
Move 3 — completeness supplies the limit. A Cauchy sequence in a complete
space converges to some x^\ast \in X. Because T
is continuous, letting n \to \infty in
x_{n+1} = T(x_n) gives
x^\ast = T(x^\ast) — a fixed point. Send m \to \infty
in Move 2's bound to get the promised error estimate.
Move 4 — uniqueness. Suppose x^\ast and
y^\ast were both fixed. Then
d(x^\ast, y^\ast) = d(T(x^\ast), T(y^\ast)) \le k\, d(x^\ast, y^\ast), i.e.
(1 - k)\,d(x^\ast, y^\ast) \le 0. With
1 - k > 0 this forces
d(x^\ast, y^\ast) = 0, so
x^\ast = y^\ast. Two fixed points cannot coexist.
Watch the staircase converge
Here is the iteration made visible for a linear contraction T(x) = 1 + k(x - 1),
whose fixed point sits at x^\ast = 1 (where the bold map crosses the
dashed diagonal y = x). The cobweb reads the iteration
off the picture: go up to the curve to apply T, across to the diagonal to
recycle the output as the next input, and repeat.
Slide the contraction constant k toward 0 and
the staircase leaps to the fixed point in a step or two; push it toward 1
and it crawls, taking many timid steps — exactly the k^n rate the theorem
predicts. Make k negative and the cobweb spirals inward instead
of stepping, because each application overshoots and flips sides. Move the start
x_0 anywhere: every launch lands on the same point. That universal
convergence, independent of where you begin, is the theorem you can see.
The theorem asks for a strict contraction on a complete space, and each demand has
a counterexample lying in wait if you weaken it.
-
k = 1 is not good enough. The shift
T(x) = x + 1 on \mathbb{R} satisfies
|T(x) - T(y)| = |x - y| exactly — distances are preserved,
k = 1 — yet T(x) = x has no solution;
the iterates march off to infinity. Even a map with
|T(x) - T(y)| < |x - y| strictly for all pairs can fail:
T(x) = x + \tfrac1x on [1, \infty) shrinks
every gap but has no fixed point, because no single k < 1 works
uniformly. You need one k bounding all pairs at once.
-
Completeness is not optional. Take
T(x) = x/2 on the open interval
X = (0, 1] — a genuine contraction with
k = \tfrac12. The iterates
1, \tfrac12, \tfrac14, \ldots are Cauchy but head for
0, which was evicted from the space. No limit in
X means no fixed point in X. The hole where
0 should be is exactly the gap that completeness forbids.
Where it earns its keep
Solving \cos x = x. On the interval
[0, 1], T(x) = \cos x has
|T'(x)| = |\sin x| \le \sin 1 \approx 0.84 < 1, so it is a contraction on a
complete interval. Banach guarantees the calculator experiment converges, to the one and only
solution — the Dottie number 0.739085\ldots
Newton's method as a super-contraction. Newton's iteration
x_{n+1} = x_n - f(x_n)/f'(x_n) is fixed-point iteration for a cleverly
chosen T. Near a simple root T'(x^\ast) = 0, so
the local contraction constant is effectively 0 — which is why Newton
doubles its correct digits each step instead of merely gaining a fixed fraction.
Differential equations exist. The
\textbf{Picard–Lindelöf} theorem — that
y' = F(x, y) with y(x_0) = y_0 has a unique
solution when F is Lipschitz in y — is the
Contraction Mapping Theorem applied in a complete function space. The map is the integral
operator (Ty)(x) = y_0 + \int_{x_0}^x F(t, y(t))\,dt; its fixed point is
the solution curve. Existence of the solution to a huge swathe of physics is one contraction away.