Itô's Lemma
Ordinary calculus has a chain rule: if x(t) is smooth and
f is smooth, then
df(x) = f'(x)\,dx. Brownian motion breaks this, because its paths
are so jagged that (dW_t)^2 does not vanish — it behaves
like dt. Itô's lemma is the chain rule patched to
survive that fact: it carries one extra second-order term, the Itô correction,
and it is one of the most-used tools in mathematical finance.
Take an Itô process, that is, a process with a drift part and a Brownian part,
dX_t = a\,dt + b\,dW_t,
and a smooth function f. Then Itô's lemma reads
df(X_t) = f'(X_t)\,dX_t + \tfrac{1}{2} f''(X_t)\,(dX_t)^2 = \Big[a\,f'(X_t) + \tfrac{1}{2}b^2 f''(X_t)\Big]dt + b\,f'(X_t)\,dW_t.
The first equality is just a second-order Taylor expansion; the second uses the
box algebra (dW_t)^2 = dt,
dt\,dW_t = 0, (dt)^2 = 0. The surviving
\tfrac12 b^2 f''\,dt is the Itô correction — the term ordinary
calculus never sees.
The pure-Brownian case, line by line
Start where the idea is cleanest: X_t = W_t itself, so
a = 0 and b = 1, and we expand
f(W_t). We will Taylor-expand, discard what is too small to matter, and then
apply the box algebra, keeping ruthless track of orders in
dt throughout.
Step 1 — Taylor-expand to second order. For a smooth
f and an increment dW = W_{t+dt} - W_t,
Taylor's theorem (see Taylor series) gives
f(W_t + dW) = f(W_t) + f'(W_t)\,dW + \tfrac{1}{2}f''(W_t)\,(dW)^2 + \tfrac{1}{6}f'''(W_t)\,(dW)^3 + \cdots,
so the change in f is
df(W_t) = f'(W_t)\,dW + \tfrac{1}{2}f''(W_t)\,(dW)^2 + \tfrac{1}{6}f'''(W_t)\,(dW)^3 + \cdots.
Step 2 — weigh each term by its order in dt.
A Brownian increment scales like dW \sim \sqrt{dt} (its variance is
dt). So the terms have sizes
f'\,dW \sim \sqrt{dt}, \qquad \tfrac12 f''\,(dW)^2 \sim dt, \qquad \tfrac16 f'''\,(dW)^3 \sim dt^{3/2}, \;\dots
The first-order term is the largest, the second-order term is exactly of order
dt, and everything from the third order on is
o(dt) — negligible against dt as
dt \to 0. Drop them:
df(W_t) = f'(W_t)\,dW + \tfrac{1}{2}f''(W_t)\,(dW)^2 + o(dt).
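The size bookkeeping in Step 2 can be made concrete with exact moments. The following is a minimal sketch, assuming dW is normal with mean 0 and variance dt, so that E|dW|^k = dt^{k/2} \cdot 2^{k/2}\,\Gamma((k+1)/2)/\sqrt{\pi}. The helper name `abs_moment` is illustrative and not part of the text above.

```python
import math

def abs_moment(k, dt):
    """Exact E|dW|^k for dW ~ N(0, dt), via the Gaussian absolute-moment formula."""
    gaussian_part = 2 ** (k / 2) * math.gamma((k + 1) / 2) / math.sqrt(math.pi)
    return dt ** (k / 2) * gaussian_part

for dt in (1e-2, 1e-4, 1e-6):
    # Divide each moment by dt: order-dt terms stay O(1), o(dt) terms shrink.
    ratios = [abs_moment(k, dt) / dt for k in (1, 2, 3, 4)]
    print(dt, ["%.3g" % r for r in ratios])
```

As dt shrinks, the k = 2 ratio stays fixed, the k = 1 ratio grows like 1/\sqrt{dt}, and the k = 3 and k = 4 ratios go to zero, which is the ordering used to drop the higher-order terms.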
Step 3 — apply the box algebra (dW)^2 = dt. This is
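the heart of the matter: the squared increment does not average out to zero. Each
(dW)^2 has mean dt and variance 2(dt)^2, so over the t/dt steps in [0, t] the means
sum to t while the variances sum to 2t\,dt, which vanishes as dt \to 0. To the
order we keep, (dW)^2 therefore equals the deterministic dt (this is precisely the
quadratic-variation fact
\langle W\rangle_t = t). Substituting,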
df(W_t) = f'(W_t)\,dW + \tfrac{1}{2}f''(W_t)\,dt.
That is Itô's lemma in its purest form. Compare the ordinary chain rule
df = f'\,dW: the entire difference is the extra
\tfrac12 f''\,dt born from (dW)^2 = dt.
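The rule (dW)^2 = dt can be checked numerically by summing squared increments along one simulated path. This is a rough sketch, assuming independent N(0, dt) increments on an equal grid; the function name, step count, and seed are arbitrary choices for illustration.

```python
import math
import random

def sum_squared_increments(t=1.0, n=100_000, seed=0):
    """Sum of (dW)^2 over n equal steps partitioning [0, t]."""
    rng = random.Random(seed)
    dt = t / n
    sd = math.sqrt(dt)  # a Brownian increment over dt is N(0, dt)
    return sum(rng.gauss(0.0, sd) ** 2 for _ in range(n))

qv = sum_squared_increments()
print(qv)  # should be close to t = 1.0
```

The individual squares are random, but their sum concentrates tightly around t, which is why the expansion may treat (dW)^2 as the deterministic dt.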
A worked example: d(W_t^2)
Take f(x) = x^2, so f'(x) = 2x and
f''(x) = 2. Feeding these into the formula above:
d(W_t^2) = f'(W_t)\,dW_t + \tfrac{1}{2}f''(W_t)\,dt = 2W_t\,dW_t + \tfrac{1}{2}\cdot 2\,dt,
\boxed{\,d(W_t^2) = 2W_t\,dW_t + dt.\,}
Naive calculus would have predicted only 2W_t\,dW_t; the lone
+\,dt is the Itô correction. Integrating both sides from
0 to t and rearranging recovers
W_t^2 = 2\int_0^t W_s\,dW_s + t \quad\Longleftrightarrow\quad \int_0^t W_s\,dW_s = \tfrac{1}{2}W_t^2 - \tfrac{1}{2}t,
the famous "\int W\,dW \ne \tfrac12 W^2" surprise — and the
+\,t here is the same compensator that makes W_t^2 - t a martingale.
The correction is literally the quadratic variation showing up as calculus.
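The identity \int_0^t W_s\,dW_s = \tfrac12 W_t^2 - \tfrac12 t can be tested on a simulated path. The sketch below assumes the Itô integral is approximated by a left-endpoint sum on an equal grid; the function name and parameters are illustrative.

```python
import math
import random

def ito_sum_W_dW(t=1.0, n=100_000, seed=0):
    """Left-endpoint sum of W dW on [0, t]; returns (sum, W_t)."""
    rng = random.Random(seed)
    sd = math.sqrt(t / n)
    w, total = 0.0, 0.0
    for _ in range(n):
        dw = rng.gauss(0.0, sd)
        total += w * dw  # integrand evaluated BEFORE the increment (Itô convention)
        w += dw
    return total, w

t = 1.0
integral, w_t = ito_sum_W_dW(t)
print(integral, 0.5 * w_t**2 - 0.5 * t)  # the two values should nearly agree
```

The gap between the two printed numbers is half the difference between t and the summed squared increments, so it shrinks as the grid is refined.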
The general Itô process, and time dependence
Now keep the full dX_t = a\,dt + b\,dW_t. Squaring with the box
algebra,
(dX_t)^2 = (a\,dt + b\,dW_t)^2 = a^2\,(dt)^2 + 2ab\,dt\,dW_t + b^2\,(dW_t)^2 = b^2\,dt,
because (dt)^2 = 0 and dt\,dW_t = 0, while
(dW_t)^2 = dt. Substituting into the second-order Taylor expansion
df = f'\,dX + \tfrac12 f''(dX)^2 and collecting the
dt and dW pieces:
df(X_t) = f'(X_t)(a\,dt + b\,dW_t) + \tfrac{1}{2}f''(X_t)\,b^2\,dt = \Big[a\,f'(X_t) + \tfrac{1}{2}b^2 f''(X_t)\Big]dt + b\,f'(X_t)\,dW_t.
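One way to see the drift a\,f' + \tfrac12 b^2 f'' at work is a Monte Carlo check with f(x) = e^x, where f' = f'' = f. As a sketch under stated assumptions (constant a and b, X_0 = 0, and the dW term contributing zero to the mean), the formula predicts E[e^{X_t}] = e^{(a + b^2/2)t}, whereas the ordinary chain rule would predict e^{at}. The function name and parameter values are chosen only for illustration.

```python
import math
import random

def mean_exp_X(a, b, t, n_samples=200_000, seed=0):
    """Monte Carlo estimate of E[exp(X_t)] for X_t = a*t + b*W_t, X_0 = 0."""
    rng = random.Random(seed)
    sd = math.sqrt(t)  # W_t ~ N(0, t)
    total = 0.0
    for _ in range(n_samples):
        total += math.exp(a * t + b * rng.gauss(0.0, sd))
    return total / n_samples

a, b, t = 0.1, 0.3, 1.0
estimate = mean_exp_X(a, b, t)
with_correction = math.exp((a + 0.5 * b**2) * t)  # drift from Itô's lemma
without_correction = math.exp(a * t)              # drift from the naive chain rule
print(estimate, with_correction, without_correction)
```

The estimate should land near the value that includes the correction and visibly away from the naive one.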
Finally, let f depend on time too,
f = f(t, X_t). The Taylor expansion picks up the first-order time term
f_t\,dt. The cross term f_{tx}\,dt\,dX is of order dt^{3/2} and
\tfrac12 f_{tt}(dt)^2 is of order dt^2, so both are o(dt) and vanish (see
partial derivatives). This gives the full time-dependent form:
df(t, X_t) = \Big(f_t + a\,f_x + \tfrac{1}{2}b^2 f_{xx}\Big)dt + b\,f_x\,dW_t.
Let dX_t = a\,dt + b\,dW_t be an Itô process and
f twice continuously differentiable in x (and, where it depends on t, once continuously differentiable in t). Then:
- Function of the process.
  df(X_t) = \Big[a\,f'(X_t) + \tfrac{1}{2}b^2 f''(X_t)\Big]dt + b\,f'(X_t)\,dW_t,
  with the extra \tfrac12 b^2 f''\,dt being the Itô correction, born from (dW_t)^2 = dt.
- Pure-Brownian case (a=0,\,b=1):
  df(W_t) = f'(W_t)\,dW_t + \tfrac{1}{2}f''(W_t)\,dt.
- Time-dependent form. For f = f(t, x),
  df(t, X_t) = \Big(f_t + a\,f_x + \tfrac{1}{2}b^2 f_{xx}\Big)dt + b\,f_x\,dW_t.
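The three forms above reduce to one small computation. The helper below is a hypothetical illustration (the name `ito_coefficients` and its argument order are not from the text): it takes the coefficients and partial derivatives already evaluated at the current point and returns the dt and dW_t coefficients of df.

```python
def ito_coefficients(a, b, f_t, f_x, f_xx):
    """Return (drift, diffusion) of df(t, X_t) for dX_t = a dt + b dW_t.

    All arguments are numbers: the coefficients and partial derivatives
    already evaluated at the current (t, X_t).
    """
    drift = f_t + a * f_x + 0.5 * b**2 * f_xx
    diffusion = b * f_x
    return drift, diffusion

w = 1.5  # an arbitrary current value of W_t (a = 0, b = 1)

# f(x) = x^2: f_t = 0, f_x = 2x, f_xx = 2, so d(W^2) = 1 dt + 2W dW.
print(ito_coefficients(0.0, 1.0, 0.0, 2 * w, 2.0))   # (1.0, 3.0)

# f(t, x) = x^2 - t: f_t = -1 cancels the correction, leaving zero drift.
print(ito_coefficients(0.0, 1.0, -1.0, 2 * w, 2.0))  # (0.0, 3.0)
```

The second call shows in miniature why W_t^2 - t has no drift: the time derivative exactly offsets the Itô correction.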
In ordinary calculus a Taylor expansion keeps only the first-order term: the second-order
piece \tfrac12 f''(dx)^2 is of order
(dx)^2 = o(dx) and is thrown away. The whole novelty of Itô
calculus is that for Brownian motion this is no longer true, because the increment is
square-root small, not linearly small:
dW \sim \sqrt{dt} \;\Longrightarrow\; (dW)^2 \sim dt, \quad (dW)^3 \sim dt^{3/2} = o(dt), \quad (dW)^4 \sim dt^2 = o(dt).
So the second-order term is promoted to first-class status — it is genuinely of order
dt and must be kept — while the third- and higher-order terms are
o(dt) and still die. Itô's lemma is exactly "ordinary chain rule,
plus the one second-order term that refuses to vanish".
A footnote for the curious: there is a rival calculus, the
Stratonovich integral, which uses a symmetric midpoint rule and obeys the
ordinary chain rule with no correction term,
d f(W_t) = f'(W_t)\circ dW_t. It is convenient in physics, but
its integral is in general not a martingale,
which is why finance keeps the Itô version and pays the price of the
\tfrac12 b^2 f''\,dt term.
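The difference between the two calculi is visible in a discretization of \int W\,dW on a single path. The sketch below assumes the symmetric rule is implemented as the average of the integrand at the two endpoints of each step (for the integrand W this is a common discrete form of the Stratonovich sum); names and parameters are illustrative.

```python
import math
import random

def ito_and_stratonovich_sums(t=1.0, n=100_000, seed=0):
    """Discretize the integral of W dW two ways on one simulated path.

    Returns (left_endpoint_sum, symmetric_sum, W_t).
    """
    rng = random.Random(seed)
    sd = math.sqrt(t / n)
    w, ito, strat = 0.0, 0.0, 0.0
    for _ in range(n):
        dw = rng.gauss(0.0, sd)
        ito += w * dw                 # Itô: integrand at the left endpoint
        strat += (w + 0.5 * dw) * dw  # symmetric: average of the two endpoints
        w += dw
    return ito, strat, w

ito, strat, w_t = ito_and_stratonovich_sums()
print(strat, 0.5 * w_t**2)  # symmetric sum telescopes to W_t^2 / 2
print(strat - ito)          # should be close to t / 2 = 0.5
```

The symmetric sum reproduces the ordinary-calculus answer \tfrac12 W_t^2, and the two sums differ by half the summed squared increments, which is the Itô correction in integrated form.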
Seeing it: a function of a Brownian path
Below is one Brownian path W_t together with
W_t^2 — the function whose Itô differential
d(W_t^2) = 2W_t\,dW_t + dt we just derived. Notice that
W_t^2 drifts upward on average (E[W_t^2] = t) even though W_t
has no drift: that upward lean is exactly the accumulated +\,dt
correction.
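The upward lean can also be checked without a picture by averaging over many simulated paths. This is a minimal sketch, assuming independent Gaussian increments between a few checkpoint times; the function name, checkpoint times, and path count are arbitrary. A single path of W_t^2 need not rise steadily, so the comparison is made on averages.

```python
import math
import random

def mean_W_and_W_squared(times=(0.25, 0.5, 0.75, 1.0), n_paths=20_000, seed=0):
    """Average W_t and W_t^2 over many simulated paths at the given times."""
    rng = random.Random(seed)
    sum_w = [0.0] * len(times)
    sum_w2 = [0.0] * len(times)
    for _ in range(n_paths):
        w, prev_t = 0.0, 0.0
        for i, t in enumerate(times):
            w += rng.gauss(0.0, math.sqrt(t - prev_t))  # increment over [prev_t, t]
            prev_t = t
            sum_w[i] += w
            sum_w2[i] += w * w
    return ([s / n_paths for s in sum_w], [s / n_paths for s in sum_w2])

mean_w, mean_w2 = mean_W_and_W_squared()
print(mean_w)   # each entry should be near 0
print(mean_w2)  # each entry should be near the corresponding time t
```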