Itô's Lemma

Ordinary calculus has a chain rule: if x(t) is smooth and f is smooth, then df(x) = f'(x)\,dx. Brownian motion breaks this, because its paths are so jagged that (dW_t)^2 does not vanish; it behaves like dt. Itô's lemma is the chain rule patched to survive that fact: it carries one extra second-order term, the Itô correction, and it is one of the most widely used tools in mathematical finance.

Take an Itô process — a process with a drift part and a Brownian part,

dX_t = a\,dt + b\,dW_t,

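and a smooth function f (twice continuously differentiable is enough). Here a and b may be constants or, more generally, adapted processes such as functions of t and X_t. Then Itô's lemma reads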

df(X_t) = f'(X_t)\,dX_t + \tfrac{1}{2} f''(X_t)\,(dX_t)^2 = \Big[a\,f'(X_t) + \tfrac{1}{2}b^2 f''(X_t)\Big]dt + b\,f'(X_t)\,dW_t.

The first equality is just a second-order Taylor expansion; the second uses the box algebra (dW_t)^2 = dt, dt\,dW_t = 0, (dt)^2 = 0. The surviving \tfrac12 b^2 f''\,dt is the Itô correction — the term ordinary calculus never sees.

The pure-Brownian case, line by line

Start where the idea is cleanest: X_t = W_t itself, so a = 0 and b = 1, and we expand f(W_t). We will Taylor-expand, apply the box algebra, and discard what is too small to matter — keeping ruthless track of orders in dt.

Step 1 — Taylor-expand to second order. For a smooth f and an increment dW = W_{t+dt} - W_t, Taylor's theorem (see Taylor series) gives

f(W_t + dW) = f(W_t) + f'(W_t)\,dW + \tfrac{1}{2}f''(W_t)\,(dW)^2 + \tfrac{1}{6}f'''(W_t)\,(dW)^3 + \cdots,

so the change in f is

df(W_t) = f'(W_t)\,dW + \tfrac{1}{2}f''(W_t)\,(dW)^2 + \tfrac{1}{6}f'''(W_t)\,(dW)^3 + \cdots.

Step 2 — weigh each term by its order in dt. A Brownian increment scales like dW \sim \sqrt{dt} (its variance is dt). So the terms have sizes

f'\,dW \sim \sqrt{dt}, \qquad \tfrac12 f''\,(dW)^2 \sim dt, \qquad \tfrac16 f'''\,(dW)^3 \sim dt^{3/2}, \;\dots

The first-order term is the largest, the second-order term is exactly of order dt, and everything from the third order on is o(dt) — negligible against dt as dt \to 0. Drop them:

df(W_t) = f'(W_t)\,dW + \tfrac{1}{2}f''(W_t)\,(dW)^2 + o(dt).
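The order counting in Step 2 can be seen numerically. The sketch below is only an illustration, not part of the derivation: it splits [0, T] into n steps, draws independent increments with variance dt, and adds up |dW|, (dW)^2 and |dW|^3 along the path. The function name `power_sums`, the seed and the step counts are arbitrary choices made here, and absolute values are used so that the third-order sum is an upper bound on what the discarded terms can contribute.

```python
import math
import random


def power_sums(n_steps, T=1.0, seed=0):
    """Return (sum |dW|, sum dW^2, sum |dW|^3) over one simulated partition of [0, T]."""
    rng = random.Random(seed)
    dt = T / n_steps
    s1 = s2 = s3 = 0.0
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment, variance dt
        s1 += abs(dW)         # order sqrt(dt) per step: grows like sqrt(n) in total
        s2 += dW * dW         # order dt per step: stays close to T in total
        s3 += abs(dW) ** 3    # order dt^(3/2) per step: shrinks like 1/sqrt(n) in total
    return s1, s2, s3


for n in (1_000, 10_000, 100_000):
    s1, s2, s3 = power_sums(n)
    print(f"n={n:>7}  sum|dW|={s1:9.3f}  sum dW^2={s2:.4f}  sum|dW|^3={s3:.5f}")
```

As n grows, the first column should blow up, the second should settle near T, and the third should fade toward zero, which is the numerical face of "keep the dt term, drop everything o(dt)".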

Step 3 — apply the box algebra (dW)^2 = dt. This is the heart of the matter: the squared increment is not random noise that averages out — to the order we keep it equals the deterministic dt (this is precisely the quadratic-variation fact \langle W\rangle_t = t). Substituting,

df(W_t) = f'(W_t)\,dW + \tfrac{1}{2}f''(W_t)\,dt.

That is Itô's lemma in its purest form. Compare the ordinary chain rule df = f'\,dW: the entire difference is the extra \tfrac12 f''\,dt born from (dW)^2 = dt.
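A rough pathwise check of this formula, under assumptions chosen purely for illustration: f(x) = e^x (so f' = f'' = e^x), one simulated path on [0, 1], and a fixed seed. The sketch accumulates the right-hand side of Itô's lemma step by step, does the same for the ordinary chain rule, and compares both with the actual change f(W_T) - f(W_0). The name `ito_vs_naive` is hypothetical.

```python
import math
import random


def ito_vs_naive(n_steps=200_000, T=1.0, seed=1):
    """Compare f(W_T) - f(W_0) for f = exp with the Itô sum and the naive chain-rule sum."""
    rng = random.Random(seed)
    dt = T / n_steps
    W = 0.0
    ito_sum = 0.0    # accumulates f'(W) dW + 0.5 f''(W) dt
    naive_sum = 0.0  # accumulates f'(W) dW only
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))
        fW = math.exp(W)  # f'(W) and f''(W) both equal e^W, evaluated at the left endpoint
        ito_sum += fW * dW + 0.5 * fW * dt
        naive_sum += fW * dW
        W += dW
    exact = math.exp(W) - 1.0  # f(W_T) - f(W_0) with W_0 = 0
    return exact, ito_sum, naive_sum


exact, ito_sum, naive_sum = ito_vs_naive()
print("actual change     :", exact)
print("Itô's lemma sum   :", ito_sum)
print("naive chain rule  :", naive_sum)
```

The Itô sum should track the actual change closely, while the naive sum should miss by roughly the accumulated correction \tfrac12\int_0^T e^{W_t}\,dt, which is strictly positive for this choice of f.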

A worked example: d(W_t^2)

Take f(x) = x^2, so f'(x) = 2x and f''(x) = 2. Feeding these into the formula above:

d(W_t^2) = f'(W_t)\,dW_t + \tfrac{1}{2}f''(W_t)\,dt = 2W_t\,dW_t + \tfrac{1}{2}\cdot 2\,dt,

\boxed{\,d(W_t^2) = 2W_t\,dW_t + dt.\,}

Naive calculus would have predicted only 2W_t\,dW_t; the lone +\,dt is the Itô correction. Integrating both sides from 0 to t (with W_0 = 0) and rearranging gives

W_t^2 = 2\int_0^t W_s\,dW_s + t \quad\Longleftrightarrow\quad \int_0^t W_s\,dW_s = \tfrac{1}{2}W_t^2 - \tfrac{1}{2}t,

the famous "\int W\,dW \ne \tfrac12 W^2" surprise. The +\,t here is the same compensator that makes W_t^2 - t a martingale: the correction is literally the quadratic variation showing up as calculus.
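The integral identity can be tested directly with a left-endpoint (Itô) Riemann sum. This is a minimal sketch assuming W_0 = 0, T = 1 and a fixed seed; `ito_integral_W_dW` is a name invented here. On a discrete grid the sum equals \tfrac12 W_T^2 - \tfrac12\sum(\Delta W)^2 exactly, so the only approximation is \sum(\Delta W)^2 \approx T.

```python
import math
import random


def ito_integral_W_dW(n_steps=100_000, T=1.0, seed=2):
    """Left-endpoint Riemann sum approximating the Itô integral of W dW on [0, T]."""
    rng = random.Random(seed)
    dt = T / n_steps
    W = 0.0
    integral = 0.0
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))
        integral += W * dW  # integrand taken BEFORE the increment (non-anticipating)
        W += dW
    return integral, W


T = 1.0
integral, W_T = ito_integral_W_dW(T=T)
print("Itô sum          :", integral)
print("W_T^2/2 - T/2    :", 0.5 * W_T**2 - 0.5 * T)
print("naive W_T^2/2    :", 0.5 * W_T**2)
```

The first two printed values should agree to a couple of decimal places; the third should sit about T/2 higher.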

The general Itô process, and time dependence

Now keep the full dX_t = a\,dt + b\,dW_t. Squaring with the box algebra,

(dX_t)^2 = (a\,dt + b\,dW_t)^2 = a^2\,(dt)^2 + 2ab\,dt\,dW_t + b^2\,(dW_t)^2 = b^2\,dt,

because (dt)^2 = 0 and dt\,dW_t = 0, while (dW_t)^2 = dt. Substituting into the second-order Taylor expansion df = f'\,dX + \tfrac12 f''(dX)^2 and collecting the dt and dW pieces:

df(X_t) = f'(X_t)(a\,dt + b\,dW_t) + \tfrac{1}{2}f''(X_t)\,b^2\,dt = \Big[a\,f'(X_t) + \tfrac{1}{2}b^2 f''(X_t)\Big]dt + b\,f'(X_t)\,dW_t.
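To see why only b^2\,dt survives when (dX_t)^2 is expanded, one can add up the three pieces separately along a simulated path. The sketch below assumes constant coefficients a = 0.5 and b = 2.0 (values picked only for illustration) and uses the hypothetical helper name `squared_increment_pieces`.

```python
import math
import random


def squared_increment_pieces(a=0.5, b=2.0, n_steps=100_000, T=1.0, seed=3):
    """Sum the three pieces of (a dt + b dW)^2 over a partition of [0, T]."""
    rng = random.Random(seed)
    dt = T / n_steps
    dt_dt = dt_dW = dW_dW = 0.0
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))
        dt_dt += a * a * dt * dt      # a^2 (dt)^2 piece
        dt_dW += 2 * a * b * dt * dW  # 2ab dt dW piece
        dW_dW += b * b * dW * dW      # b^2 (dW)^2 piece
    return dt_dt, dt_dW, dW_dW


dt_dt, dt_dW, dW_dW = squared_increment_pieces()
print("sum a^2 (dt)^2  :", dt_dt)
print("sum 2ab dt dW   :", dt_dW)
print("sum b^2 (dW)^2  :", dW_dW, "(compare b^2 T = 4.0)")
```

The first two sums shrink with the step size, while the third stays near b^2 T, matching the box algebra used above.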

Finally, let f depend on time too, f = f(t, X_t). The Taylor expansion picks up the first-order term f_t\,dt; the cross term f_{tx}\,dt\,dX and the second-order time term \tfrac12 f_{tt}(dt)^2 are both o(dt) and vanish (see partial derivatives). This gives the full time-dependent form:

df(t, X_t) = \Big(f_t + a\,f_x + \tfrac{1}{2}b^2 f_{xx}\Big)dt + b\,f_x\,dW_t.
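As a quick worked instance of the time-dependent form, take the illustrative choice f(t, x) = x^2 - t with X_t = W_t, so a = 0 and b = 1. The choice of f is ours, made because the answer is easy to check by hand:

```latex
\begin{aligned}
f_t &= -1, \qquad f_x = 2x, \qquad f_{xx} = 2, \\
d\big(W_t^2 - t\big)
  &= \Big(-1 + 0\cdot 2W_t + \tfrac{1}{2}\cdot 1^2\cdot 2\Big)dt + 1\cdot 2W_t\,dW_t \\
  &= 2W_t\,dW_t .
\end{aligned}
```

The dt coefficient cancels exactly, leaving a pure dW term. That is consistent with the standard fact that W_t^2 - t is a martingale.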

Let dX_t = a\,dt + b\,dW_t be an Itô process and f twice continuously differentiable in x (and, for the time-dependent form, once continuously differentiable in t). Then:

Function of the process. df(X_t) = \Big[a\,f'(X_t) + \tfrac{1}{2}b^2 f''(X_t)\Big]dt + b\,f'(X_t)\,dW_t, with the extra \tfrac12 b^2 f''\,dt being the Itô correction, born from (dW_t)^2 = dt.
Pure-Brownian case (a=0,\,b=1): df(W_t) = f'(W_t)\,dW_t + \tfrac{1}{2}f''(W_t)\,dt.
Time-dependent form. For f = f(t, x), df(t, X_t) = \Big(f_t + a\,f_x + \tfrac{1}{2}b^2 f_{xx}\Big)dt + b\,f_x\,dW_t.
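The three statements above translate into a small helper that turns the ingredients into the dt and dW coefficients of df. This is a hedged sketch, not a library API: the function `ito_coefficients` and its argument names are invented here, a and b are assumed constant, and the partial derivatives must be supplied by hand as callables of (t, x).

```python
def ito_coefficients(a, b, f_t, f_x, f_xx):
    """For dX = a dt + b dW (constant a, b), return the drift and diffusion of df(t, X_t)."""

    def drift(t, x):
        # coefficient of dt: f_t + a f_x + (1/2) b^2 f_xx
        return f_t(t, x) + a * f_x(t, x) + 0.5 * b * b * f_xx(t, x)

    def diffusion(t, x):
        # coefficient of dW: b f_x
        return b * f_x(t, x)

    return drift, diffusion


# Usage: f(x) = x^2 applied to pure Brownian motion (a = 0, b = 1).
drift, diffusion = ito_coefficients(
    a=0.0,
    b=1.0,
    f_t=lambda t, x: 0.0,       # no explicit time dependence
    f_x=lambda t, x: 2.0 * x,
    f_xx=lambda t, x: 2.0,
)
print(drift(0.3, 1.7), diffusion(0.3, 1.7))  # dt and dW coefficients at t = 0.3, x = 1.7
```

For this input the drift is the constant 1 and the diffusion is 2x, which reproduces d(W_t^2) = 2W_t\,dW_t + dt.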

In ordinary calculus a Taylor expansion keeps only the first-order term: the second-order piece \tfrac12 f''(dx)^2 is of order (dx)^2 = o(dx) and is thrown away. The whole novelty of Itô calculus is that for Brownian motion this is no longer true, because the increment is square-root small, not linearly small:

dW \sim \sqrt{dt} \;\Longrightarrow\; (dW)^2 \sim dt, \quad (dW)^3 \sim dt^{3/2} = o(dt), \quad (dW)^4 \sim dt^2 = o(dt).

So the second-order term is promoted to first-class status — it is genuinely of order dt and must be kept — while the third- and higher-order terms are o(dt) and still die. Itô's lemma is exactly "ordinary chain rule, plus the one second-order term that refuses to vanish".

A footnote for the curious: there is a rival calculus, the Stratonovich integral, which uses a symmetric midpoint rule and obeys the ordinary chain rule with no correction term, df(W_t) = f'(W_t)\circ dW_t. It is convenient in physics, but a Stratonovich integral is in general not a martingale, which is why finance keeps the Itô version and pays the price of the \tfrac12 b^2 f''\,dt term.
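The difference between the two conventions shows up in a few lines of simulation. The sketch below approximates \int_0^T W\,dW twice on the same path: once with the integrand at the left endpoint (Itô) and once with the average of the two endpoint values, which is one common discretisation of the Stratonovich rule and is an assumption of this example. The name `left_and_symmetric_sums` is hypothetical.

```python
import math
import random


def left_and_symmetric_sums(n_steps=100_000, T=1.0, seed=4):
    """Approximate the integral of W dW with the Itô and Stratonovich-style rules."""
    rng = random.Random(seed)
    dt = T / n_steps
    W = 0.0
    ito = 0.0
    strat = 0.0
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))
        ito += W * dW                 # left endpoint
        strat += (W + 0.5 * dW) * dW  # average of left and right endpoint values
        W += dW
    return ito, strat, W


ito, strat, W_T = left_and_symmetric_sums()
print("Itô sum           :", ito)
print("Stratonovich sum  :", strat)
print("W_T^2 / 2         :", 0.5 * W_T**2)
```

The symmetric sum telescopes to \tfrac12 W_T^2, as the ordinary chain rule would predict, while the Itô sum falls short of it by about T/2.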

Seeing it: a function of a Brownian path

The plot for this section shows one Brownian path W_t together with W_t^2, the function whose Itô differential d(W_t^2) = 2W_t\,dW_t + dt we just derived. W_t^2 tends to lean upward even though W_t has no drift: on average that lean is exactly the accumulated +\,dt correction, since E[W_t^2] = t. Any single path is noisy, so the lean is clearest when several paths are compared.
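The upward lean of W_t^2 is easiest to see by averaging over many paths. The sketch below is only an illustration with arbitrary settings (2,000 paths, 100 steps, fixed seed, and the invented name `mean_of_W_squared`): it simulates independent Brownian paths and records the average of W_t^2 at each grid time, which should sit close to t.

```python
import math
import random


def mean_of_W_squared(n_paths=2_000, n_steps=100, T=1.0, seed=5):
    """Average W_t^2 across simulated paths at each grid time t_k = k * T / n_steps."""
    rng = random.Random(seed)
    dt = T / n_steps
    totals = [0.0] * (n_steps + 1)  # totals[k] accumulates W_{t_k}^2 over all paths
    for _ in range(n_paths):
        W = 0.0
        for k in range(1, n_steps + 1):
            W += rng.gauss(0.0, math.sqrt(dt))
            totals[k] += W * W
    return [s / n_paths for s in totals]


means = mean_of_W_squared()
for k in (0, 25, 50, 75, 100):
    print(f"t = {k / 100:.2f}   average W_t^2 = {means[k]:.3f}")
```

The averages should rise roughly linearly in t, which is the accumulated dt correction made visible.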