Multidimensional Itô's Lemma

Itô's lemma in one dimension extends the chain rule by one term: a smooth function of a single Itô process picks up a \tfrac12 f'' correction because (dW_t)^2 = dt does not vanish. Finance, however, is rarely one-dimensional: a portfolio depends on several prices, an exchange-rate model couples two diffusions, and a stochastic-volatility model has both a price and its volatility moving at once. We need the lemma for a vector of processes.

Let X = (X_1, \dots, X_n) be a vector Itô process, each component driven by (possibly correlated) Brownian motions, and let f(t, x_1, \dots, x_n) be smooth. The single new ingredient compared to the scalar case is the family of cross-variations d[X_i, X_j], the quadratic covariations between pairs of components of X. They are exactly the second-order terms a naive chain rule would discard, and exactly the terms that survive.

The 2-D case, derived line by line

Everything important is already visible with two processes, so let us do that case in full and in slow motion. Take two Itô processes X_t and Y_t and a smooth f(X, Y) (we suppress an explicit t-dependence for now — it would just add an f_t\,dt term that has no second-order partner). We want df = f(X + dX,\, Y + dY) - f(X, Y).

Step 1 — Taylor expand to second order. Ordinary calculus would stop at first order; the whole point of Itô calculus is that the second-order terms are not negligible, because squared increments are of order dt, not (dt)^2. So keep every term up to second order:

df = f_x\,dX + f_y\,dY + \tfrac12\Big( f_{xx}\,(dX)^2 + 2 f_{xy}\,dX\,dY + f_{yy}\,(dY)^2 \Big) + \cdots,

where the dots are genuinely negligible (third order and higher). The mixed partial appears twice — as f_{xy} and f_{yx} — and since f is smooth these are equal, giving the factor of 2.

Step 2 — substitute the box-algebra products. This is where stochastic calculus departs from the deterministic Taylor series. The products of differentials are not all zero; they are read off the covariation table. Each squared differential becomes a quadratic variation, and each cross product a quadratic covariation:

(dX)^2 = d[X], \qquad (dY)^2 = d[Y], \qquad dX\,dY = d[X, Y].

(Any product involving a dt — such as dt\,dX or (dt)^2 — is of order higher than dt and is dropped, which is exactly why an f_t\,dt term has no second-order partner.)
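The rules above can be checked numerically on a fine time grid. The following is a minimal sketch rather than part of the derivation: it assumes X and Y are scaled Brownian motions with independent drivers, and the names box_algebra_sums, sigma_x, sigma_y and the grid size n are hypothetical choices made for illustration. It accumulates each product of increments over [0, T].

```python
import math
import random


def box_algebra_sums(sigma_x=0.5, sigma_y=2.0, T=1.0, n=100_000, seed=0):
    """Accumulate products of increments for X = sigma_x*W and Y = sigma_y*B.

    W and B are independent Brownian motions (an assumption of this sketch).
    """
    rng = random.Random(seed)
    dt = T / n
    root_dt = math.sqrt(dt)
    sums = {"dX*dX": 0.0, "dY*dY": 0.0, "dX*dY": 0.0, "dt*dX": 0.0, "dt*dt": 0.0}
    for _ in range(n):
        dX = sigma_x * root_dt * rng.gauss(0.0, 1.0)
        dY = sigma_y * root_dt * rng.gauss(0.0, 1.0)
        sums["dX*dX"] += dX * dX   # approximates [X]_T = sigma_x^2 * T
        sums["dY*dY"] += dY * dY   # approximates [Y]_T = sigma_y^2 * T
        sums["dX*dY"] += dX * dY   # approximates [X, Y]_T = 0 (independent drivers)
        sums["dt*dX"] += dt * dX   # equals dt * X_T, which shrinks with dt
        sums["dt*dt"] += dt * dt   # equals T^2 / n, which shrinks with dt
    return sums
```

With the default arguments, the sums of squared increments should land near sigma_x^2 T = 0.25 and sigma_y^2 T = 4, while the three remaining sums stay close to zero and shrink further as n grows.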

Step 3 — substitute (dX)^2 = d[X]:

\tfrac12 f_{xx}\,(dX)^2 = \tfrac12 f_{xx}\,d[X].

Step 4 — substitute the cross term dX\,dY = d[X, Y]. This is the new ingredient with no one-dimensional analogue:

\tfrac12 \cdot 2 f_{xy}\,dX\,dY = f_{xy}\,d[X, Y].

Step 5 — substitute (dY)^2 = d[Y]:

\tfrac12 f_{yy}\,(dY)^2 = \tfrac12 f_{yy}\,d[Y].

Step 6 — collect everything. Reassembling the first-order terms (which survive untouched) with the three substituted second-order terms gives the two-dimensional Itô formula:

df = f_x\,dX + f_y\,dY + \tfrac12\Big( f_{xx}\,d[X] + 2 f_{xy}\,d[X, Y] + f_{yy}\,d[Y] \Big).
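As a sanity check, the formula can be summed along a simulated path and compared with the actual change in f. The sketch below is illustrative only. It assumes the test function f(x, y) = x^2 y, takes X and Y to be two Brownian motions with a hypothetical correlation rho (so d[X] = d[Y] = dt and d[X, Y] = rho dt), and uses names such as ito_check that are not from the text.

```python
import math
import random


def f(x, y):
    return x * x * y


def ito_check(rho=0.5, T=1.0, n=100_000, seed=1):
    """Compare f(X_T, Y_T) - f(X_0, Y_0) with the summed 2-D Ito formula.

    X = W1 and Y = rho*W1 + sqrt(1 - rho^2)*W2 with W1, W2 independent,
    so d[X] = d[Y] = dt and d[X, Y] = rho*dt (assumptions of this sketch).
    """
    rng = random.Random(seed)
    dt = T / n
    root_dt = math.sqrt(dt)
    x = y = 0.0
    first_order = 0.0   # running sum of f_x dX + f_y dY
    second_order = 0.0  # running sum of 0.5*(f_xx d[X] + 2 f_xy d[X,Y] + f_yy d[Y])
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        dX = root_dt * z1
        dY = root_dt * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        # Partial derivatives of f(x, y) = x^2 * y at the left endpoint.
        f_x, f_y = 2.0 * x * y, x * x
        f_xx, f_xy, f_yy = 2.0 * y, 2.0 * x, 0.0
        first_order += f_x * dX + f_y * dY
        second_order += 0.5 * (f_xx * dt + 2.0 * f_xy * rho * dt + f_yy * dt)
        x += dX
        y += dY
    actual = f(x, y) - f(0.0, 0.0)
    return actual, first_order, second_order
```

On a fine grid, actual should agree closely with first_order + second_order. Dropping second_order reproduces the naive chain rule, which generally does not match.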

Compared with the scalar lemma, the only genuinely new term is the middle one, f_{xy}\,d[X, Y]: the channel through which the coupling of the two processes feeds into the dynamics of f. If X and Y are driven by independent Brownian motions, d[X, Y] = 0 and this term vanishes, so the two processes do not "talk"; correlation between the drivers is precisely what keeps it alive.

In general, let X = (X_1, \dots, X_n) be a vector of Itô processes and let f(t, x_1, \dots, x_n) be C^{1,2}, that is, once continuously differentiable in t and twice in x. Then

df = f_t\,dt + \sum_{i=1}^{n} f_{x_i}\,dX_i + \tfrac12 \sum_{i=1}^{n}\sum_{j=1}^{n} f_{x_i x_j}\,d[X_i, X_j].

The second-order terms are governed by the box-algebra rules:

dX_i\,dX_j = d[X_i, X_j] — squared / cross differentials become quadratic (co)variations;
dt\,dX_i = 0 and (dt)^2 = 0 — anything multiplied by dt is higher order;
for independent Brownian drivers W^k, dW^k\,dW^\ell = 0 when k \ne \ell, and dW^k\,dW^k = dt.

Most multidimensional models do not use independent drivers — they use correlated ones. Two Brownian motions with correlation \rho obey the covariation rule

dW^1\,dW^2 = \rho\,dt,

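which interpolates between independence (\rho = 0), perfect lockstep (\rho = 1), and perfect mirroring (\rho = -1). Correlated drivers can always be rewritten as linear combinations of independent ones, so the correlation can be absorbed into the coefficients. Write a general vector diffusion as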

dX_i = a_i\,dt + \sum_{k} b_{ik}\,dW^k,

with independent drivers W^k. Multiplying two such differentials and using dW^k\,dW^\ell = \delta_{k\ell}\,dt gives the diffusion (covariance) matrix

d[X_i, X_j] = \Big(\sum_{k} b_{ik}\,b_{jk}\Big)\,dt = (b\,b^{\mathsf T})_{ij}\,dt, \qquad \Sigma = b\,b^{\mathsf T}.

So the entire second-order structure of an n-dimensional Itô process is encoded in one symmetric, positive-semidefinite matrix \Sigma = b\,b^{\mathsf T} — the instantaneous covariance of the increments per unit time.
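Substituting dX_i and d[X_i, X_j] = \Sigma_{ij}\,dt into the lemma, the dt-coefficient of df is f_t + \sum_i a_i f_{x_i} + \tfrac12 \sum_{i,j} \Sigma_{ij} f_{x_i x_j}. The sketch below evaluates that coefficient at a single point. It is a rough illustration under stated assumptions: derivatives are approximated by central finite differences, and the helper names covariance_matrix and ito_drift, and the step size h, are hypothetical.

```python
def covariance_matrix(b):
    """Return Sigma = b b^T for a coefficient matrix b given as a list of rows."""
    n, m = len(b), len(b[0])
    return [[sum(b[i][k] * b[j][k] for k in range(m)) for j in range(n)]
            for i in range(n)]


def ito_drift(f, t, x, a, b, h=1e-4):
    """dt-coefficient of df: f_t + sum_i a_i f_{x_i} + 0.5 sum_ij Sigma_ij f_{x_i x_j}.

    f takes (t, x) with x a list; derivatives use central differences of step h.
    """
    n = len(x)

    def at(shifts):
        # Evaluate f at x after applying a list of (index, offset) shifts.
        y = list(x)
        for i, d in shifts:
            y[i] += d
        return f(t, y)

    f_t = (f(t + h, x) - f(t - h, x)) / (2.0 * h)
    grad = [(at([(i, h)]) - at([(i, -h)])) / (2.0 * h) for i in range(n)]
    # For i == j this reduces to a central second difference with step 2h.
    hess = [[(at([(i, h), (j, h)]) - at([(i, h), (j, -h)])
              - at([(i, -h), (j, h)]) + at([(i, -h), (j, -h)])) / (4.0 * h * h)
             for j in range(n)] for i in range(n)]
    sigma = covariance_matrix(b)
    second_order = 0.5 * sum(sigma[i][j] * hess[i][j]
                             for i in range(n) for j in range(n))
    return f_t + sum(a[i] * grad[i] for i in range(n)) + second_order
```

For instance, with f(t, x) = x[0] * x[1], the only surviving second-order contribution is the off-diagonal entry of Sigma, which is the correction term of the Itô product rule.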

A worked cross term. Take f(x_1, x_2) = x_1 x_2, the product of two processes. Then f_{x_1} = x_2, f_{x_2} = x_1, f_{x_1 x_1} = f_{x_2 x_2} = 0, and f_{x_1 x_2} = 1. The lemma gives the Itô product rule:

d(X_1 X_2) = X_2\,dX_1 + X_1\,dX_2 + d[X_1, X_2].

The first two terms are the familiar Leibniz product rule; the extra d[X_1, X_2] = \Sigma_{12}\,dt is the correction that ordinary calculus lacks. For example, if the two processes are geometric Brownian motions, dX_i = \mu_i X_i\,dt + \sigma_i X_i\,dW^i, whose drivers satisfy dW^1\,dW^2 = \rho\,dt, that term is \rho\,\sigma_1\sigma_2\,X_1 X_2\,dt.
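The product rule can be checked on a simulated pair of paths. The sketch below assumes, for illustration, that both processes are geometric Brownian motions advanced with an Euler scheme; the function name product_rule_check and all parameter values (mu1, mu2, sigma1, sigma2, rho) are hypothetical. It tracks the Leibniz part, the realized cross term summed from dX1 * dX2, and the predicted correction rho * sigma1 * sigma2 * X1 * X2 * dt.

```python
import math
import random


def product_rule_check(mu1=0.05, mu2=0.02, sigma1=0.2, sigma2=0.3,
                       rho=0.5, T=1.0, n=100_000, seed=2):
    """Decompose the change in X1*X2 along an Euler path of two correlated GBMs."""
    rng = random.Random(seed)
    dt = T / n
    root_dt = math.sqrt(dt)
    x1 = x2 = 1.0
    leibniz = 0.0    # running sum of X2 dX1 + X1 dX2
    realized = 0.0   # running sum of dX1 * dX2
    predicted = 0.0  # running sum of rho*sigma1*sigma2*X1*X2*dt
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        dW1 = root_dt * z1
        dW2 = root_dt * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        dX1 = mu1 * x1 * dt + sigma1 * x1 * dW1
        dX2 = mu2 * x2 * dt + sigma2 * x2 * dW2
        leibniz += x2 * dX1 + x1 * dX2
        realized += dX1 * dX2
        predicted += rho * sigma1 * sigma2 * x1 * x2 * dt
        x1 += dX1
        x2 += dX2
    change = x1 * x2 - 1.0  # both processes start at 1
    return change, leibniz, realized, predicted
```

On the discrete grid, change equals leibniz + realized up to rounding, and realized should track predicted closely. The Leibniz terms alone miss the change in the product by that cross term.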

See the coupling

The figure shows two Brownian paths W^1_t and W^2_t coupled by a correlation \rho: W^2 = \rho\,W^1 + \sqrt{1 - \rho^2}\,\tilde W, with \tilde W a Brownian motion independent of W^1. This construction gives d[W^2] = \rho^2\,dt + (1 - \rho^2)\,dt = dt, so W^2 is again a standard Brownian motion, and d[W^1, W^2] = \rho\,dt. At \rho = 0 the two paths wander independently; as \rho \to 1 they lock together; as \rho \to -1 they mirror each other. The \rho displayed in the figure is exactly the \rho in the rule dW^1\,dW^2 = \rho\,dt.
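The construction W^2 = \rho\,W^1 + \sqrt{1 - \rho^2}\,\tilde W can be simulated directly. The sketch below is one possible way to do so, using only Python's standard library; the function names correlated_paths and realized_correlation, the grid size, and the seed are hypothetical choices. It builds both paths on a common grid and then recovers \rho from the realized covariation, the sum of dW^1\,dW^2 divided by T.

```python
import math
import random


def correlated_paths(rho, T=1.0, n=50_000, seed=3):
    """Return (W1, W2) on n steps, with W2 = rho*W1 + sqrt(1 - rho^2)*W_tilde."""
    rng = random.Random(seed)
    root_dt = math.sqrt(T / n)
    w1, w2 = [0.0], [0.0]
    for _ in range(n):
        d1 = root_dt * rng.gauss(0.0, 1.0)       # increment of W1
        d_tilde = root_dt * rng.gauss(0.0, 1.0)  # increment of the independent W_tilde
        d2 = rho * d1 + math.sqrt(1.0 - rho * rho) * d_tilde
        w1.append(w1[-1] + d1)
        w2.append(w2[-1] + d2)
    return w1, w2


def realized_correlation(w1, w2, T=1.0):
    """Estimate rho as the realized covariation sum(dW1 * dW2) divided by T."""
    cov = sum((a1 - a0) * (b1 - b0)
              for a0, a1, b0, b1 in zip(w1, w1[1:], w2, w2[1:]))
    return cov / T
```

Calling realized_correlation(*correlated_paths(0.7)) should return a value close to 0.7, and with rho = 1 the two paths coincide exactly.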