The quadratic variation [X]_t measures how much a single process wiggles. Its two-process
cousin, the quadratic covariation [X, Y]_t, measures how much two processes wiggle together.
For a partition \Pi: 0 = t_0 < t_1 < \dots < t_n = t of [0, t] with mesh
\|\Pi\| = \max_i (t_{i+1} - t_i), it is the limit (in probability) of the sum of
cross-products of increments as the mesh shrinks:
[X, Y]_t = \lim_{\|\Pi\|\to 0} \sum_{i} \big(X_{t_{i+1}} - X_{t_i}\big)\big(Y_{t_{i+1}} - Y_{t_i}\big) = \lim_{\|\Pi\|\to 0} \sum_i \Delta X_i\,\Delta Y_i.
It is built exactly like a covariance, but pathwise and in time. Two immediate structural
facts, straight from the definition: it is symmetric
(\Delta X_i \Delta Y_i = \Delta Y_i \Delta X_i, so
[X,Y] = [Y,X]) and bilinear (the sum is linear in
each factor, so [aX + bZ,\, Y] = a[X,Y] + b[Z,Y]). Setting
Y = X recovers the quadratic variation
[X, X]_t = [X]_t.
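The structural facts above can be checked on a finite grid, where the covariation is just a sum of products of increments. The following is a minimal sketch, not a definitive implementation: it assumes Brownian increments as test data, and the helper names `brownian_increments` and `covariation`, the seed, the horizon and the grid size are all illustrative choices that do not come from the text.

```python
import math
import random


def brownian_increments(n_steps, t, rng):
    """Return n_steps independent N(0, dt) increments on [0, t] (assumed test data)."""
    dt = t / n_steps
    return [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_steps)]


def covariation(dx, dy):
    """Discrete covariation: sum of products of matching increments."""
    return sum(a * b for a, b in zip(dx, dy))


rng = random.Random(0)      # arbitrary fixed seed so the run is repeatable
t, n_steps = 2.0, 50_000    # illustrative horizon and grid size
dx = brownian_increments(n_steps, t, rng)
dy = brownian_increments(n_steps, t, rng)
dz = brownian_increments(n_steps, t, rng)

# Symmetry: the two sums agree term by term.
print(covariation(dx, dy) == covariation(dy, dx))

# Bilinearity: [2X + 3Z, Y] against 2[X, Y] + 3[Z, Y].
mixed = [2.0 * a + 3.0 * c for a, c in zip(dx, dz)]
print(covariation(mixed, dy), 2.0 * covariation(dx, dy) + 3.0 * covariation(dz, dy))

# Y = X: the covariation sum is the sum of squared increments.
print(covariation(dx, dx), sum(a * a for a in dx))
```

On a finite grid symmetry holds exactly, while bilinearity holds up to floating-point rounding, which is why the second line prints two numbers to compare rather than a boolean.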
The covariation of two Itô processes, line by line
Let two Itô processes be driven by the same Brownian motion
W:
dX_t = a^X\,dt + b^X\,dW_t, \qquad dY_t = a^Y\,dt + b^Y\,dW_t.
We want d[X, Y]_t, which by the definition is the leading part of
the cross-product of increments dX_t\,dY_t. Evaluate that product
with the box algebra
(dW_t)^2 = dt,
dt\,dW_t = 0,
(dt)^2 = 0.
Step 1 — expand the product term by term:
dX_t\,dY_t = \big(a^X\,dt + b^X\,dW_t\big)\big(a^Y\,dt + b^Y\,dW_t\big),
= a^X a^Y\,(dt)^2 + a^X b^Y\,dt\,dW_t + b^X a^Y\,dW_t\,dt + b^X b^Y\,(dW_t)^2.
Step 2 — kill the small terms. Three of the four products are negligible by
the box algebra: (dt)^2 = 0, and both mixed terms vanish because
dt\,dW_t = 0. Only the last survives, with
(dW_t)^2 = dt:
dX_t\,dY_t = b^X b^Y\,(dW_t)^2 = b^X b^Y\,dt.
Step 3 — read off the covariation (its increment is this leading term):
\boxed{\,d[X, Y]_t = b^X b^Y\,dt, \qquad [X, Y]_t = \int_0^t b^X_s\,b^Y_s\,ds.\,}
Only the diffusion coefficients enter — the drifts a^X, a^Y
contribute nothing, because finite-variation (drift) parts are too smooth to accumulate
covariation. Setting X = Y = W (so
b^X = b^Y = 1) gives the headline special case
[W, W]_t = t.
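The boxed rule can be sanity-checked by simulation. The sketch below assumes constant coefficients and a uniform grid, neither of which the derivation requires; the coefficient values, the seed and the helper `ito_increments` are hypothetical choices made only for illustration. It compares the sum of dX dY with b^X b^Y t and repeats the sum with the drifts set to zero.

```python
import math
import random


def ito_increments(drift, diffusion, dw, dt):
    """Euler increments dX = drift*dt + diffusion*dW for constant coefficients."""
    return [drift * dt + diffusion * w for w in dw]


rng = random.Random(1)       # arbitrary fixed seed
t, n_steps = 1.0, 50_000     # illustrative horizon and grid size
dt = t / n_steps

# One shared Brownian driver W for both processes.
dw = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_steps)]

a_x, b_x = 0.5, 2.0          # hypothetical drift and diffusion of X
a_y, b_y = -1.0, 0.3         # hypothetical drift and diffusion of Y
dx = ito_increments(a_x, b_x, dw, dt)
dy = ito_increments(a_y, b_y, dw, dt)

estimate = sum(p * q for p, q in zip(dx, dy))    # sum of dX * dY
predicted = b_x * b_y * t                         # integral of b^X b^Y ds

# Same sum with both drifts removed: the result should barely move.
dx0 = ito_increments(0.0, b_x, dw, dt)
dy0 = ito_increments(0.0, b_y, dw, dt)
estimate_no_drift = sum(p * q for p, q in zip(dx0, dy0))

# Special case X = Y = W: the sum of squared increments should be near t.
quadratic_variation_w = sum(w * w for w in dw)

print(estimate, estimate_no_drift, predicted)
print(quadratic_variation_w, t)
```

The two estimates differ only through the (dt)^2 and dt dW terms, which is the numerical counterpart of Step 2.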
Polarisation: covariation from variations alone
There is a slick way to define [X, Y] using only single-process
quadratic variations, the polarisation identity. It is the same algebra as
pq = \tfrac12[(p+q)^2 - p^2 - q^2] for ordinary numbers, and it follows in two steps:
Step 1 — expand the quadratic variation of the sum, using bilinearity:
[X + Y,\, X + Y] = [X, X] + 2[X, Y] + [Y, Y].
Step 2 — solve for the cross term:
[X, Y] = \tfrac{1}{2}\Big([X + Y] - [X] - [Y]\Big).
So covariation is fully determined by the variations of X,
Y and their sum — no new machinery required.
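On a finite grid the polarisation identity is pure algebra, so it holds for any pair of increment sequences, not only in the limit. The sketch below checks this, assuming made-up Gaussian increments in which Y shares part of X's noise; the seed, step count and the helper `quadratic_variation` are illustrative names and values only.

```python
import random

rng = random.Random(2)       # arbitrary fixed seed
n_steps = 10_000             # illustrative grid size

# Made-up increments; Y shares half of X's noise so [X, Y] is clearly nonzero.
dx = [rng.gauss(0.0, 0.01) for _ in range(n_steps)]
dy = [0.5 * a + rng.gauss(0.0, 0.01) for a in dx]


def quadratic_variation(increments):
    """Discrete quadratic variation: sum of squared increments."""
    return sum(v * v for v in increments)


# Direct definition: sum of cross-products of increments.
direct = sum(a * b for a, b in zip(dx, dy))

# Polarisation: only single-process quadratic variations are used.
d_sum = [a + b for a, b in zip(dx, dy)]
polarised = 0.5 * (
    quadratic_variation(d_sum) - quadratic_variation(dx) - quadratic_variation(dy)
)

print(direct, polarised)     # agree up to floating-point rounding
```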
Independent Brownian motions: [W^1, W^2]_t = 0
Now let W^1, W^2 be independent Brownian motions. Their
covariation is the limit of \sum_i \Delta W^1_i\,\Delta W^2_i.
Step 1 — each summand has mean zero. Over the step
[t_i, t_{i+1}] the increments
\Delta W^1_i, \Delta W^2_i are independent and each mean-zero, so
\mathbb{E}\big[\Delta W^1_i\,\Delta W^2_i\big] = \mathbb{E}[\Delta W^1_i]\,\mathbb{E}[\Delta W^2_i] = 0\cdot 0 = 0.
Step 2 — the sum has mean zero and vanishing variance. The expected sum is
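0. Distinct summands are uncorrelated and each has variance
\mathbb{E}[(\Delta W^1_i)^2]\,\mathbb{E}[(\Delta W^2_i)^2] = (t_{i+1}-t_i)^2, so the variance of the sum is
\sum_i (t_{i+1}-t_i)^2 \le \|\Pi\|\,t \to 0 as the mesh shrinks. A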
mean-zero quantity whose variance goes to zero converges (in
L^2) to 0:
[W^1, W^2]_t = 0.
In box-algebra shorthand this is the rule dW^1_t\,dW^2_t = 0 for
independent drivers — the multidimensional twin of
(dW_t)^2 = dt.
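The shrinking variance in Step 2 is easy to watch numerically. The following sketch assumes a uniform grid, on which the variance \sum_i (t_{i+1}-t_i)^2 equals t^2/n, so the cross-sum should be of order t/\sqrt{n}. The function name `cross_sum`, the seed and the grid sizes are illustrative assumptions.

```python
import math
import random


def cross_sum(n_steps, t, rng):
    """Sum of products of increments of two independent Brownian motions on [0, t]."""
    sd = math.sqrt(t / n_steps)   # standard deviation of each increment
    return sum(rng.gauss(0.0, sd) * rng.gauss(0.0, sd) for _ in range(n_steps))


rng = random.Random(3)            # arbitrary fixed seed
t = 1.0                           # illustrative horizon

for n_steps in (100, 10_000, 200_000):
    value = cross_sum(n_steps, t, rng)
    # On a uniform grid the variance of the sum is t^2 / n_steps.
    std_dev = t / math.sqrt(n_steps)
    print(n_steps, value, std_dev)
```

Each printed value is a single random draw, so it will not decrease monotonically in general; what shrinks is its typical size, given by the last column.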
For Itô processes X, Y the quadratic covariation
[X, Y]_t = \lim \sum_i \Delta X_i\,\Delta Y_i satisfies:
- Symmetric and bilinear: [X, Y] = [Y, X] and [aX + bZ,\, Y] = a[X, Y] + b[Z, Y].
- Diffusion rule: if dX = a^X dt + b^X dW and dY = a^Y dt + b^Y dW (same W), then d[X, Y]_t = b^X b^Y\,dt.
- Self-covariation: [W, W]_t = t.
- Independent drivers: for independent Brownian motions W^1, W^2, [W^1, W^2]_t = 0.
The polarisation identity above is worth seeing once with every bracket expanded. Starting
from [X+Y] and using bilinearity and symmetry:
[X+Y] = [X+Y,\,X+Y] = [X,X] + [X,Y] + [Y,X] + [Y,Y] = [X] + 2[X,Y] + [Y],
and isolating the middle term gives
[X,Y] = \tfrac12([X+Y] - [X] - [Y]), exactly as claimed. This is
the stochastic echo of the variance identity
\operatorname{Cov}(P, Q) = \tfrac12(\operatorname{Var}(P+Q) - \operatorname{Var}(P) - \operatorname{Var}(Q)).
Between the two extremes — perfectly coupled (same W) and
independent ([W^1,W^2]=0) — sit correlated
Brownian motions with correlation \rho \in [-1, 1]. Their box
rule interpolates:
dW^1_t\,dW^2_t = \rho\,dt, \qquad\text{so}\qquad [W^1, W^2]_t = \rho\,t.
This is the seed of the multidimensional Itô lemma: when a function depends on
several correlated drivers, every pair contributes a cross-correction
f_{x_i x_j}\,d[X^i, X^j], and the matrix of pairwise correlations
\rho_{ij} is what couples a basket of assets together in
multi-asset pricing.
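The interpolating rule [W^1, W^2]_t = \rho\,t can be illustrated with a simulation. The sketch below assumes one common construction that the text does not spell out: W^2 is built as \rho W^1 + \sqrt{1-\rho^2}\,Z with Z an independent Brownian motion. The function name `correlated_increments`, the seed and the \rho values are illustrative.

```python
import math
import random


def correlated_increments(n_steps, t, rho, rng):
    """Increments of two Brownian motions with correlation rho (assumed construction)."""
    sd = math.sqrt(t / n_steps)
    dw1, dw2 = [], []
    for _ in range(n_steps):
        z1 = rng.gauss(0.0, sd)   # increment of W^1
        z2 = rng.gauss(0.0, sd)   # increment of an independent driver Z
        dw1.append(z1)
        dw2.append(rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return dw1, dw2


rng = random.Random(5)            # arbitrary fixed seed
t, n_steps = 1.0, 50_000          # illustrative horizon and grid size

for rho in (-0.8, 0.0, 0.5, 1.0):
    dw1, dw2 = correlated_increments(n_steps, t, rho, rng)
    estimate = sum(a * b for a, b in zip(dw1, dw2))
    print(rho, estimate, rho * t)  # estimate should sit close to rho * t
```

The cases \rho = 0 and \rho = 1 reproduce the two extremes: independent drivers and a single shared driver.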