We saw on the previous page that ordinary calculus cannot define \int_0^T H\, dW path by path:
the Riemann sum depends on where you sample. The Itô integral resolves this by committing to the
left endpoint and then building the integral in two careful stages: first for simple
step-strategies, then by a limit for everything else.
Throughout, (W_t) is a Brownian motion with respect to a filtration (\mathcal{F}_t): W is
adapted, and each increment W_t - W_s, s < t, is independent of \mathcal{F}_s. The integrand
H is adapted: H_t is \mathcal{F}_t-measurable, that is, known from the information available
at time t, with no peeking into the future. This single word, "adapted", is what makes
everything below work.
Stage 1 — simple adapted processes
Start with the strategies you could actually trade: hold a fixed (random but already-known)
position over each of a finite set of time-slots, rebalancing only at the partition times
0 = t_0 < t_1 < \cdots < t_n = T. Formally, a
simple adapted process is
H_s = \sum_{i=0}^{n-1} H_{t_i}\,\mathbf{1}_{(t_i,\, t_{i+1}]}(s),
where each coefficient H_{t_i} is
\mathcal{F}_{t_i}-measurable — the position you set at the
start of the slot (t_i, t_{i+1}], knowing only the past.
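The indicator notation above can be read as a lookup rule: find the slot that contains s and return the position held over it. Here is a minimal sketch of that rule, assuming the partition and the realised positions are given as plain lists. The function name simple_process_value and its arguments are hypothetical, introduced only for illustration.

```python
from bisect import bisect_left


def simple_process_value(s, times, positions):
    """Value at time s of H_s = sum_i positions[i] * 1_{(times[i], times[i+1]]}(s).

    times     -- partition 0 = t_0 < t_1 < ... < t_n = T (length n + 1)
    positions -- realised coefficients H_{t_0}, ..., H_{t_{n-1}} (length n)
    """
    if len(times) != len(positions) + 1:
        raise ValueError("need exactly one position per slot")
    # bisect_left returns the first index j with times[j] >= s, so s lies in
    # (times[j-1], times[j]] and the active slot is i = j - 1.
    i = bisect_left(times, s) - 1
    if 0 <= i < len(positions):
        return positions[i]
    # s <= t_0 or s > t_n: no indicator is switched on.
    return 0.0
```

Because the slots are open on the left and closed on the right, a rebalancing time t_{i+1} still belongs to the slot that started at t_i; the new position only takes effect immediately afterwards.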
For such a process there is one natural definition of the integral: in each slot, multiply the
held position by the Brownian increment over that slot, and add up.
\int_0^T H_s\, dW_s \;:=\; \sum_{i=0}^{n-1} H_{t_i}\,\big(W_{t_{i+1}} - W_{t_i}\big).
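For a single realised path the definition is just a finite sum, which the following sketch computes. It assumes the positions and the Brownian path sampled at the partition times are supplied as lists; the name ito_integral_simple is hypothetical.

```python
def ito_integral_simple(positions, path):
    """Left-endpoint sum: sum_i positions[i] * (path[i+1] - path[i]).

    positions -- H_{t_0}, ..., H_{t_{n-1}}, one per slot
    path      -- W_{t_0}, ..., W_{t_n}, the path at the partition times
    """
    if len(path) != len(positions) + 1:
        raise ValueError("path must have one more point than positions")
    # Each position multiplies the increment over the slot it was set for.
    return sum(h * (w_next - w)
               for h, w, w_next in zip(positions, path, path[1:]))


# Three slots: 2 * 0.5 + (-1) * (-0.25) + 3 * 0.75 = 3.5
example = ito_integral_simple([2.0, -1.0, 3.0], [0.0, 0.5, 0.25, 1.0])
```

Note that positions[i] is paired with the increment that starts at index i, which is the left-endpoint rule expressed in code.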
The crucial feature is that the coefficient is H_{t_i} — the
left endpoint of the slot. The position is fixed before the increment
\Delta W_i = W_{t_{i+1}} - W_{t_i} is revealed, so
H_{t_i} is independent of its own increment. That independence is
the engine of every property that follows.
The integral has mean zero — line by line
The first dividend of the left-endpoint choice: the integral of any simple adapted process is
mean-zero. We prove
\mathbb{E}\big[\int_0^T H\, dW\big] = 0 with no skipped steps.
Step 1 — expectation of the sum is the sum of expectations. Expectation is
linear, so push it through the finite sum term by term:
\mathbb{E}\!\left[\int_0^T H\, dW\right] = \mathbb{E}\!\left[\sum_{i=0}^{n-1} H_{t_i}\,\Delta W_i\right] = \sum_{i=0}^{n-1} \mathbb{E}\big[\,H_{t_i}\,\Delta W_i\,\big].
Step 2 — condition each term on the past. Fix a term and use the tower
property, conditioning on \mathcal{F}_{t_i} — everything known when
the position was set:
\mathbb{E}\big[\,H_{t_i}\,\Delta W_i\,\big] = \mathbb{E}\Big[\,\mathbb{E}\big[\,H_{t_i}\,\Delta W_i \,\big|\, \mathcal{F}_{t_i}\,\big]\,\Big].
Step 3 — pull out the known position. The coefficient
H_{t_i} is \mathcal{F}_{t_i}-measurable,
so it is a constant inside the inner conditional expectation and factors out:
\mathbb{E}\big[\,H_{t_i}\,\Delta W_i \,\big|\, \mathcal{F}_{t_i}\,\big] = H_{t_i}\,\mathbb{E}\big[\,\Delta W_i \,\big|\, \mathcal{F}_{t_i}\,\big].
Step 4 — the future increment has conditional mean zero. The increment
\Delta W_i = W_{t_{i+1}} - W_{t_i} is independent of
\mathcal{F}_{t_i} and distributed
N(0,\, t_{i+1} - t_i), so conditioning changes nothing and its mean
is 0:
\mathbb{E}\big[\,\Delta W_i \,\big|\, \mathcal{F}_{t_i}\,\big] = \mathbb{E}[\Delta W_i] = 0.
Step 5 — every term vanishes, so the sum does. Combining Steps 3 and 4, each
inner conditional expectation is H_{t_i}\cdot 0 = 0; by Step 2 each term is then
\mathbb{E}[0] = 0, and by Step 1
\mathbb{E}\!\left[\int_0^T H\, dW\right] = \sum_{i=0}^{n-1} \mathbb{E}\big[\,H_{t_i}\,\Delta W_i\,\big] = \sum_{i=0}^{n-1} 0 = 0.
Notice where adaptedness did the work: in Step 3 we factored out
H_{t_i} only because it was already known, and in Step 4 the
increment was independent of it. Had we sampled at the right endpoint,
H_{t_{i+1}} would be correlated with
\Delta W_i and the term would not vanish — the integral would carry
a drift.
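The contrast between the two endpoints can be checked numerically. The sketch below is a Monte Carlo illustration, not a proof. It assumes the illustrative integrand H = W (a choice not made in the text), equal slots on [0, T], and a fixed seed; the function name is hypothetical. For this integrand the left-endpoint sums should average close to 0 and the right-endpoint sums close to T.

```python
import math
import random


def simulate_endpoint_means(n_paths=20000, n_steps=10, horizon=1.0, seed=0):
    """Average the left- and right-endpoint sums of H = W over many paths."""
    rng = random.Random(seed)
    sd = math.sqrt(horizon / n_steps)  # standard deviation of one increment
    left_total = 0.0
    right_total = 0.0
    for _path in range(n_paths):
        w = 0.0
        left_sum = 0.0
        right_sum = 0.0
        for _step in range(n_steps):
            dw = rng.gauss(0.0, sd)
            left_sum += w * dw    # coefficient W_{t_i}: set before dw is drawn
            w += dw
            right_sum += w * dw   # coefficient W_{t_{i+1}}: already contains dw
        left_total += left_sum
        right_total += right_sum
    return left_total / n_paths, right_total / n_paths


left_mean, right_mean = simulate_endpoint_means()
```

The only difference between the two accumulators is whether the coefficient is read before or after the increment is added to the path, which is exactly the adaptedness condition used in Steps 3 and 4.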
Let H_s = \sum_{i=0}^{n-1} H_{t_i}\mathbf{1}_{(t_i, t_{i+1}]}(s) be
a simple adapted process, each H_{t_i} being
\mathcal{F}_{t_i}-measurable and square-integrable. Its Itô
integral is defined by the left-endpoint sum
\int_0^T H_s\, dW_s = \sum_{i=0}^{n-1} H_{t_i}\,\big(W_{t_{i+1}} - W_{t_i}\big),
and it satisfies:
- Zero mean: \mathbb{E}\big[\int_0^T H\, dW\big] = 0.
- Linearity: \int (aH + bK)\, dW = a\int H\, dW + b\int K\, dW for constants a, b and simple
  adapted processes H, K.
- Martingale (foreshadowed): the running integral M_t = \int_0^t H\, dW is itself a martingale
  in t: a fair bet against a fair game stays fair.
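One consequence of the martingale property is that the next increment of M is uncorrelated with the current value of M. The sketch below estimates both \mathbb{E}[M_{t_k}] and \mathbb{E}[M_{t_k}(M_{t_{k+1}} - M_{t_k})] by simulation. It assumes the illustrative integrand H = W and equal slots; both function names are hypothetical. Seeing both estimates near 0 is consistent with the martingale property but does not prove it.

```python
import math
import random


def running_integral(positions, path):
    """M_{t_k} = sum_{i<k} positions[i] * (path[i+1] - path[i]), k = 0..n."""
    values = [0.0]
    for h, w, w_next in zip(positions, path, path[1:]):
        values.append(values[-1] + h * (w_next - w))
    return values


def martingale_check(k=4, n_paths=20000, n_steps=8, horizon=1.0, seed=1):
    """Estimate E[M_{t_k}] and E[M_{t_k} (M_{t_{k+1}} - M_{t_k})] for H = W."""
    rng = random.Random(seed)
    sd = math.sqrt(horizon / n_steps)
    mean_m = 0.0
    mean_cross = 0.0
    for _path in range(n_paths):
        path = [0.0]
        for _step in range(n_steps):
            path.append(path[-1] + rng.gauss(0.0, sd))
        # Positions are W at the left endpoints: path[0], ..., path[n-1].
        m = running_integral(path[:-1], path)
        mean_m += m[k]
        mean_cross += m[k] * (m[k + 1] - m[k])
    return mean_m / n_paths, mean_cross / n_paths


mean_m, mean_cross = martingale_check()
```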
The left-endpoint choice is not arbitrary tidiness; it is the whole point. A trading strategy
H must be non-anticipating: the position you hold
over (t_i, t_{i+1}] can depend on everything up to
t_i, but not on the move \Delta W_i
the market is about to make. You bet, then the dice roll.
Mathematically, that means the coefficient is independent of its increment, which is exactly
what made \mathbb{E}[H_{t_i}\Delta W_i] = 0 in Steps 3 and 4 and, slot by slot, makes the
running integral a martingale. Had we used the right endpoint H_{t_{i+1}}, the coefficient
would be correlated with the very increment it multiplies; the product would in general have
nonzero mean (for H = W it is \mathbb{E}[W_{t_{i+1}}\Delta W_i] = t_{i+1} - t_i > 0), and the
integral would accumulate a systematic drift: the "free lunch" that risk-neutral pricing exists
to forbid. So the left endpoint is the sampling point for which "fair bet on a fair game"
remains fair for every adapted strategy. (The midpoint rule, Stratonovich, is symmetric and
obeys the ordinary chain rule, but it peeks, so its integral is not in general a martingale and
it is not used to model a self-financing strategy.)
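A worked example makes the comparison concrete. Take the illustrative integrand H = W (an assumption for this example only) on any partition of [0, T], and write the symmetric rule as the average of the two endpoints, one common way to discretise the Stratonovich integral. Using W_{t_{i+1}}^2 - W_{t_i}^2 = 2 W_{t_i}\Delta W_i + (\Delta W_i)^2, the sums telescope:

```latex
\begin{aligned}
\text{left:}\quad
  &\sum_{i=0}^{n-1} W_{t_i}\,\Delta W_i
   = \tfrac12\Big(W_T^2 - \sum_{i=0}^{n-1} (\Delta W_i)^2\Big),
  &\mathbb{E}[\,\cdot\,] &= \tfrac12\,(T - T) = 0, \\
\text{right:}\quad
  &\sum_{i=0}^{n-1} W_{t_{i+1}}\,\Delta W_i
   = \tfrac12\Big(W_T^2 + \sum_{i=0}^{n-1} (\Delta W_i)^2\Big),
  &\mathbb{E}[\,\cdot\,] &= \tfrac12\,(T + T) = T, \\
\text{average:}\quad
  &\sum_{i=0}^{n-1} \tfrac12\big(W_{t_i} + W_{t_{i+1}}\big)\,\Delta W_i
   = \tfrac12\,W_T^2,
  &\mathbb{E}[\,\cdot\,] &= \tfrac12\,T.
\end{aligned}
```

Here \mathbb{E}[W_T^2] = T and \mathbb{E}[(\Delta W_i)^2] = t_{i+1} - t_i, so the squared increments sum to T in expectation. Only the left-endpoint sum has mean zero; the other two carry a drift of T and T/2 respectively.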
Stage 2 — general adapted integrands, by approximation
Real integrands are not step functions. The extension is the standard analyst's move: approximate
a general adapted process by simple ones and pass to a limit. Let
H be adapted and square-integrable,
H \in L^2 \quad\Longleftrightarrow\quad \mathbb{E}\!\left[\int_0^T H_s^2\, ds\right] < \infty.
One can choose simple adapted processes H^{(m)} that approximate
H in this L^2(dt \times d\mathbb{P}) sense,
\mathbb{E}\!\left[\int_0^T \big(H^{(m)}_s - H_s\big)^2\, ds\right] \longrightarrow 0,
and define the Itô integral of H as the limit of the simple
integrals,
\int_0^T H\, dW \;:=\; \lim_{m\to\infty} \int_0^T H^{(m)}\, dW \qquad (\text{limit in } L^2(\Omega)).
For this to be a sound definition the simple integrals must form a
Cauchy sequence in L^2(\Omega), and the limit must not depend
on which approximating sequence we picked. Both are guaranteed by a single remarkable identity —
the Itô isometry,
the subject of the next page — which converts the size of the integral into the size of the
integrand, so an approximation that is close in the integrand sense produces integrals that are
close in L^2(\Omega).
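The approximation scheme can be watched at work on one example. The sketch below assumes the illustrative integrand H = W, approximated by the simple process that freezes W at the left endpoint of each of n equal slots, and it uses the standard closed form \int_0^T W\, dW = (W_T^2 - T)/2, a known result that this page does not derive. The function name is hypothetical. For this example the mean-square error of the simple integral works out to T^2/(2n), so refining the partition should shrink the estimate.

```python
import math
import random


def mean_square_error(n_steps, n_paths=4000, horizon=1.0, seed=2):
    """Monte Carlo estimate of E[(I_n - I)^2] for H = W.

    I_n -- left-endpoint sum over n_steps equal slots
    I   -- closed form (W_T^2 - T) / 2, assumed as the limiting integral
    """
    rng = random.Random(seed)
    sd = math.sqrt(horizon / n_steps)
    total = 0.0
    for _path in range(n_paths):
        w = 0.0
        simple_integral = 0.0
        for _step in range(n_steps):
            dw = rng.gauss(0.0, sd)
            simple_integral += w * dw  # position W_{t_i} times its increment
            w += dw
        limit = 0.5 * (w * w - horizon)
        total += (simple_integral - limit) ** 2
    return total / n_paths


coarse = mean_square_error(4)    # theory: T^2 / (2 * 4)  = 0.125
fine = mean_square_error(64)     # theory: T^2 / (2 * 64) = 0.0078125
```

The error is measured in L^2(\Omega), the same sense in which the limit defining the integral is taken.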