The Itô Integral

We saw on the previous page that ordinary calculus cannot define \int_0^T H\, dW path by path: the Riemann sum depends on where you sample. The Itô integral resolves this by committing to the left endpoint and then building the integral in two careful stages — first for simple step-strategies, then by a limit for everything else.

Throughout, (W_t) is a Brownian motion with respect to a filtration (\mathcal{F}_t): W is adapted, and each future increment W_t - W_s (for s < t) is independent of \mathcal{F}_s. The integrand H is also adapted: H_t is \mathcal{F}_t-measurable, that is, known from the information available at time t, with no peeking into the future. This single word, "adapted", is what makes everything below work.

Stage 1 — simple adapted processes

Start with the strategies you could actually trade: hold a fixed (random but already-known) position over each of a finite set of time-slots, rebalancing only at the partition times 0 = t_0 < t_1 < \cdots < t_n = T. Formally, a simple adapted process is

H_s = \sum_{i=0}^{n-1} H_{t_i}\,\mathbf{1}_{(t_i,\, t_{i+1}]}(s),

where each coefficient H_{t_i} is \mathcal{F}_{t_i}-measurable — the position you set at the start of the slot (t_i, t_{i+1}], knowing only the past. For such a process there is one natural definition of the integral: in each slot, multiply the held position by the Brownian increment over that slot, and add up.

\int_0^T H_s\, dW_s \;:=\; \sum_{i=0}^{n-1} H_{t_i}\,\big(W_{t_{i+1}} - W_{t_i}\big).
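The definition is just a finite sum, so it can be sketched directly in code. The following is a minimal illustration, not a reference implementation: the function name ito_integral_simple and the sample numbers are hypothetical, chosen only so that the result can be checked by hand.

```python
def ito_integral_simple(positions, path):
    """Left-endpoint sum: sum_i H_{t_i} * (W_{t_{i+1}} - W_{t_i}).

    positions[i] is the position H_{t_i} held over the slot (t_i, t_{i+1}];
    path[i] is the Brownian value W_{t_i}, so path has one more entry.
    """
    if len(path) != len(positions) + 1:
        raise ValueError("need one path value per partition time")
    total = 0.0
    for i, h in enumerate(positions):
        increment = path[i + 1] - path[i]  # Delta W_i, revealed after h is set
        total += h * increment
    return total


# Hypothetical numbers, chosen only so the sum can be checked by hand.
positions = [2.0, -1.0, 0.5]   # H_{t_0}, H_{t_1}, H_{t_2}
path = [0.0, 0.3, -0.1, 0.4]   # W_{t_0}, ..., W_{t_3}
value = ito_integral_simple(positions, path)
print(value)  # 2.0*0.3 + (-1.0)*(-0.4) + 0.5*0.5, i.e. 1.25 up to rounding
```

Note that the partition times themselves never enter the sum: only the held positions and the increments of the path do.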

The crucial feature is that the coefficient is H_{t_i} — the left endpoint of the slot. The position is fixed before the increment \Delta W_i = W_{t_{i+1}} - W_{t_i} is revealed, so H_{t_i} is independent of its own increment. That independence is the engine of every property that follows.

The integral has mean zero — line by line

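The first dividend of the left-endpoint choice: the integral of any simple adapted process with square-integrable coefficients is mean-zero. (Square-integrability guarantees that each product H_{t_i}\,\Delta W_i has a finite expectation, so the steps below are legitimate.) We prove \mathbb{E}\big[\int_0^T H\, dW\big] = 0 with no skipped steps.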

Step 1 — expectation of the sum is the sum of expectations. Expectation is linear, so push it through the finite sum term by term:

\mathbb{E}\!\left[\int_0^T H\, dW\right] = \mathbb{E}\!\left[\sum_{i=0}^{n-1} H_{t_i}\,\Delta W_i\right] = \sum_{i=0}^{n-1} \mathbb{E}\big[\,H_{t_i}\,\Delta W_i\,\big].

Step 2 — condition each term on the past. Fix a term and use the tower property, conditioning on \mathcal{F}_{t_i} — everything known when the position was set:

\mathbb{E}\big[\,H_{t_i}\,\Delta W_i\,\big] = \mathbb{E}\Big[\,\mathbb{E}\big[\,H_{t_i}\,\Delta W_i \,\big|\, \mathcal{F}_{t_i}\,\big]\,\Big].

Step 3 — pull out the known position. The coefficient H_{t_i} is \mathcal{F}_{t_i}-measurable, so it is a constant inside the inner conditional expectation and factors out:

\mathbb{E}\big[\,H_{t_i}\,\Delta W_i \,\big|\, \mathcal{F}_{t_i}\,\big] = H_{t_i}\,\mathbb{E}\big[\,\Delta W_i \,\big|\, \mathcal{F}_{t_i}\,\big].

Step 4 — the future increment has conditional mean zero. The increment \Delta W_i = W_{t_{i+1}} - W_{t_i} is independent of \mathcal{F}_{t_i} and distributed N(0,\, t_{i+1} - t_i), so conditioning changes nothing and its mean is 0:

\mathbb{E}\big[\,\Delta W_i \,\big|\, \mathcal{F}_{t_i}\,\big] = \mathbb{E}[\Delta W_i] = 0.

Step 5 — every term vanishes, so the sum does. Combining Steps 3 and 4, each conditional expectation is H_{t_i}\cdot 0 = 0, hence each term is 0, hence

\mathbb{E}\!\left[\int_0^T H\, dW\right] = \sum_{i=0}^{n-1} \mathbb{E}\big[\,H_{t_i}\cdot 0\,\big] = 0.

Notice where adaptedness did the work: in Step 3 we factored out H_{t_i} only because it was already known at time t_i, and in Step 4 the increment was independent of everything known at that time. Had we sampled at the right endpoint, H_{t_{i+1}} would in general be correlated with \Delta W_i, the term need not vanish, and the integral could carry a drift.
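A quick Monte Carlo experiment can make the contrast between the two endpoints visible. This is a rough sketch under stated assumptions: the position H = W, the uniform grid, the seed, and the sample sizes are all illustrative choices, not part of the argument above. With this H, each right-endpoint term contains (\Delta W_i)^2, whose mean is the slot length, so the right-endpoint average should land near T rather than near 0.

```python
import random


def endpoint_sums(n_steps, horizon, rng):
    """Simulate one Brownian path on a uniform grid and return the
    left- and right-endpoint sums of H = W against the increments of W."""
    dt = horizon / n_steps
    w = 0.0
    left = right = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, dt ** 0.5)  # increment ~ N(0, dt)
        left += w * dw                  # position W_{t_i}, fixed before dw
        right += (w + dw) * dw          # position W_{t_{i+1}}, already contains dw
        w += dw
    return left, right


rng = random.Random(0)  # fixed seed, an arbitrary choice for reproducibility
n_paths, n_steps, horizon = 10000, 50, 1.0
left_total = right_total = 0.0
for _ in range(n_paths):
    left, right = endpoint_sums(n_steps, horizon, rng)
    left_total += left
    right_total += right
left_mean = left_total / n_paths
right_mean = right_total / n_paths
print(f"left-endpoint mean:  {left_mean:+.3f}")
print(f"right-endpoint mean: {right_mean:+.3f}")
```

The sample averages fluctuate from seed to seed, so they should be read as estimates of the two expectations rather than exact values.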

Let H_s = \sum_{i=0}^{n-1} H_{t_i}\mathbf{1}_{(t_i, t_{i+1}]}(s) be a simple adapted process, each H_{t_i} being \mathcal{F}_{t_i}-measurable and square-integrable. Its Itô integral is defined by the left-endpoint sum \int_0^T H_s\, dW_s = \sum_{i=0}^{n-1} H_{t_i}\,\big(W_{t_{i+1}} - W_{t_i}\big), and it satisfies:

Zero mean: \mathbb{E}\big[\int_0^T H\, dW\big] = 0.
Linearity: \int (aH + bK)\, dW = a\int H\, dW + b\int K\, dW.
Martingale (foreshadowed): the running integral M_t = \int_0^t H\, dW is itself a martingale in t — a fair bet against a fair game stays fair.
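One consequence of the martingale property can be probed numerically: the future gain M_T - M_s should be uncorrelated with the value M_s already accumulated. The sketch below checks only this necessary consequence, not the full property, and its setup is assumed for illustration: the position H = W, the checkpoint s = T/2, the grid, and the seed are hypothetical choices.

```python
import random

rng = random.Random(1)                   # arbitrary fixed seed
n_paths, n_steps, dt = 10000, 40, 0.025  # horizon T = 1
half = n_steps // 2                      # checkpoint s = T/2

product_total = 0.0
for _ in range(n_paths):
    w = 0.0        # Brownian path
    m = 0.0        # running integral M_t
    m_half = 0.0   # M_s, recorded at the checkpoint
    for i in range(n_steps):
        if i == half:
            m_half = m                   # known at time s
        dw = rng.gauss(0.0, dt ** 0.5)
        m += w * dw                      # left-endpoint position H_{t_i} = W_{t_i}
        w += dw
    product_total += m_half * (m - m_half)  # past value times future gain

average_product = product_total / n_paths
print(f"average of M_s * (M_T - M_s): {average_product:+.4f}")
```

An average close to zero is consistent with "a fair bet stays fair": what has been won or lost so far carries no information about the expected gain still to come.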

The left-endpoint choice is not arbitrary tidiness; it is the whole point. A trading strategy H must be non-anticipating: the position you hold over (t_i, t_{i+1}] can depend on everything up to t_i, but not on the move \Delta W_i the market is about to make. You bet, then the dice roll.

Mathematically, that means the coefficient is independent of its increment, which is exactly what made \mathbb{E}[H_{t_i}\Delta W_i] = 0 in Steps 3 and 4 and, slot by slot, makes the running integral a martingale. Had we used the right endpoint H_{t_{i+1}}, the coefficient could be correlated with the very increment it multiplies; the product would then have nonzero mean (positive when the position moves with the path, negative when it moves against it), and the integral would secretly accumulate a drift: the "free lunch" that risk-neutral pricing exists to forbid. So the left endpoint is the unique sampling choice for which "fair bet on a fair game" remains fair. (The midpoint rule, Stratonovich, is symmetric and obeys the ordinary chain rule, but it peeks, so the resulting integral is in general not a martingale and is not used to model a self-financing strategy.)
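As a concrete check of the endpoint comparison, take the position H = W (a choice made here purely for illustration). Writing W_{t_{i+1}} = W_{t_i} + \Delta W_i and using \mathbb{E}[(\Delta W_i)^2] = t_{i+1} - t_i, the left- and right-endpoint terms have means

\mathbb{E}\big[\,W_{t_i}\,\Delta W_i\,\big] = 0, \qquad \mathbb{E}\big[\,W_{t_{i+1}}\,\Delta W_i\,\big] = \mathbb{E}\big[\,W_{t_i}\,\Delta W_i\,\big] + \mathbb{E}\big[(\Delta W_i)^2\big] = t_{i+1} - t_i.

Summing over the slots, the right-endpoint sum has mean

\sum_{i=0}^{n-1} \big(t_{i+1} - t_i\big) = T,

a bias that does not shrink as the partition is refined. Replacing H by -W flips it to -T, so the bias can point in either direction.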

Stage 2 — general adapted integrands, by approximation

Real integrands are not step functions. The extension is the standard analyst's move: approximate a general adapted process by simple ones and pass to a limit. Let H be adapted and square-integrable,

H \in L^2 \quad\Longleftrightarrow\quad \mathbb{E}\!\left[\int_0^T H_s^2\, ds\right] < \infty.

One can choose simple adapted processes H^{(m)} that approximate H in this L^2(dt \times d\mathbb{P}) sense,

\mathbb{E}\!\left[\int_0^T \big(H^{(m)}_s - H_s\big)^2\, ds\right] \longrightarrow 0,

and define the Itô integral of H as the limit of the simple integrals,

\int_0^T H\, dW \;:=\; \lim_{m\to\infty} \int_0^T H^{(m)}\, dW \qquad (\text{limit in } L^2(\Omega)).

For this to be a sound definition the simple integrals must form a Cauchy sequence in L^2(\Omega), and the limit must not depend on which approximating sequence we picked. Both are guaranteed by a single remarkable identity — the Itô isometry, the subject of the next page — which converts the size of the integral into the size of the integrand, so an approximation that is close in the integrand sense produces integrals that are close in L^2(\Omega).
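The approximation scheme can be watched at work on one integrand whose integral is known in closed form. The sketch below assumes H = W, approximated by the simple process that holds W_{t_i} over each slot of a uniform grid, and compares the simple integral with the standard closed form \int_0^T W\, dW = (W_T^2 - T)/2, a result quoted here rather than derived on this page. The seed, grid sizes, and sample count are arbitrary illustrative choices.

```python
import random


def squared_error(n_steps, horizon, rng):
    """One sample of (simple integral - closed form)^2 for H = W.

    The simple process holds W_{t_i} over (t_i, t_{i+1}] on a uniform grid.
    """
    dt = horizon / n_steps
    w = 0.0
    simple_integral = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, dt ** 0.5)
        simple_integral += w * dw          # left-endpoint sum
        w += dw
    closed_form = 0.5 * (w * w - horizon)  # quoted value of int_0^T W dW
    return (simple_integral - closed_form) ** 2


rng = random.Random(2)  # arbitrary fixed seed
n_paths, horizon = 2000, 1.0
mse = {}
for n_steps in (4, 16, 64, 256):
    total = sum(squared_error(n_steps, horizon, rng) for _ in range(n_paths))
    mse[n_steps] = total / n_paths
    print(f"n = {n_steps:3d}   mean squared error = {mse[n_steps]:.4f}")
```

The mean squared error is a sample estimate of the squared L^2(\Omega) distance between the simple integral and its limit; for this integrand it should shrink roughly in proportion to 1/n as the grid is refined.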

Watch the integral accumulate

In the figure for this section, a Brownian path drives a simple step strategy H (a piecewise constant position, set fresh at each rebalancing time). The bars show the held position H_{t_i} in each slot; the marked points trace the running integral \sum_{j \le i} H_{t_j}\,\Delta W_j, that is, the left-endpoint position times the increment, summed up.
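The same accumulation can be reproduced as a small table. This is a sketch only: the rebalancing times, the seed, and the trading rule (long one unit when the path is at or below zero, short one unit otherwise) are hypothetical choices. The rule reads only W_{t_i}, so the resulting position is adapted.

```python
import random
from itertools import accumulate

rng = random.Random(3)                 # arbitrary fixed seed
times = [0.0, 0.25, 0.5, 0.75, 1.0]    # hypothetical rebalancing times

positions, increments = [], []
w = 0.0
for t0, t1 in zip(times, times[1:]):
    # Hypothetical rule: the position depends only on W_{t_i}.
    positions.append(1.0 if w <= 0.0 else -1.0)
    dw = rng.gauss(0.0, (t1 - t0) ** 0.5)  # revealed after the position is set
    increments.append(dw)
    w += dw

# Running integral: partial sums of position times increment.
running = list(accumulate(h * dw for h, dw in zip(positions, increments)))

print("slot          position  increment  running integral")
for t0, t1, h, dw, m in zip(times, times[1:], positions, increments, running):
    print(f"({t0:.2f}, {t1:.2f}]  {h:+8.1f}  {dw:+9.4f}  {m:+16.4f}")
```

Changing the seed draws a different path, and with it a different sequence of positions and running totals.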