An ordinary differential equation describes how a quantity changes in terms of its current value.
A stochastic differential equation (SDE) does the same, but adds a noise term: a push from
Brownian motion.
The canonical form is
dX_t = a(t, X_t)\,dt + b(t, X_t)\,dW_t, \qquad X_0 \text{ given.}
Two terms, two roles. The drift a(t, X_t) is the
deterministic pull — where the process tends to go. The diffusion
b(t, X_t) is the size of the random kick supplied by
dW_t. The crucial feature is that both coefficients depend on the
current state X_t: the process steers its own drift
and scales its own noise as it moves. That feedback is what makes SDEs expressive enough to
model interest rates, volatilities, and prices.
What "solving" means: the integral form
The differential dX_t is shorthand. Brownian paths are nowhere
differentiable, so dX_t = a\,dt + b\,dW_t cannot be read as a statement
about a derivative dX_t/dt. Its rigorous content is the integral form: a process
(X_t) is a solution when it is an Itô process satisfying
X_t = X_0 + \int_0^t a(s, X_s)\,ds + \int_0^t b(s, X_s)\,dW_s
for every t \ge 0. The first integral is an ordinary (Lebesgue)
integral in time; the second is an Itô integral against Brownian motion. The
SDE is just this identity written in differential shorthand; the integral form is the
real object.
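To make the two integrals concrete, here is a minimal sketch that approximates each by a discrete sum on a time grid. The integrand W_s, the function name `left_point_sums`, and the grid size are illustrative assumptions rather than anything fixed by the text; W_s is chosen only because its Itô integral has the known closed form (W_T^2 - T)/2. The Itô sum evaluates the integrand at the left endpoint of each step, so it never looks ahead at the upcoming increment.

```python
import math
import random

def left_point_sums(T=1.0, n=10_000, seed=0):
    """Approximate int_0^T W_s ds and int_0^T W_s dW_s on an n-step grid.

    Illustrative only: the integrand W_s is an assumption, chosen because
    the Ito integral then has the closed form (W_T^2 - T) / 2.
    """
    rng = random.Random(seed)
    dt = T / n
    w = 0.0           # W_0 = 0
    time_sum = 0.0    # ordinary sum against dt
    ito_sum = 0.0     # Ito sum against dW, integrand taken at the left endpoint
    quad_var = 0.0    # sum of squared increments, close to T for small dt
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(dt))   # Delta W ~ N(0, dt)
        time_sum += w * dt
        ito_sum += w * dw
        quad_var += dw * dw
        w += dw
    return w, time_sum, ito_sum, quad_var

w_T, time_sum, ito_sum, quad_var = left_point_sums()
# Left-point sums satisfy ito_sum == (W_T^2 - quad_var) / 2 exactly (up to
# rounding); since quad_var is close to T, this approaches (W_T^2 - T) / 2.
print(ito_sum, 0.5 * (w_T**2 - quad_var), quad_var)
```

The extra -T/2 relative to the ordinary calculus answer W_T^2/2 comes from the squared increments summing to roughly T instead of vanishing.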
Suppose that, for constants K, L, all t \ge 0 and all x, y, the coefficients
a(t, x) and b(t, x) and the initial value X_0 satisfy:
- Lipschitz in x: |a(t,x) - a(t,y)| + |b(t,x) - b(t,y)| \le L\,|x - y|, so the
coefficients do not change too fast as the state moves;
- Linear growth: |a(t,x)| + |b(t,x)| \le K\,(1 + |x|), so they grow at most linearly
and the solution cannot blow up to infinity in finite time;
- Initial condition: X_0 has finite second moment and is independent of the driving
Brownian motion.
Then the SDE has a unique strong solution
(X_t) — a process adapted to the Brownian filtration, with
continuous paths, satisfying the integral form, and pathwise unique.
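As a quick check of how the hypotheses are verified in practice, take the linear coefficients a(t, x) = \mu x and b(t, x) = \sigma x used in Example 2 below. This is a worked illustration, not part of the theorem:

```latex
|a(t,x) - a(t,y)| + |b(t,x) - b(t,y)| = (|\mu| + |\sigma|)\,|x - y|,
\qquad
|a(t,x)| + |b(t,x)| = (|\mu| + |\sigma|)\,|x| \le (|\mu| + |\sigma|)\,(1 + |x|).
```

Both conditions hold with L = K = |\mu| + |\sigma|. By contrast, a drift such as a(t, x) = x^2 fails the linear-growth bound, and a diffusion such as b(t, x) = \sqrt{|x|} is not Lipschitz at x = 0, so neither is covered by this theorem as stated.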
There are two notions of solving an SDE, and the distinction matters. A
strong solution is built on a given Brownian motion
W: the process X is a function of
that specific path of the noise, adapted to the filtration
W generates. Pathwise uniqueness means the same noise always
produces the same trajectory.
A weak solution asks only for a process with the right
law: you may choose the probability space and the Brownian motion to suit, so two
weak solutions can be compared only in distribution, never path by path. Every strong solution
is a weak solution, but not conversely. The Lipschitz and linear-growth hypotheses above are
exactly what buy you the stronger, pathwise statement; coefficients that are merely continuous
can still admit a weak solution while losing pathwise uniqueness.
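The pathwise viewpoint can be illustrated on a time grid: the trajectory is computed as a deterministic function of the noise increments it is handed. The sketch below is only a discretised illustration under assumed choices (the SDE dX = \mu dt + \sigma dW, and the hypothetical helpers `brownian_increments` and `path_from_noise`); it is not a proof of pathwise uniqueness.

```python
import math
import random

def brownian_increments(n, dt, seed):
    """Draw n independent N(0, dt) increments of a Brownian path."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n)]

def path_from_noise(x0, mu, sigma, dt, increments):
    """Build X on the grid as a deterministic function of the given increments.

    Uses dX = mu dt + sigma dW purely as an illustrative SDE (an assumption).
    """
    xs = [x0]
    for dw in increments:
        xs.append(xs[-1] + mu * dt + sigma * dw)
    return xs

dt, n = 0.01, 100
noise_a = brownian_increments(n, dt, seed=1)
noise_b = brownian_increments(n, dt, seed=2)

x1 = path_from_noise(0.0, 0.5, 1.0, dt, noise_a)
x2 = path_from_noise(0.0, 0.5, 1.0, dt, noise_a)   # the same noise again
x3 = path_from_noise(0.0, 0.5, 1.0, dt, noise_b)   # a different Brownian path

print(x1 == x2)   # same noise, same trajectory
print(x1 == x3)   # different noise: a different trajectory with the same law
```

In the weak setting only the comparison between `x1` and `x3` is available: the two paths share a distribution, but there is no common noise to match them path by path.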
The solution method: transform until integrable
Few SDEs can be solved in closed form, but those that can usually yield to a single trick:
guess a transform Y = g(X), apply Itô's lemma to it, and choose g so the resulting SDE for
Y has constant coefficients, giving a process you can integrate directly. The template is
always the same; only the choice of g changes. Let us see it twice.
Example 1: additive noise (the trivial case)
Step 1 — write the SDE. The simplest non-trivial SDE has constant drift and
diffusion:
dX_t = \mu\,dt + \sigma\,dW_t.
Step 2 — no transform is needed — the coefficients are already constant. Move
to the integral form directly:
X_t = X_0 + \int_0^t \mu\,ds + \int_0^t \sigma\,dW_s.
Step 3 — integrate. The first integral is
\mu t; the second is \sigma times the
increment W_t - W_0 = W_t (the Itô integral of a constant is just the
constant times the Brownian increment):
X_t = X_0 + \mu t + \sigma W_t.
This is arithmetic Brownian motion — a straight-line drift with a Brownian
wobble. It is Gaussian, and (being a sum of a constant and a normal) it can go negative, which
is exactly why it is the wrong model for a price.
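A short numerical check of the closed form, as a sketch: it samples X_t = X_0 + \mu t + \sigma W_t directly using W_t \sim N(0, t), then compares the sample mean and variance with X_0 + \mu t and \sigma^2 t. The function name `sample_abm` and the parameter values are assumptions chosen for illustration.

```python
import math
import random

def sample_abm(x0, mu, sigma, t, n_samples, seed=0):
    """Sample X_t = x0 + mu*t + sigma*W_t exactly, using W_t ~ N(0, t)."""
    rng = random.Random(seed)
    sd = math.sqrt(t)   # standard deviation of W_t
    return [x0 + mu * t + sigma * rng.gauss(0.0, sd) for _ in range(n_samples)]

# Illustrative parameters (assumptions, not taken from the text).
x0, mu, sigma, t = 1.0, 0.1, 1.0, 4.0
samples = sample_abm(x0, mu, sigma, t, n_samples=50_000)

mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
frac_negative = sum(x < 0 for x in samples) / len(samples)

print(mean, x0 + mu * t)      # sample mean vs. x0 + mu*t
print(var, sigma**2 * t)      # sample variance vs. sigma^2 * t
print(frac_negative)          # share of samples below zero
```

With these parameters a sizeable share of the samples lands below zero even though the process starts at a positive value, which is the defect noted above.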
Example 2: multiplicative noise (the log-transform)
Step 1 — write the SDE. Now let the coefficients scale with the state:
dX_t = \mu X_t\,dt + \sigma X_t\,dW_t.
The coefficients \mu X and \sigma X are
not constant, so we cannot integrate directly. We need a transform.
Step 2 — guess the transform. Because the noise is multiplicative, the natural
guess is the logarithm, Y = g(X) = \ln X, with
g'(x) = 1/x and g''(x) = -1/x^2.
Step 3 — apply Itô's lemma. For Y = g(X), the lemma
gives dY = g'(X)\,dX + \tfrac12 g''(X)\,(dX)^2. Substitute the
derivatives:
dY = \frac{1}{X}\,dX - \frac{1}{2}\,\frac{1}{X^2}\,(dX)^2.
Step 4 — substitute dX = \mu X\,dt + \sigma X\,dW
and (dX)^2 = \sigma^2 X^2\,dt (by the box rule
(dW)^2 = dt, with the dt^2 and dt\,dW terms dropped as higher order):
dY = \frac{1}{X}\big(\mu X\,dt + \sigma X\,dW\big) - \frac{1}{2}\,\frac{1}{X^2}\,\sigma^2 X^2\,dt.
Step 5 — cancel the X's. Every factor of
X divides out, leaving constant coefficients:
dY = \Big(\mu - \tfrac12\sigma^2\Big)\,dt + \sigma\,dW.
The transform worked: Y = \ln X obeys an SDE of the additive kind
from Example 1, with constant drift \mu - \tfrac12\sigma^2 and
constant diffusion \sigma. Integrating it and exponentiating back to
X gives the geometric Brownian motion solution; we set up the machine here and
finish the job there.
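For reference, applying the Example 1 formula to Y gives Y_t = Y_0 + (\mu - \tfrac12\sigma^2)\,t + \sigma W_t, and exponentiating gives X_t = X_0 \exp\big((\mu - \tfrac12\sigma^2)\,t + \sigma W_t\big). The sketch below samples that expression and checks two consequences: the samples stay positive, and their mean is close to X_0 e^{\mu t}. The function name `sample_gbm` and the parameter values are illustrative assumptions.

```python
import math
import random

def sample_gbm(x0, mu, sigma, t, n_samples, seed=0):
    """Sample X_t = exp(Y_t), with Y_t = ln(x0) + (mu - sigma^2/2) t + sigma W_t."""
    rng = random.Random(seed)
    y0 = math.log(x0)                     # requires x0 > 0
    drift = (mu - 0.5 * sigma**2) * t     # constant drift of Y, with the Ito correction
    sd = sigma * math.sqrt(t)             # sigma * W_t has standard deviation sigma*sqrt(t)
    return [math.exp(y0 + drift + rng.gauss(0.0, sd)) for _ in range(n_samples)]

# Illustrative parameters (assumptions, not taken from the text).
x0, mu, sigma, t = 1.0, 0.05, 0.2, 1.0
samples = sample_gbm(x0, mu, sigma, t, n_samples=50_000)
mean = sum(samples) / len(samples)

print(min(samples) > 0.0)             # the exponential keeps every sample positive
print(mean, x0 * math.exp(mu * t))    # sample mean vs. x0 * exp(mu * t)
```

The -\tfrac12\sigma^2 correction in the drift of Y is what makes the mean of X grow at rate \mu rather than \mu + \tfrac12\sigma^2.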
When no transform tames the SDE, simulate it. The Euler–Maruyama scheme is the
stochastic cousin of Euler's method: chop [0, t] into steps of size
\Delta t and step forward by replacing the differentials with their
discrete increments,
X_{k+1} = X_k + a(t_k, X_k)\,\Delta t + b(t_k, X_k)\,\sqrt{\Delta t}\,Z_k,
where each Z_k \sim N(0,1) is an independent standard normal draw.
The \sqrt{\Delta t} — not \Delta t — is the
signature of Brownian noise: the increment \Delta W = W_{t_{k+1}} - W_{t_k}
is N(0, \Delta t), so it has standard deviation
\sqrt{\Delta t}. Get that exponent wrong and the simulated process has
the wrong roughness. The interactive below is exactly this recursion in action.
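The recursion translates almost line for line into code. What follows is a minimal sketch using only the Python standard library; the function name `euler_maruyama`, its signature, and the example parameters are assumptions made for illustration, not a reference implementation.

```python
import math
import random

def euler_maruyama(a, b, x0, t_end, n_steps, rng):
    """Simulate one path of dX = a(t, X) dt + b(t, X) dW on [0, t_end].

    a, b : callables (t, x) -> float, the drift and diffusion coefficients.
    Returns [X_0, X_1, ..., X_n] on the uniform grid t_k = k * dt.
    """
    dt = t_end / n_steps
    sqrt_dt = math.sqrt(dt)
    x = x0
    xs = [x0]
    for k in range(n_steps):
        t_k = k * dt
        z = rng.gauss(0.0, 1.0)                           # Z_k ~ N(0, 1)
        x = x + a(t_k, x) * dt + b(t_k, x) * sqrt_dt * z  # one Euler-Maruyama step
        xs.append(x)
    return xs

# Illustrative use on the multiplicative-noise SDE dX = mu X dt + sigma X dW.
mu, sigma = 0.05, 0.2
rng = random.Random(42)
path = euler_maruyama(lambda t, x: mu * x, lambda t, x: sigma * x,
                      x0=1.0, t_end=1.0, n_steps=250, rng=rng)
print(len(path), path[-1])
```

Passing the random generator in explicitly keeps runs reproducible and makes it easy to drive two schemes with the same noise when comparing them.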