Stochastic Processes
A stochastic process is a whole family of random variables
(X_t)_{t \in T}, all living on one
probability space (\Omega, \mathcal{F}, \mathbb{P})
and indexed by a parameter t we read as time. Where
a single random variable models one uncertain number, a process models an uncertain quantity
that evolves: a price, a queue length, a particle's position.
The index set T may be discrete —
T = \{0, 1, 2, \dots\}, so we write X_n —
or continuous, T = [0, \infty). The
state space S is where the values live (often
\mathbb{Z} or \mathbb{R}): each
X_t : \Omega \to S.
Two ways to look at it
A process has two faces, and switching between them is the whole art. Fix a time
t and you are left with a single
random variable X_t — a snapshot, with its own
distribution. Fix an outcome \omega \in \Omega
instead and the randomness is spent; what remains is one deterministic curve
t \longmapsto X_t(\omega),
a single sample path (or trajectory) — one realised history of the world.
Nature draws a single \omega once and for all; we then watch the
path it traces out. The figure below draws several such paths of a simple random walk; each
coloured trajectory is a different \omega.
A function of two arguments
The two viewpoints are really the two ways of holding still one object: a process is a
function of two arguments, time and chance,
X : T \times \Omega \to S, \qquad (t, \omega) \longmapsto X_t(\omega).
Feed it both a time t and an outcome
\omega and you get a single value in the state space. Now freeze one
slot at a time:
- Fix \omega: the map t \mapsto X_t(\omega) is a deterministic curve: one sample path, a function on T alone.
- Fix t: the map \omega \mapsto X_t(\omega) is a measurable function on \Omega: a single random variable.
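The two slices can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the process is a simple random walk started at 0, and the integer `omega` (a hypothetical stand-in for an outcome in \Omega) is used as the seed of a pseudo-random generator, so that X(t, omega) is a genuine function of both arguments.

```python
import random

def X(t, omega):
    """Value X_t(omega) of a simple random walk started at 0.

    Assumption for illustration: the integer `omega` seeds the generator,
    so it plays the role of one outcome in Omega.
    """
    rng = random.Random(omega)
    # The first t draws are the same for every t, so the values
    # X(0, omega), X(1, omega), ... fit together into one path.
    return sum(rng.choice((-1, 1)) for _ in range(t))

# Fix omega, vary t: one deterministic sample path.
sample_path = [X(t, omega=7) for t in range(11)]

# Fix t, vary omega: many draws of the single random variable X_4.
snapshot = [X(4, omega) for omega in range(1000)]

print("path for omega=7:", sample_path)
print("mean of X_4 over 1000 outcomes:", sum(snapshot) / len(snapshot))
```

Calling X(t, 7) again returns the same path every time: once \omega is fixed, no randomness is left. The list `snapshot`, by contrast, is a sample from the distribution of X_4.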
Neither slice is the whole story. A sample path forgets how likely it was; a single
X_t forgets how it relates to X_s at other
times. The relations between the snapshots — how X_{t_1}
and X_{t_2} move together — are what make a process more than a
bundle of unrelated random variables.
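A short worked computation shows what "moving together" means. It assumes the simple random walk X_n = \xi_1 + \dots + \xi_n with X_0 = 0 and independent fair steps \xi_i = \pm 1 (so each step has mean 0 and variance 1); these symbols are introduced here only for the example.

```latex
% For s \le t, split X_t into the past X_s and the independent increment X_t - X_s:
\operatorname{Cov}(X_s, X_t)
  = \operatorname{Cov}\big(X_s,\; X_s + (X_t - X_s)\big)
  = \operatorname{Var}(X_s) + \operatorname{Cov}(X_s,\, X_t - X_s)
  = s + 0
  = \min(s, t).
```

The snapshots X_s and X_t are therefore correlated, and no list of their separate distributions records that fact.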
What a process really is: its finite-dimensional distributions
How much information pins a process down? You cannot write down the law of the whole
path directly: it is indexed by infinitely many (often uncountably many) times. The trick is to look at finitely many times at
once. Pick any finite list of times
t_1 < t_2 < \cdots < t_n in T and read off
the snapshots there. The result is a random vector
\big(X_{t_1}, X_{t_2}, \dots, X_{t_n}\big) \in S^n,
and that vector has a joint law on S^n — its
finite-dimensional distribution (an fdd):
\mu_{t_1, \dots, t_n}(B) \;=\; \mathbb{P}\!\left[\,(X_{t_1}, \dots, X_{t_n}) \in B\,\right], \qquad B \subseteq S^n \text{ measurable}.
The family of all these joint laws — one for every finite list of times — is what a
process really is, for almost every purpose. Two processes that match on every finite list of
times are, distributionally, the same object: every probability you can compute from finitely
many observations agrees. So when we say "a Brownian motion" or "a Poisson process" we are
naming a family of fdds, not one particular set of paths.
Two stochastic processes with the same state space have the same law precisely when
all their finite-dimensional distributions agree: for every finite list of times
t_1 < \cdots < t_n the joint laws
\mu_{t_1, \dots, t_n} coincide. (Informally: the fdds are the full
identikit of a process — nothing finite is left to check.)
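As a concrete instance, here is one fdd written out in full. The example assumes the simple random walk X_n = \xi_1 + \dots + \xi_n with X_0 = 0 and independent fair steps \xi_i = \pm 1, observed at the times 1 < 2.

```latex
% Each of the four step sequences (\xi_1, \xi_2) has probability 1/4:
\mu_{1,2}(\{(1, 2)\}) = \mu_{1,2}(\{(1, 0)\}) = \mu_{1,2}(\{(-1, 0)\}) = \mu_{1,2}(\{(-1, -2)\}) = \tfrac14 .
% The joint law is not the product of its marginals:
\mu_{1,2}(\{(1, -2)\}) = 0
\quad\text{whereas}\quad
\mathbb{P}[X_1 = 1]\,\mathbb{P}[X_2 = -2] = \tfrac12 \cdot \tfrac14 = \tfrac18 .
```

The second line is the point: the fdd records that the walk cannot fall from 1 to -2 in one step, which the two one-time distributions alone do not.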
The fdds cannot be chosen arbitrarily, though. They must be consistent: drop a
time from the list and the smaller joint law must be the
marginal
of the bigger one (observing fewer times can't change the law of the times you kept), and
permuting the times must permute the coordinates of the law in the same way. These are the
Kolmogorov consistency conditions.
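Both conditions can be checked by brute force on a small example. The sketch below assumes a simple random walk started at 0 with fair ±1 steps, and computes its fdds exactly by enumerating every step sequence; the helper names `fdd` and `marginalise` are hypothetical, chosen for this illustration.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def fdd(times):
    """Exact joint law of (X_{t_1}, ..., X_{t_n}) for a simple random walk.

    Enumerates all 2**max(times) equally likely step sequences and returns
    a dict mapping each tuple of values to its probability.
    """
    horizon = max(times)
    weight = Fraction(1, 2 ** horizon)
    law = defaultdict(Fraction)
    for steps in product((-1, 1), repeat=horizon):
        path = [0]
        for step in steps:
            path.append(path[-1] + step)
        law[tuple(path[t] for t in times)] += weight
    return dict(law)

def marginalise(law, drop):
    """Sum out the coordinate at position `drop` of a joint law."""
    out = defaultdict(Fraction)
    for values, p in law.items():
        out[values[:drop] + values[drop + 1:]] += p
    return dict(out)

# Marginalisation: dropping time 2 from (1, 2, 4) must give the law at (1, 4).
marginal_ok = marginalise(fdd((1, 2, 4)), 1) == fdd((1, 4))

# Permutation: listing the times as (4, 1) must swap the coordinates of the law at (1, 4).
swapped = {(b, a): p for (a, b), p in fdd((1, 4)).items()}
permutation_ok = fdd((4, 1)) == swapped

print(marginal_ok, permutation_ok)
```

Exact fractions are used instead of floats so that the comparison of laws is an equality, not an approximation. The enumeration grows as 2 to the power of the largest time, so this is a check for toy cases only.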
We have said an fdd family describes a process. The converse — that a suitable family
actually builds one — is a genuine theorem, and it is what lets us define processes by
listing their finite laws instead of constructing paths by hand.
Kolmogorov's extension theorem. Let the state space S be \mathbb{R}^d or, more
generally, a Polish space with its Borel σ-algebra. Given a family of finite-dimensional
distributions \{\mu_{t_1, \dots, t_n}\} that satisfies the two
consistency conditions above (marginalisation and permutation), there exists a probability
space (\Omega, \mathcal{F}, \mathbb{P}) and a process
(X_t)_{t \in T} on it whose finite-dimensional distributions are
exactly the given \mu_{t_1, \dots, t_n}.
This is the licence behind every "let (W_t) be a Brownian motion":
one specifies the Gaussian fdds, checks consistency, and the theorem hands back an honest
process realising them — no explicit \Omega required.
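For Brownian motion the consistency check is short enough to write out. The sketch assumes the standard normalisation, in which the fdds are centred Gaussian with covariance \min(s, t); the matrix symbol \Sigma is introduced here for the example.

```latex
% The fdd at times 0 \le t_1 < t_2 < t_3:
(W_{t_1}, W_{t_2}, W_{t_3}) \sim \mathcal{N}(0, \Sigma),
\qquad
\Sigma = \big(\min(t_i, t_j)\big)_{i,j}
       = \begin{pmatrix} t_1 & t_1 & t_1 \\ t_1 & t_2 & t_2 \\ t_1 & t_2 & t_3 \end{pmatrix}.
% Marginalisation: dropping t_2 deletes row 2 and column 2. The marginal of a
% Gaussian vector is Gaussian with that sub-matrix,
\begin{pmatrix} t_1 & t_1 \\ t_1 & t_3 \end{pmatrix}
  = \big(\min(t_i, t_j)\big)_{i,j \in \{1, 3\}},
% which is exactly the covariance prescribed for the times (t_1, t_3).
% Permutation: reordering the times reorders rows and columns of \Sigma together,
% and the entries are still \min(t_i, t_j).
```

The same argument works for any number of times, because every entry of \Sigma depends only on the pair of times it belongs to.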
One subtlety: versions and modifications. The fdds fix the law at finitely
many times, but they are silent about the fine structure of the paths. Two processes
(X_t) and (Y_t) with identical fdds are
called versions of one another; if they also live on the same probability space and
\mathbb{P}[X_t = Y_t] = 1 at each fixed t, they are
modifications of one another. Even so, one can have continuous paths and the
other not. The fdds determine the process only up to its finite-time laws; an extra argument (a path
regularity criterion, such as Kolmogorov's continuity theorem) is needed to pick the
continuous version we usually want.
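A standard example makes the gap concrete. The setup is an assumption made for the illustration: T = [0, 1], and U is a random variable uniformly distributed on [0, 1].

```latex
% Two processes on the same probability space:
X_t(\omega) = 0, \qquad Y_t(\omega) = \mathbf{1}\{U(\omega) = t\}, \qquad t \in [0, 1].
% At each fixed time they agree almost surely, so they have the same fdds:
\mathbb{P}[X_t = Y_t] = \mathbb{P}[U \ne t] = 1 \quad\text{for every } t.
% Yet they never agree as whole paths, since Y jumps to 1 at the time t = U(\omega):
\mathbb{P}\big[X_t = Y_t \text{ for all } t \in [0, 1]\big] = 0.
```

Every path of X is continuous and every path of Y is discontinuous, although no finite set of observation times can tell the two processes apart.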
Adapted to the flow of information
Time also carries information. A
filtration
(\mathcal{F}_t)_{t \in T} is an increasing family of σ-algebras,
\mathcal{F}_s \subseteq \mathcal{F}_t for
s \le t — the growing record of what is known by each time. A
process is adapted to (\mathcal{F}_t) when every
X_t is \mathcal{F}_t-measurable:
X_t \text{ is known by time } t \quad\text{for every } t.
No peeking into the future. A trading strategy may use today's price and all of history, but
not tomorrow's: adaptedness is the precise statement of that honesty. Classic examples of processes, each adapted to its
natural filtration \mathcal{F}_t = \sigma(X_s : s \le t): the
discrete random walk; the continuous
Bachelier–Wiener
Brownian motion; and the
Poisson process counting arrivals over time.
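In discrete time, "no peeking" can be tested mechanically: a quantity computed from a walk is adapted to the walk's own history only if its value at time n is unchanged when the steps after n are altered. The sketch below is an illustration under that reading; the function names are hypothetical, and agreement on one pair of paths is evidence of adaptedness, not a proof.

```python
def walk(steps):
    """Partial sums X_0 = 0, X_1, ..., X_N of a list of +1/-1 steps."""
    path = [0]
    for step in steps:
        path.append(path[-1] + step)
    return path

def running_max(path):
    """M_n = max(X_0, ..., X_n): uses only the past, so it is adapted."""
    out, best = [], path[0]
    for x in path:
        best = max(best, x)
        out.append(best)
    return out

def distance_to_final(path):
    """Z_n = X_N - X_n: needs the terminal value, so it is not adapted."""
    return [path[-1] - x for x in path]

def agrees_up_to(process, steps_a, steps_b, n):
    """True if `process` gives the same values at times 0..n on both walks."""
    return process(walk(steps_a))[: n + 1] == process(walk(steps_b))[: n + 1]

# Two step sequences with the same first three steps and different futures.
steps_a = [1, -1, 1, 1, -1, 1]
steps_b = [1, -1, 1, -1, -1, -1]

max_is_blind_to_future = agrees_up_to(running_max, steps_a, steps_b, 3)
final_is_blind_to_future = agrees_up_to(distance_to_final, steps_a, steps_b, 3)
print(max_is_blind_to_future, final_is_blind_to_future)
```

The running maximum passes the check because changing the future cannot change what has already been seen. The distance to the final value fails it, which is the trading-strategy analogue of using tomorrow's price today.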