The Radon–Nikodym Derivative

Two people can disagree about probabilities without disagreeing about what is possible. A pessimist and an optimist may price the same coin differently, yet both agree the coin can land heads and can land tails. When two measures \mathbb{P} and \mathbb{Q} agree on exactly which events are impossible — the same null sets — we call them equivalent and write \mathbb{P} \sim \mathbb{Q}.

Equivalence is the precise licence to translate one measure into the other. The dictionary is a single non-negative random variable, the density (or Radon–Nikodym derivative)

Z = \frac{d\mathbb{Q}}{d\mathbb{P}} \ge 0,

which reweights \mathbb{P} into \mathbb{Q} outcome by outcome. Where Z > 1, \mathbb{Q} counts an outcome as more likely than \mathbb{P} does; where 0 < Z < 1, less likely. Formally, \mathbb{Q} measures a set A by integrating Z over it against \mathbb{P}:

\mathbb{Q}(A) = \int_A Z \, d\mathbb{P} \qquad \text{for every event } A.
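On a finite sample space the integral is just a weighted sum, which makes the defining relation easy to check by hand. The sketch below uses the coin from the opening paragraph; the specific probabilities, and the assignment of \mathbb{P} to the pessimist and \mathbb{Q} to the optimist, are illustrative assumptions rather than anything fixed by the text.

```python
from fractions import Fraction

# Assumed numbers for illustration: P is the pessimist's measure,
# Q the optimist's, on the two outcomes of one coin toss.
P = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}
Q = {"heads": Fraction(3, 5), "tails": Fraction(2, 5)}

# On a finite space the density is the outcome-by-outcome ratio Q/P.
Z = {w: Q[w] / P[w] for w in P}


def q_via_density(event):
    """Q(A) computed as the sum of Z over A, weighted by P."""
    return sum(Z[w] * P[w] for w in event)


# Compare with Q measured directly, for every event of this tiny space.
for event in [set(), {"heads"}, {"tails"}, {"heads", "tails"}]:
    direct = sum(Q[w] for w in event)
    assert q_via_density(event) == direct
```

Here Z is 6/5 on heads and 4/5 on tails: above 1 where \mathbb{Q} puts more weight than \mathbb{P}, below 1 where it puts less.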

From sets to expectations, derived line by line

The defining relation \mathbb{Q}(A) = \int_A Z\,d\mathbb{P} only talks about probabilities of sets. We will upgrade it into the full change-of-measure formula for expectations,

\mathbb{E}_{\mathbb{Q}}[X] = \mathbb{E}_{\mathbb{P}}[XZ],

by the standard ladder of integration theory: start with indicators, climb to simple functions, then to limits.

Step 1 — read the definition as an expectation. The integral of Z over A is exactly the \mathbb{P}-expectation of Z masked by the indicator \mathbf{1}_A (which is 1 on A and 0 off it):

\mathbb{Q}(A) = \int_A Z\,d\mathbb{P} = \mathbb{E}_{\mathbb{P}}[\mathbf{1}_A\, Z].

Step 2 — the indicator case of the formula. But the \mathbb{Q}-probability of A is itself the \mathbb{Q}-expectation of its indicator, \mathbb{Q}(A) = \mathbb{E}_{\mathbb{Q}}[\mathbf{1}_A]. Comparing with Step 1, the formula \mathbb{E}_{\mathbb{Q}}[X] = \mathbb{E}_{\mathbb{P}}[XZ] already holds for every X = \mathbf{1}_A:

\mathbb{E}_{\mathbb{Q}}[\mathbf{1}_A] = \mathbb{E}_{\mathbb{P}}[\mathbf{1}_A\, Z].

Step 3 — extend to simple functions by linearity. A simple random variable is a finite sum X = \sum_{k} c_k\,\mathbf{1}_{A_k}. Both expectations are linear, so applying Step 2 term by term,

\mathbb{E}_{\mathbb{Q}}[X] = \sum_k c_k\,\mathbb{E}_{\mathbb{Q}}[\mathbf{1}_{A_k}] = \sum_k c_k\,\mathbb{E}_{\mathbb{P}}[\mathbf{1}_{A_k} Z] = \mathbb{E}_{\mathbb{P}}\Big[\Big(\textstyle\sum_k c_k \mathbf{1}_{A_k}\Big) Z\Big] = \mathbb{E}_{\mathbb{P}}[XZ].

Step 4 — pass to the limit. Any non-negative measurable X is an increasing limit of simple functions X_n \uparrow X. Step 3 holds for each X_n, and monotone convergence lets us take the limit inside both expectations (note X_n Z \uparrow XZ since Z \ge 0):

\mathbb{E}_{\mathbb{Q}}[X] = \lim_{n\to\infty} \mathbb{E}_{\mathbb{Q}}[X_n] = \lim_{n\to\infty} \mathbb{E}_{\mathbb{P}}[X_n Z] = \mathbb{E}_{\mathbb{P}}[XZ].

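Splitting a general X = X^+ - X^- into positive and negative parts and applying Step 4 to each part extends the formula to every X that is integrable under \mathbb{Q} (equivalently, every X for which XZ is integrable under \mathbb{P}). The formula is proved.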
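The change-of-measure formula can be verified exactly on a small discrete space, where both expectations are finite sums. The following is a minimal sketch with made-up ingredients: a fair die for \mathbb{P}, a loaded die for \mathbb{Q}, and X equal to the square of the face shown. The helper name `expect` is hypothetical.

```python
from fractions import Fraction

# Assumed example: P is a fair die, Q loads face k in proportion to k.
faces = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in faces}
Q = {w: Fraction(w, 21) for w in faces}  # 1 + 2 + ... + 6 = 21

# Density dQ/dP, outcome by outcome.
Z = {w: Q[w] / P[w] for w in faces}


def expect(measure, f):
    """Expectation of f under a measure given as {outcome: probability}."""
    return sum(f(w) * measure[w] for w in measure)


def X(w):
    """The random variable being averaged: the face value squared."""
    return w * w


lhs = expect(Q, X)                       # E_Q[X], computed under Q directly
rhs = expect(P, lambda w: X(w) * Z[w])   # E_P[X Z], computed under P only

assert lhs == rhs
```

The right-hand side never touches \mathbb{Q} itself: everything is averaged under \mathbb{P}, with Z supplying the reweighting.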

The density integrates to one

Step 5 — take A = \Omega. The whole-space case of the defining relation is the normalisation that pins Z down. Since \mathbf{1}_\Omega = 1,

\mathbb{E}_{\mathbb{P}}[Z] = \int_\Omega Z\,d\mathbb{P} = \mathbb{Q}(\Omega) = 1,

because \mathbb{Q}, being a probability measure, assigns total mass 1. So a valid density is any non-negative Z with \mathbb{P}-mean exactly 1: it is a reweighting that neither creates nor destroys total probability. For the new measure to be equivalent (not merely absolutely continuous) we need a touch more — Z > 0 almost surely — so that the dictionary runs both ways and 1/Z = d\mathbb{P}/d\mathbb{Q}.
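To see the normalisation at work, one can start from arbitrary positive weights and divide by their \mathbb{P}-mean to obtain a valid density. The sketch below does this on a three-point space; the outcomes, the probabilities, and the raw weights W are all hypothetical choices made for illustration.

```python
from fractions import Fraction

# Assumed three-point space with hand-picked probabilities.
P = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}

# Raw positive weights: not yet a density, because their P-mean is not 1.
W = {"a": Fraction(1), "b": Fraction(2), "c": Fraction(3)}

mean_W = sum(W[w] * P[w] for w in P)      # E_P[W]
Z = {w: W[w] / mean_W for w in P}         # rescaled so that E_P[Z] = 1
Q = {w: Z[w] * P[w] for w in P}           # the reweighted measure

mean_Z = sum(Z[w] * P[w] for w in P)      # should be exactly 1
total_Q = sum(Q.values())                 # total mass of Q, also 1

# Because Z > 0 on every outcome, 1/Z acts as dP/dQ and undoes the change.
P_back = {w: Q[w] / Z[w] for w in P}

assert mean_Z == 1 and total_Q == 1 and P_back == P
```

If any weight were zero, the division by Z in the last step would fail on that outcome, which is the finite-space picture of why equivalence needs Z > 0.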

Let \mathbb{P} and \mathbb{Q} be probability measures on (\Omega, \mathcal{F}). Then:

(i) \mathbb{Q} is absolutely continuous with respect to \mathbb{P} (written \mathbb{Q} \ll \mathbb{P}, meaning \mathbb{P}(A) = 0 \Rightarrow \mathbb{Q}(A) = 0) iff there is a non-negative density Z = d\mathbb{Q}/d\mathbb{P} with \mathbb{Q}(A) = \int_A Z\,d\mathbb{P} for every event A. The two measures are equivalent (\mathbb{P} \sim \mathbb{Q}) exactly when Z > 0 \mathbb{P}-almost surely.
(ii) The density is normalised: \mathbb{E}_{\mathbb{P}}[Z] = 1.
(iii) Change of measure: for every \mathbb{Q}-integrable X, \mathbb{E}_{\mathbb{Q}}[X] = \mathbb{E}_{\mathbb{P}}[XZ].

The two notions are easy to confuse but do different jobs. \mathbb{Q} \ll \mathbb{P} (absolute continuity) is one-directional: anything \mathbb{P} rules out, \mathbb{Q} rules out too, but \mathbb{Q} may have extra impossible events of its own (events of positive \mathbb{P}-probability on which Z = 0). Equivalence is the two-sided version, \mathbb{Q} \ll \mathbb{P} and \mathbb{P} \ll \mathbb{Q} together: the same null sets, no exceptions, so Z > 0 almost surely and you can divide by it.
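A small example shows the one-directional case concretely. In the sketch below the density vanishes on one outcome, so the resulting \mathbb{Q} is absolutely continuous with respect to \mathbb{P} but not the other way round. The outcome labels and numbers are assumptions, and the helpers `null_outcomes` and `abs_continuous` are hypothetical names that only make sense on a finite space, where null sets can be checked outcome by outcome.

```python
from fractions import Fraction

# Assumed three-outcome space with uniform P.
P = {"up": Fraction(1, 3), "flat": Fraction(1, 3), "down": Fraction(1, 3)}

# A valid density (non-negative, P-mean 1) that is zero on one outcome.
Z = {"up": Fraction(3, 2), "flat": Fraction(3, 2), "down": Fraction(0)}
Q = {w: Z[w] * P[w] for w in P}


def null_outcomes(measure):
    """Outcomes that a measure treats as impossible."""
    return {w for w, p in measure.items() if p == 0}


def abs_continuous(nu, mu):
    """nu << mu on a finite space: every mu-null outcome is also nu-null."""
    return null_outcomes(mu) <= null_outcomes(nu)


q_ll_p = abs_continuous(Q, P)   # True: P has no null outcomes at all
p_ll_q = abs_continuous(P, Q)   # False: Q rules out "down", P does not
equivalent = q_ll_p and p_ll_q
```

Here \mathbb{Q} has gained an impossible outcome exactly where Z = 0, so the two measures are not equivalent and 1/Z is undefined on that outcome.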

Statisticians know Z by another name: a likelihood ratio. If \mathbb{P} and \mathbb{Q} have densities p and q against some common reference measure, then wherever p(\omega) > 0, Z(\omega) = \frac{d\mathbb{Q}}{d\mathbb{P}}(\omega) = \frac{q(\omega)}{p(\omega)}, the ratio of how plausible \omega is under the two hypotheses. Every time you compute a likelihood ratio you are silently writing down a Radon–Nikodym derivative.
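The likelihood-ratio reading can be made concrete with repeated coin tosses, where the common reference is counting measure on the set of toss sequences. The sketch below is an illustration under assumed numbers: three tosses, heads probability 1/2 under \mathbb{P} and 3/5 under \mathbb{Q}. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

p_heads = Fraction(1, 2)   # assumed heads probability under hypothesis P
q_heads = Fraction(3, 5)   # assumed heads probability under hypothesis Q
n_tosses = 3


def likelihood(seq, heads_prob):
    """Probability of one exact toss sequence under i.i.d. tosses."""
    out = Fraction(1)
    for toss in seq:
        out *= heads_prob if toss == "H" else 1 - heads_prob
    return out


def Z(seq):
    """Likelihood ratio q(seq) / p(seq), i.e. dQ/dP at this sequence."""
    return likelihood(seq, q_heads) / likelihood(seq, p_heads)


outcomes = list(product("HT", repeat=n_tosses))

# Normalisation: the P-mean of the likelihood ratio is 1.
mean_Z = sum(Z(s) * likelihood(s, p_heads) for s in outcomes)

# Q-probability of "at least two heads", computed by reweighting P.
A = [s for s in outcomes if s.count("H") >= 2]
q_A = sum(Z(s) * likelihood(s, p_heads) for s in A)
```

The value `q_A` agrees with the probability of at least two heads computed directly under the 3/5 coin, which is the defining relation \mathbb{Q}(A) = \int_A Z\,d\mathbb{P} in likelihood-ratio form.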

This is the engine that the next pages run on. To build a risk-neutral world we will need a density Z that converts the real-world measure into one in which the discounted stock price is fair — and Girsanov's theorem will hand us exactly such a Z, built from the exponential martingale.

Reweighting a density, live

Take a base measure \mathbb{P} with a standard-normal density and tilt it by an exponential factor Z(x) = e^{\theta x - \theta^2/2}, which is exactly the shape Girsanov will use. This Z has \mathbb{P}-mean 1 (so total probability is preserved), and multiplying it onto the \mathbb{P} bell produces the \mathbb{Q} density: a normal shifted to mean \theta. As \theta varies, the mass shifts sideways while the area stays 1.
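Both claims about the exponential tilt can be checked by completing the square. Writing \varphi for the standard-normal density (a symbol introduced here for convenience), the product of the \mathbb{P} density and the tilt is

\varphi(x)\,Z(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,e^{\theta x - \theta^2/2} = \frac{1}{\sqrt{2\pi}}\,e^{-(x^2 - 2\theta x + \theta^2)/2} = \frac{1}{\sqrt{2\pi}}\,e^{-(x-\theta)^2/2},

which is the density of a normal with mean \theta and variance 1. Integrating over x then gives the normalisation:

\mathbb{E}_{\mathbb{P}}[Z] = \int_{-\infty}^{\infty} \varphi(x)\,Z(x)\,dx = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-(x-\theta)^2/2}\,dx = 1.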

In the accompanying plot, the faint fixed curve is the original \mathbb{P} density; the bold curve is the \mathbb{Q} density, obtained by multiplying the \mathbb{P} density by Z. At \theta = 0 the tilt is Z \equiv 1 and the two coincide: no reweighting at all.
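For readers without the plot, the same behaviour can be reproduced numerically. The sketch below integrates the tilted density with a plain trapezoid rule and reports its total mass and mean for a given \theta. The integration window of [-12, 12], the grid size, and the function names are assumptions chosen for illustration, adequate for moderate values of \theta only.

```python
import math


def phi(x):
    """Standard-normal density (the P density)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Z(x, theta):
    """Exponential tilt e^{theta x - theta^2 / 2}."""
    return math.exp(theta * x - 0.5 * theta * theta)


def integrate(f, lo=-12.0, hi=12.0, n=24000):
    """Composite trapezoid rule on [lo, hi] with n equal steps."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


def tilt_summary(theta):
    """Return (E_P[Z], E_P[X Z]): total mass and mean of the tilted density."""
    mass = integrate(lambda x: Z(x, theta) * phi(x))
    mean = integrate(lambda x: x * Z(x, theta) * phi(x))
    return mass, mean


for theta in (0.0, 0.5, 1.5):
    mass, mean = tilt_summary(theta)
    print(f"theta={theta}: mass={mass:.6f}, mean={mean:.6f}")
```

Up to numerical error, the mass should stay at 1 for each \theta while the mean tracks \theta, which is the shift of the bell described above.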