Distribution of a Random Variable
A random variable X : \Omega \to \mathbb{R} carries the
probability living on the abstract space \Omega over onto the
real line. The distribution (or law) of
X is the pushforward measure
\mathbb{P}_X on \mathbb{R}, defined for every Borel set B \subseteq \mathbb{R} by
\mathbb{P}_X(B) \;=\; \mathbb{P}\!\left(X \in B\right) \;=\; \mathbb{P}\!\left(\{\omega : X(\omega) \in B\}\right).
Once we have \mathbb{P}_X we can forget about
\Omega entirely: every probabilistic question that involves
X alone is answered on the real line. This is why the
random variable is the bridge: it transports the
measure to where we can compute with it.
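To make the pushforward concrete, here is a minimal sketch on a finite sample space. The example (three fair coin flips, with X counting heads) and the names `omega`, `prob`, `law` and `P_X` are illustrative assumptions, not part of the text above.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Assumed sample space: all sequences of three fair coin flips, equally likely.
omega = list(product("HT", repeat=3))
prob = {w: Fraction(1, len(omega)) for w in omega}


def X(w):
    """Random variable: the number of heads in the outcome w."""
    return w.count("H")


# Pushforward on single points: P_X({k}) = P({w : X(w) = k}).
law = defaultdict(Fraction)
for w, p in prob.items():
    law[X(w)] += p


def P_X(B):
    """Probability that X lands in the set B, computed from the law alone."""
    return sum((law[k] for k in B if k in law), Fraction(0))


# The same question asked on Omega and on the real line gives the same answer.
direct = sum((p for w, p in prob.items() if X(w) in {2, 3}), Fraction(0))
print(P_X({2, 3}), direct)
```

After `law` is built, `P_X` never touches `omega` again, which is the sense in which the sample space can be forgotten.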
The cumulative distribution function
It is awkward to specify a measure on every Borel set B, so we
encode the whole law in one function. The cumulative distribution function
(CDF) accumulates probability up to a point:
F(x) \;=\; \mathbb{P}(X \le x) \;=\; \mathbb{P}_X\big((-\infty,\,x]\big).
A function F : \mathbb{R} \to [0, 1] is the CDF of some random variable exactly when it is
- non-decreasing: more room can only let in more probability;
- right-continuous: F(x) = \lim_{t \downarrow x} F(t), a consequence of (-\infty, x] = \bigcap_n (-\infty, x + \tfrac1n] together with continuity of the measure along decreasing sets;
- normalised at the ends: \lim_{x \to -\infty} F(x) = 0 and \lim_{x \to +\infty} F(x) = 1.
Probabilities of half-open intervals fall straight out by subtraction:
\mathbb{P}(a < X \le b) \;=\; F(b) - F(a).
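The subtraction rule is one step of additivity. For a < b, split the half-line at a into two disjoint pieces and apply \mathbb{P}_X to each:

```latex
(-\infty, b] \;=\; (-\infty, a] \,\cup\, (a, b],
\qquad (-\infty, a] \cap (a, b] = \emptyset,
\\[4pt]
F(b) \;=\; \mathbb{P}_X\big((-\infty, a]\big) + \mathbb{P}_X\big((a, b]\big)
     \;=\; F(a) + \mathbb{P}(a < X \le b).
```

Rearranging the last line gives \mathbb{P}(a < X \le b) = F(b) - F(a).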
Two flavours: discrete and continuous
When X lands on a countable set of values the law is described
by a probability mass function (PMF)
p(x) = \mathbb{P}(X = x), and the CDF is a staircase that jumps by
p(x) at each value:
F(x) = \sum_{x_k \le x} p(x_k), \qquad \sum_k p(x_k) = 1.
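The staircase formula can be sketched directly for a PMF with finitely many values. The helper `make_cdf` and the three-point PMF below are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from fractions import Fraction
from itertools import accumulate


def make_cdf(pmf):
    """Build F(x) = sum of p(x_k) over x_k <= x from a finite PMF given as a dict."""
    xs = sorted(pmf)                             # support points in increasing order
    cum = list(accumulate(pmf[x] for x in xs))   # running totals of the masses

    def F(x):
        i = bisect_right(xs, x)                  # number of support points <= x
        return cum[i - 1] if i else 0

    return F


# Hypothetical PMF on the values 0, 1 and 4.
pmf = {0: Fraction(1, 2), 1: Fraction(1, 3), 4: Fraction(1, 6)}
F = make_cdf(pmf)

for x in (-1, 0, 0.5, 1, 4):
    print(x, F(x))
```

Using `bisect_right` rather than `bisect_left` is what makes the staircase right-continuous: at a support point the jump is already included.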
When F is instead absolutely continuous (for example, continuously differentiable), the law has a
probability density function (PDF) f \ge 0 with
F(x) = \int_{-\infty}^{x} f(t)\,dt, \qquad \int_{-\infty}^{\infty} f(t)\,dt = 1.
Here a single point carries no probability (\mathbb{P}(X = x) = 0),
so < and \le give the same probabilities: \mathbb{P}(X < x) = \mathbb{P}(X \le x).
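As a rough numerical check of the relation between density and CDF, the sketch below integrates a density with the midpoint rule and compares the result with a known closed form. The Exponential(1) density is an assumed example that does not appear in the text, and `cdf_numeric` is a hypothetical helper.

```python
import math


def f(t):
    """Density of an Exponential(1) variable (assumed example)."""
    return math.exp(-t) if t >= 0 else 0.0


def cdf_numeric(x, n=10_000):
    """Approximate F(x) = integral of f over (-inf, x] with the midpoint rule.

    The density vanishes below 0, so the integration starts there.
    """
    if x <= 0:
        return 0.0
    h = x / n
    return h * sum(f((i + 0.5) * h) for i in range(n))


x = 2.0
exact = 1.0 - math.exp(-x)   # closed-form CDF of Exponential(1)
print(cdf_numeric(x), exact)
```

The two printed values should agree to several decimal places; the small gap is the discretisation error of the midpoint rule.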
A worked staircase: the sum of two dice
Roll two fair dice and let X be their sum. The PMF is the
familiar triangle (the sum 7 is most likely, with
p(7) = \tfrac{6}{36}), so the CDF is a staircase:
flat between integers, jumping by p(k) at each
k, rising from 0 up to
1. Read off any interval probability by subtraction, e.g.
\mathbb{P}(4 < X \le 7) = F(7) - F(4) = \tfrac{21}{36} - \tfrac{6}{36} = \tfrac{15}{36} = \tfrac{5}{12}.
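The dice example is small enough to enumerate exactly. The following is a minimal sketch using exact fractions; the names `pmf` and `F` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Enumerate the 36 equally likely outcomes and accumulate the PMF of the sum.
pmf = {}
for a, b in product(range(1, 7), repeat=2):
    pmf[a + b] = pmf.get(a + b, Fraction(0)) + Fraction(1, 36)


def F(x):
    """Staircase CDF of the sum: total mass on values k <= x."""
    return sum((p for k, p in pmf.items() if k <= x), Fraction(0))


print(pmf[7])        # 1/6
print(F(7) - F(4))   # 5/12
```

The difference F(7) - F(4) collects exactly the masses p(5) + p(6) + p(7) = (4 + 5 + 6)/36.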