Variance & Moments
The expectation pins down the centre of a random variable, but says nothing about
how far it strays from that centre. The moments measure that spread, and more. The
n-th moment is simply the expectation of the n-th power:
\mathbb{E}[X^{n}] \;=\; \int_{\Omega} X^{n}\, d\mathbb{P}.
The first moment is the mean \mu = \mathbb{E}[X]; the second moment
\mathbb{E}[X^{2}] is the raw ingredient of the variance, which measures the spread.
Each higher moment adds a little more information about the shape of the
distribution.
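For a discrete random variable the integral above reduces to a probability-weighted sum, which makes the moments easy to compute by hand or in code. The following is a minimal sketch, not a definitive implementation: the fair six-sided die and the helper name `moment` are assumptions chosen for illustration and do not come from the text.

```python
from fractions import Fraction

def moment(pmf, n):
    """n-th moment E[X^n] of a discrete distribution given as {value: probability}."""
    return sum(p * Fraction(x) ** n for x, p in pmf.items())

# Assumed example distribution: a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

first = moment(die, 1)    # the mean, 7/2
second = moment(die, 2)   # the second moment, 91/6

print(first, second)
```

Exact fractions are used here so that the results can be compared with a hand calculation without rounding noise.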
Variance: the spread around the mean
The variance is the expected squared distance from the mean: take how far X
lands from its own centre, square it so that overshoots and undershoots both
count, and average:
\operatorname{Var}(X) \;=\; \mathbb{E}\big[(X - \mu)^{2}\big].
Expanding the square and using linearity of expectation gives the formula
you will reach for every time:
\operatorname{Var}(X) \;=\; \mathbb{E}[X^{2}] - \mu^{2}.
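For readers who want the intermediate steps, the expansion can be written out as follows. It uses only the definition of the variance, linearity of expectation, and the fact that \mu = \mathbb{E}[X] is a constant:

```latex
\begin{aligned}
\operatorname{Var}(X)
  &= \mathbb{E}\big[(X - \mu)^{2}\big] \\
  &= \mathbb{E}\big[X^{2} - 2\mu X + \mu^{2}\big] \\
  &= \mathbb{E}[X^{2}] - 2\mu\,\mathbb{E}[X] + \mu^{2} \\
  &= \mathbb{E}[X^{2}] - 2\mu^{2} + \mu^{2} \\
  &= \mathbb{E}[X^{2}] - \mu^{2}.
\end{aligned}
```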
Because variance is in squared units, we usually quote its square root, the
standard deviation \sigma = \sqrt{\operatorname{Var}(X)},
which lives in the same units as X itself.
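As a quick sanity check, the definition and the shortcut formula can be compared on a small example. This is a sketch under stated assumptions: the fair die and the helper `expect` are illustrative choices, not part of the text.

```python
from fractions import Fraction
import math

# Assumed example distribution: a fair six-sided die, as {value: probability}.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def expect(pmf, f):
    """E[f(X)] for a discrete distribution."""
    return sum(p * f(x) for x, p in pmf.items())

mu = expect(die, lambda x: x)                          # mean
var_def = expect(die, lambda x: (x - mu) ** 2)         # E[(X - mu)^2]
var_short = expect(die, lambda x: x ** 2) - mu ** 2    # E[X^2] - mu^2
sigma = math.sqrt(var_def)                             # standard deviation

print(var_def, var_short, sigma)
```

Both routes give 35/12 for the die, and the standard deviation comes out at roughly 1.71, in the same units as the face values.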
How it scales
Shifting a random variable leaves its spread untouched, and stretching it scales the spread
by the square of the stretch:
\operatorname{Var}(aX + b) \;=\; a^{2}\,\operatorname{Var}(X).
The +b slides the whole distribution but keeps every point the same
distance from the (now shifted) mean, so it drops out; the a scales
all those distances by |a|, and squaring inside the expectation
turns that into a^{2}.
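The scaling rule can be checked numerically on a small discrete distribution. This is a rough sketch: the die, the helper `variance`, and the particular values of a and b are assumptions made for illustration. A negative a is used on purpose, to show that the sign disappears after squaring.

```python
from fractions import Fraction

def variance(pmf):
    """Var(X) = E[(X - mu)^2] for a discrete distribution {value: probability}."""
    mu = sum(p * x for x, p in pmf.items())
    return sum(p * (x - mu) ** 2 for x, p in pmf.items())

# Assumed example distribution: a fair six-sided die.
die = {Fraction(face): Fraction(1, 6) for face in range(1, 7)}

# Hypothetical stretch and shift.
a, b = Fraction(-3), Fraction(10)

# Distribution of aX + b: same probabilities, transformed values.
transformed = {a * x + b: p for x, p in die.items()}

lhs = variance(transformed)     # Var(aX + b)
rhs = a ** 2 * variance(die)    # a^2 Var(X)

print(lhs, rhs)
```

Changing b leaves `lhs` unchanged, while doubling a multiplies it by four.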
Seeing the spread
Both bells below are centred on the same mean \mu = 0. The faint
curve is fixed at \sigma = 1; the bold one tracks the slider. Push
\sigma up and the bold bell flattens and widens:
its area stays 1, so spreading it out must lower its peak. Variance
is exactly this width made precise.
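The trade-off between width and peak height can also be seen in numbers. The sketch below assumes the bells are normal (Gaussian) densities, which the text does not state explicitly; the function names and the chosen values of \sigma are illustrative only.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def area(sigma, half_width=10.0, steps=20_000):
    """Midpoint Riemann sum of the density over [-half_width*sigma, half_width*sigma]."""
    lo = -half_width * sigma
    dx = 2 * half_width * sigma / steps
    return sum(normal_pdf(lo + (i + 0.5) * dx, 0.0, sigma) for i in range(steps)) * dx

sigmas = (0.5, 1.0, 2.0)
peaks = {s: normal_pdf(0.0, 0.0, s) for s in sigmas}   # height at the mean
areas = {s: area(s) for s in sigmas}                   # should all be close to 1

for s in sigmas:
    print(f"sigma={s}: peak={peaks[s]:.4f}, area={areas[s]:.6f}")
```

Under this assumption the peak height is 1 / (\sigma \sqrt{2\pi}), so doubling \sigma halves the peak while the area stays at 1.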