Variance & Moments

The expectation pins down the centre of a random variable, but says nothing about how far it strays. The moments measure that and more. The n-th moment is simply the expectation of the n-th power:

\mathbb{E}[X^{n}] \;=\; \int_{\Omega} X^{n}\, d\mathbb{P}.

The first moment is the mean \mu = \mathbb{E}[X]; the second moment \mathbb{E}[X^{2}] is the ingredient from which the variance is built. Each higher moment adds further detail about the shape of the distribution.
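For a discrete random variable the integral above reduces to a probability-weighted sum. The sketch below illustrates this with an assumed example, a fair six-sided die; the helper name `moment` is made up for illustration and is not from any library.

```python
from fractions import Fraction

def moment(outcomes, probs, n):
    """n-th moment E[X^n] of a discrete random variable (hypothetical helper)."""
    return sum(p * x**n for x, p in zip(outcomes, probs))

# Assumed example: a fair six-sided die, each face with probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

first = moment(faces, probs, 1)    # the mean, mu
second = moment(faces, probs, 2)   # E[X^2]

print(first, second)  # 7/2 91/6
```

Exact fractions are used here so that the results can be compared with a hand calculation without rounding noise.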

Variance: the spread around the mean

The variance is the expected squared distance from the mean: the average of how far X lands from its own centre, with each distance squared so that overshoots and undershoots cannot cancel one another:

\operatorname{Var}(X) \;=\; \mathbb{E}\big[(X - \mu)^{2}\big].

Expanding the square and using linearity of expectation gives the formula you will reach for every time:

\operatorname{Var}(X) \;=\; \mathbb{E}[X^{2}] - \mu^{2}.
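For readers who want the intermediate steps, the expansion runs as follows. The only facts used are linearity of expectation and that \mu is a constant with \mathbb{E}[X] = \mu:

```latex
\begin{aligned}
\operatorname{Var}(X)
  &= \mathbb{E}\big[X^{2} - 2\mu X + \mu^{2}\big] \\
  &= \mathbb{E}[X^{2}] - 2\mu\,\mathbb{E}[X] + \mu^{2} \\
  &= \mathbb{E}[X^{2}] - 2\mu^{2} + \mu^{2} \\
  &= \mathbb{E}[X^{2}] - \mu^{2}.
\end{aligned}
```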

Because variance is in squared units, we usually quote its square root, the standard deviation \sigma = \sqrt{\operatorname{Var}(X)}, which lives in the same units as X itself.
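As a quick sanity check, the two variance formulas can be evaluated side by side, followed by the standard deviation. This is a minimal sketch that again assumes a fair six-sided die as the example distribution; the variable names are illustrative only.

```python
from fractions import Fraction
import math

# Assumed example: a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
p = Fraction(1, 6)

mu = sum(p * x for x in faces)

# Definition: expected squared distance from the mean.
var_def = sum(p * (x - mu) ** 2 for x in faces)

# Shortcut: second moment minus the squared mean.
var_short = sum(p * x**2 for x in faces) - mu**2

# Standard deviation, back in the same units as X.
sigma = math.sqrt(var_def)

print(var_def, var_short)  # 35/12 35/12
print(round(sigma, 4))     # 1.7078
```

Both routes give the same value, as the algebra says they must; the shortcut is usually less work because it never needs the individual deviations.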

How it scales

Shifting a random variable leaves its spread untouched, and stretching it scales the spread by the square of the stretch:

\operatorname{Var}(aX + b) \;=\; a^{2}\,\operatorname{Var}(X).

The +b slides the whole distribution but keeps every point the same distance from the (now shifted) mean, so it drops out; the a multiplies all those distances by |a|, and squaring inside the expectation turns that into a^{2}. The sign of a therefore makes no difference.
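The scaling rule can be checked exactly on a small example. The sketch below assumes a fair six-sided die and arbitrarily chosen constants a = -3 and b = 10; the negative a is deliberate, to show that only a^{2} matters. The helper `variance` is hypothetical, written just for this illustration.

```python
from fractions import Fraction

def variance(values, probs):
    """Variance of a discrete random variable (hypothetical helper)."""
    mu = sum(p * v for v, p in zip(values, probs))
    return sum(p * (v - mu) ** 2 for v, p in zip(values, probs))

# Assumed example: a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

a, b = -3, 10                             # arbitrary stretch and shift
transformed = [a * x + b for x in faces]  # outcomes of aX + b

var_x = variance(faces, probs)
var_y = variance(transformed, probs)

print(var_x, var_y)  # 35/12 105/4
```

Here 105/4 equals 9 times 35/12, matching a^{2}\,\operatorname{Var}(X); changing b to any other value leaves var_y unchanged.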

Seeing the spread

Both bells below are centred on the same mean \mu = 0. The faint curve is fixed at \sigma = 1; the bold one tracks the slider. Push \sigma up and the bold bell flattens and widens: its area stays 1, so spreading it out must lower its peak. The standard deviation \sigma is exactly this width made precise, and the variance is its square.
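The same trade-off between width and height can be tabulated without the picture. The sketch below assumes the bells are normal (Gaussian) densities, which the text does not state explicitly; the function names and the numerical integration settings are illustrative choices, not part of the original figure.

```python
import math

def normal_pdf(x, sigma, mu=0.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def area(sigma, half_width=10.0, steps=20000):
    """Midpoint-rule estimate of the area under the bell over +/- half_width sigmas."""
    lo = -half_width * sigma
    dx = 2 * half_width * sigma / steps
    return sum(normal_pdf(lo + (i + 0.5) * dx, sigma) for i in range(steps)) * dx

for sigma in (0.5, 1.0, 2.0):
    # Peak height halves each time sigma doubles; the area stays at 1.
    print(sigma, round(normal_pdf(0.0, sigma), 4), round(area(sigma), 4))
```

Under this assumption the peak sits at height 1/(\sigma\sqrt{2\pi}), so doubling \sigma halves the peak while the total area remains 1.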