The Central Limit Theorem
Here is the result that makes much of statistics work. Take a population of almost any shape
(skewed, lumpy, spiky), provided its variance is finite. Draw a random sample of n independent
observations and record its mean \bar{x}; repeat this many times and the means form a
distribution of their own. The central limit theorem says that as n grows, this distribution of
\bar{x} becomes approximately normal, a clean, symmetric bell, no matter how un-bell-like
the population was.
What it promises
Combine the central limit theorem with what we already know about the sampling distribution of
the mean. For large n,
\bar{x} \;\approx\; N\!\left(\mu,\ \frac{\sigma^2}{n}\right):
it is centred at \mu, has standard error
\sigma/\sqrt{n}, and — the new part — is approximately
normal in shape. The promise is about the shape of the distribution of
\bar{x}, not about the individual values, which keep the population's
original messy shape.
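The centre and standard error claimed above can be checked by simulation. The following is a minimal sketch, not a definitive implementation: it assumes an Exponential(1) population (chosen here only because it is skewed and has \mu = 1 and \sigma = 1), and the helper name `sample_means`, the seed, and the repetition count are illustrative choices.

```python
import random
import statistics

MU, SIGMA = 1.0, 1.0  # mean and standard deviation of an Exponential(1) population


def sample_means(n, reps=5000, seed=0):
    """Return `reps` sample means, each computed from n Exponential(1) draws."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]


for n in (1, 5, 30, 100):
    means = sample_means(n)
    centre = statistics.fmean(means)   # should sit near MU
    spread = statistics.pstdev(means)  # should sit near SIGMA / sqrt(n)
    print(f"n={n:>3}  centre={centre:.3f}  spread={spread:.3f}  "
          f"sigma/sqrt(n)={SIGMA / n ** 0.5:.3f}")
```

The centre column should stay close to \mu at every n, while the spread column should track \sigma/\sqrt{n} as n increases.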
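How large is "large"? A common rule of thumb is n \gtrsim 30, but it is only a guide. The more
skewed or heavy-tailed the population, the larger the n you need; for a population
already close to normal, even a small n will do, and for an exactly normal population
\bar{x} is exactly normal at every n.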
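One way to see how skew drives the required n is to measure the skewness of \bar{x} directly. The sketch below is illustrative and rests on assumptions: the two populations (Exponential(1), which has skewness 2, and Uniform(0, 1), which has skewness 0) and the helper names are chosen for this example. For independent draws the skewness of \bar{x} equals the population skewness divided by \sqrt{n}, which is printed as a reference for the exponential case.

```python
import random
import statistics


def skewness(xs):
    """Sample skewness: the mean cubed z-score (0 for a symmetric distribution)."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean(((x - m) / s) ** 3 for x in xs)


def draw_exponential(rng):
    return rng.expovariate(1.0)  # strongly right-skewed population


def draw_uniform(rng):
    return rng.random()  # symmetric population


def mean_skewness(draw, n, reps=5000, seed=1):
    """Skewness of the simulated distribution of the sample mean for size n."""
    rng = random.Random(seed)
    means = [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(reps)]
    return skewness(means)


for n in (2, 10, 30, 100):
    exp_skew = mean_skewness(draw_exponential, n)
    uni_skew = mean_skewness(draw_uniform, n)
    print(f"n={n:>3}  exponential: {exp_skew:+.2f} (theory {2 / n ** 0.5:.2f})"
          f"  uniform: {uni_skew:+.2f}")
```

In runs of this sketch the uniform population should show skewness near zero even at small n, while the exponential one should still carry visible skew at n = 30.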
Why the normal is everywhere
This is a large part of the reason the bell curve shows up across nature, measurement, and finance.
Anything that is itself an average or a sum of many small independent contributions, none of which
dominates (measurement errors, heights, total returns), tends toward a near-normal distribution by
the central limit theorem, largely regardless of the messy mechanisms underneath. The approximation
is weaker when the contributions are strongly dependent or heavy-tailed.
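The "sum of many small contributions" idea can be sketched with a toy model. Everything here is an assumption made for illustration: a measurement error is modelled as the sum of K = 48 independent Uniform(-0.5, 0.5) terms, so the total has mean 0 and standard deviation \sqrt{48/12} = 2. If the total is close to normal, about 68.3% of values should fall within one standard deviation and about 95.4% within two.

```python
import random

K = 48                      # hypothetical number of small contributions per measurement
SD_TOTAL = (K / 12) ** 0.5  # each Uniform(-0.5, 0.5) term has variance 1/12


def total_errors(reps=10000, seed=2):
    """Simulate `reps` total errors, each the sum of K small uniform terms."""
    rng = random.Random(seed)
    return [sum(rng.uniform(-0.5, 0.5) for _ in range(K)) for _ in range(reps)]


def fraction_within(values, k_sd):
    """Fraction of values lying within k_sd standard deviations of zero."""
    limit = k_sd * SD_TOTAL
    return sum(abs(v) <= limit for v in values) / len(values)


errors = total_errors()
print(f"within 1 sd: {fraction_within(errors, 1):.3f}  (normal: 0.683)")
print(f"within 2 sd: {fraction_within(errors, 2):.3f}  (normal: 0.954)")
```

No single term is bell-shaped, yet the total should match the normal coverage figures closely.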
Watch the bell emerge
The faint curve is a deliberately skewed population: lopsided, with a long
right tail. The bold curve is the distribution of the sample mean
\bar{x}. At n=1 it coincides with the population, skew and all; raise
n and it pulls into a symmetric, narrow bell centred
at \mu, with standard deviation \sigma/\sqrt{n}, exactly
as the theorem promises.
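A rough text version of this picture can be produced with a few lines of code. This is a sketch under stated assumptions, not the figure's actual source: the skewed population is taken to be Exponential(1), so \mu = 1, and the bin range, bar width, and helper names are illustrative.

```python
import random
import statistics


def sample_means(n, reps=4000, seed=3):
    """Return `reps` sample means, each from n Exponential(1) draws."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]


def bin_counts(values, lo=0.0, hi=3.0, bins=12):
    """Count values per equal-width bin on [lo, hi); values outside are ignored."""
    counts = [0] * bins
    step = (hi - lo) / bins
    for v in values:
        if lo <= v < hi:
            counts[min(int((v - lo) / step), bins - 1)] += 1
    return counts


def show(values, lo=0.0, hi=3.0, bins=12, width=40):
    """Print one row of '#' per bin, scaled so the tallest bin spans `width`."""
    counts = bin_counts(values, lo, hi, bins)
    step = (hi - lo) / bins
    peak = max(counts)
    for i, c in enumerate(counts):
        print(f"{lo + i * step:4.2f} | " + "#" * round(width * c / peak))


for n in (1, 30):
    print(f"n = {n}")
    show(sample_means(n))
```

At n = 1 the bars should pile up against zero and trail off to the right; at n = 30 they should gather into a narrow hump around 1.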
- For large n, the distribution of \bar{x} is approximately normal.
- This holds whatever the population's shape — skewed, bimodal, anything (with finite variance).
- The bell is centred at \mu with standard error \sigma/\sqrt{n}.
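- Rule of thumb: n \gtrsim 30 is usually enough, though strongly skewed populations need more.
- Averages and sums of many small independent contributions are near-normal, which is why the normal appears so widely.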