Significance and the t-Test

How small must a p-value be before we reject H_0? Rather than argue case by case, we fix a threshold in advance: the significance level \alpha. The rule is then mechanical,

\text{reject } H_0 \iff p < \alpha,

with \alpha = 0.05 the most common choice. Committing to \alpha before seeing the data stops us from moving the goalposts to fit the result. \alpha is also the long-run rate at which this rule wrongly rejects a true H_0 (a Type I error), so it is a risk we choose deliberately rather than one we stumble into.
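The reading of \alpha as a false-rejection rate can be checked by simulation. The sketch below is illustrative only: it assumes a two-sided z-test with a known population standard deviation (a simpler test than the one developed in this section), and the constants MU_0, SIGMA, N and TRIALS are made-up values. Every simulated data set is drawn with H_0 true, so each rejection is a wrong one.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

ALPHA = 0.05      # significance level, fixed before looking at any data
MU_0 = 0.0        # mean claimed by H_0 (hypothetical value)
SIGMA = 1.0       # population sd, assumed known for this z-test
N = 10            # observations per simulated experiment
TRIALS = 20_000   # number of simulated experiments


def z_test_p_value(sample, mu_0, sigma):
    """Two-sided p-value of a z-test with known sigma."""
    z = (fmean(sample) - mu_0) / (sigma / sqrt(len(sample)))
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


rng = random.Random(0)  # fixed seed so the run is repeatable
rejections = 0
for _ in range(TRIALS):
    # H_0 is true here: the data really do have mean MU_0.
    sample = [rng.gauss(MU_0, SIGMA) for _ in range(N)]
    if z_test_p_value(sample, MU_0, SIGMA) < ALPHA:
        rejections += 1

false_rejection_rate = rejections / TRIALS
print(f"false rejection rate: {false_rejection_rate:.3f} (alpha = {ALPHA})")
```

The printed rate should sit close to ALPHA; changing ALPHA to 0.01 should move it accordingly.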

The t-statistic

To test a mean we need a statistic. Standardising the sample mean gives the t-statistic

t = \frac{\bar{x} - \mu_0}{\mathrm{SE}}, \qquad \mathrm{SE} = \frac{s}{\sqrt{n}}.

It is just a z-score for the sample mean: how many standard errors the observed mean \bar{x} sits from the value \mu_0 claimed by H_0. The one twist is the s in the denominator: we rarely know the true \sigma, so we use the sample's own standard deviation s as a stand-in.
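As a small worked example of the formula, the sketch below computes t step by step. The sample values and the hypothesised mean mu_0 are invented for illustration; they were picked so that the arithmetic comes out cleanly.

```python
from math import sqrt
from statistics import mean, stdev

sample = [7, 8, 8, 10, 10, 11, 11, 12, 13]  # made-up measurements
mu_0 = 9.0                                   # mean claimed by H_0 (hypothetical)

n = len(sample)
x_bar = mean(sample)       # sample mean
s = stdev(sample)          # sample standard deviation (n - 1 in the denominator)
se = s / sqrt(n)           # standard error of the mean
t = (x_bar - mu_0) / se    # distance of x_bar from mu_0, in standard errors

print(f"x_bar = {x_bar}, s = {s}, SE = {se:.4f}, t = {t:.4f}")
```

For this sample \bar{x} = 10, s = 2 and SE = 2/3, so t = 1.5: the observed mean lies one and a half standard errors above \mu_0.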

Why the tails get heavier

That substitution has a price. Because s is itself an estimate, and a jumpy one when n is small, the statistic t scatters more widely than a true z-score would. So t does not follow the standard normal. When the data are themselves roughly normal it follows the t-distribution, which is bell-shaped and symmetric but has heavier tails: more probability stranded far from the centre.
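The extra scatter can be seen in a rough simulation. The sketch below assumes normally distributed data with a known true mean and standard deviation (MU, SIGMA, N and TRIALS are illustrative choices). For each small sample it computes both a z-score, using the true \sigma, and a t-statistic, using s, then counts how often each lands beyond \pm 1.96.

```python
import random
from math import sqrt
from statistics import fmean, stdev

MU, SIGMA = 0.0, 1.0   # true mean and sd of the simulated population
N = 5                  # deliberately small sample size
TRIALS = 20_000        # number of simulated samples
CUTOFF = 1.96          # a standard normal lands beyond +/-1.96 about 5% of the time

rng = random.Random(1)  # fixed seed so the run is repeatable
z_beyond = 0
t_beyond = 0
for _ in range(TRIALS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = fmean(sample)
    z = (x_bar - MU) / (SIGMA / sqrt(N))          # uses the true sigma
    t = (x_bar - MU) / (stdev(sample) / sqrt(N))  # uses the estimate s
    z_beyond += abs(z) > CUTOFF
    t_beyond += abs(t) > CUTOFF

z_tail = z_beyond / TRIALS
t_tail = t_beyond / TRIALS
print(f"share of |z| > {CUTOFF}: {z_tail:.3f}")
print(f"share of |t| > {CUTOFF}: {t_tail:.3f}")
```

The z share should come out near 5%, while the t share should be clearly larger. Raising N should pull the two shares together.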

The t-distribution is indexed by its degrees of freedom \nu = n - 1. Small \nu means a small, noisy sample and the fattest tails; as \nu grows the estimate s settles down and the t-distribution approaches the standard normal.
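To put numbers on that convergence, the sketch below evaluates the two-sided tail probability P(|T| > 2) for several values of \nu and compares it with the standard normal. It is a minimal standard-library illustration, not a production routine: the tail is obtained by integrating the t density numerically with Simpson's rule, and the helper names t_pdf and t_two_sided_tail are hypothetical.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist


def t_pdf(x, nu):
    """Density of the t-distribution with nu degrees of freedom."""
    log_norm = lgamma((nu + 1) / 2) - lgamma(nu / 2) - 0.5 * log(nu * pi)
    return exp(log_norm - (nu + 1) / 2 * log(1 + x * x / nu))


def t_two_sided_tail(x, nu, steps=2000):
    """P(|T| > x), integrating the density over [0, |x|] with Simpson's rule."""
    x = abs(x)
    if x == 0:
        return 1.0
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return max(0.0, 1.0 - 2.0 * (h / 3.0) * total)


CUTOFF = 2.0
normal_tail = 2.0 * (1.0 - NormalDist().cdf(CUTOFF))
tails = {nu: t_two_sided_tail(CUTOFF, nu) for nu in (1, 2, 5, 10, 30, 100)}

for nu, tail in tails.items():
    print(f"nu = {nu:>3}: P(|T| > {CUTOFF}) = {tail:.4f}")
print(f"normal  : P(|Z| > {CUTOFF}) = {normal_tail:.4f}")
```

The tail probabilities should shrink steadily as \nu increases and close in on the normal value from above.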

Putting it together

A t-test has three steps: compute t = (\bar{x}-\mu_0)/\mathrm{SE}, find its p-value (the probability, under H_0, of a t at least this extreme) from the t-distribution with n-1 degrees of freedom, and reject H_0 when p < \alpha. For large n the t- and z-tests nearly coincide; for small n the heavier t-tails demand a more extreme t before they will let you reject, a built-in penalty for small data.
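The whole procedure fits in a few lines. The sketch below is one possible standard-library rendering, under stated assumptions: the test is two-sided, the p-value comes from Simpson's-rule integration of the t density, and the function names, the sample and mu_0 are all hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt
from statistics import mean, stdev


def t_pdf(x, nu):
    """Density of the t-distribution with nu degrees of freedom."""
    log_norm = lgamma((nu + 1) / 2) - lgamma(nu / 2) - 0.5 * log(nu * pi)
    return exp(log_norm - (nu + 1) / 2 * log(1 + x * x / nu))


def two_sided_p(t, nu, steps=2000):
    """Two-sided p-value P(|T| > |t|) via Simpson's rule on [0, |t|]."""
    x = abs(t)
    if x == 0:
        return 1.0
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return max(0.0, 1.0 - 2.0 * (h / 3.0) * total)


def one_sample_t_test(sample, mu_0, alpha=0.05):
    """Return (t, p, reject) for H_0: population mean equals mu_0."""
    n = len(sample)
    se = stdev(sample) / sqrt(n)        # SE = s / sqrt(n)
    t = (mean(sample) - mu_0) / se      # the t-statistic
    p = two_sided_p(t, n - 1)           # n - 1 degrees of freedom
    return t, p, p < alpha


sample = [7, 8, 8, 10, 10, 11, 11, 12, 13]  # made-up measurements
t, p, reject = one_sample_t_test(sample, mu_0=9.0)
print(f"t = {t:.3f}, p = {p:.3f}, reject H_0: {reject}")
```

With these nine observations t = 1.5, and the p-value is well above 0.05, so H_0 is not rejected. In practice a statistics library would normally be used in place of the hand-rolled integration; SciPy's scipy.stats.ttest_1samp, for example, performs this test.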