Hypothesis Testing

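A hypothesis test is a courtroom for a claim. We put a default position on trial and ask whether the data give us enough reason to overturn it. The two competing claims are the null hypothesis H_0, which is the default position, and the alternative hypothesis H_1, which is the claim we adopt if the data overturn the default.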

For a population mean, a typical pair is H_0: \mu = \mu_0 versus H_1: \mu \ne \mu_0. Like a defendant, H_0 is presumed innocent. The burden of proof is on the data.

The logic: assume, then be surprised

The move at the heart of every test is this. Assume H_0 is true. Under that assumption, the sampling distribution of the sample mean is known, at least approximately: it is a bell centred on \mu_0. Now compute a test statistic from the data and ask: how surprising is a value this extreme, if H_0 really holds?

If the observed statistic lands far enough into the tail, we judge the data too surprising to square with H_0, and we reject it in favour of H_1.
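The assume-then-be-surprised logic can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive recipe, and it rests on several assumptions that are not part of the text above: the population standard deviation `sigma` is taken as known (so the statistic is a z-score), "far enough into the tail" is formalised as a two-sided tail probability falling below a cut-off `alpha = 0.05`, and the sample values are made up for the example.

```python
import math
from statistics import NormalDist

def z_test(sample, mu_0, sigma, alpha=0.05):
    """Two-sided z-test of H_0: mu = mu_0, assuming sigma is known."""
    n = len(sample)
    x_bar = sum(sample) / n
    # Spread of the sample mean under H_0 (the width of the bell).
    std_error = sigma / math.sqrt(n)
    # Test statistic: how many bell-widths the sample mean sits from mu_0.
    z = (x_bar - mu_0) / std_error
    # Surprise: probability, if H_0 holds, of a statistic at least this extreme.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha

# Hypothetical measurements, invented for illustration only.
sample = [52.1, 49.8, 53.4, 51.9, 50.7, 54.2, 52.8, 51.3]
z, p_value, reject = z_test(sample, mu_0=50.0, sigma=2.0)
print(f"z = {z:.2f}, p = {p_value:.4f}, reject H_0: {reject}")
```

With these invented numbers the statistic lands almost three standard deviations out, so the sketch rejects H_0. When `sigma` has to be estimated from the sample, a t-test is the usual replacement; that variant is not shown here.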

How surprising are the data?

Picture the null sampling distribution: the bell we would see if H_0 were true, drawn in z-units (standard deviations of the sample mean away from \mu_0). Place the observed statistic on it and see where it falls. Near the centre it is business as usual; far out in a tail it is a value H_0 rarely produces, the kind of surprise that argues against H_0.
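To put numbers on "rarely", the sketch below tabulates how often a standard normal statistic lands at least as far from the centre as a given z, counting both tails. The helper name `two_sided_tail` and the particular z values are illustrative choices, not part of the original text.

```python
from statistics import NormalDist

def two_sided_tail(z):
    """P(|Z| >= |z|) for a standard normal Z: the surprise at position z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Walk the observed statistic from the centre out into the tail.
for z in (0.0, 0.5, 1.0, 1.96, 2.5, 3.0):
    print(f"z = {z:4.2f}   P(|Z| >= z) = {two_sided_tail(z):.4f}")
```

At the centre the probability is 1, since every outcome is at least that extreme. By z = 1.96 it has dropped to about 0.05, and by z = 3 to under 0.003.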

We never prove H₀

A test has only two verdicts: reject H_0, or fail to reject H_0. The second is not a proof of H_0 — exactly as "not guilty" is not the same as "innocent". It means only that the data were not surprising enough to convict. Absence of evidence against H_0 is not evidence that H_0 is true.
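One way to see why a failure to reject proves little is to ask how often a test fails to reject when H_0 is in fact false. The sketch below computes that probability for a two-sided z-test, under assumptions introduced purely for illustration: a known `sigma`, a cut-off of `alpha = 0.05`, and a hypothetical true mean that differs only slightly from \mu_0.

```python
import math
from statistics import NormalDist

def prob_fail_to_reject(true_mu, mu_0, sigma, n, alpha=0.05):
    """Chance a two-sided z-test fails to reject H_0: mu = mu_0
    when the population mean is really true_mu (sigma assumed known)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # When the true mean is true_mu, the z statistic is Normal(shift, 1).
    shift = (true_mu - mu_0) / (sigma / math.sqrt(n))
    # We fail to reject whenever the statistic stays inside [-z_crit, z_crit].
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)

# Hypothetical scenario: H_0 says 50, the truth is 50.5, and n is small.
miss = prob_fail_to_reject(true_mu=50.5, mu_0=50.0, sigma=2.0, n=8)
print(f"P(fail to reject | H_0 false) = {miss:.2f}")
```

In this invented scenario H_0 is false, yet the test fails to reject it roughly nine times out of ten. A "fail to reject" verdict is therefore entirely compatible with a false H_0, especially when the sample is small or the true difference is modest.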