Hypothesis Testing

A hypothesis test is a courtroom for a claim. We put a default position on trial and ask whether the data give us enough reason to overturn it. The two competing claims are:

the null hypothesis H_0 — "no effect", the status quo, the thing we assume true until shown otherwise;
the alternative hypothesis H_1 — the claim of a real effect, what we suspect instead.

For a population mean, a typical pair is H_0: \mu = \mu_0 versus H_1: \mu \ne \mu_0. Like a defendant, H_0 is presumed innocent. The burden of proof is on the data.

The logic: assume, then be surprised

The move at the heart of every test is this. Assume H_0 is true. Under that assumption, the sampling distribution of the sample mean is known, at least approximately for a reasonably large sample: it is a bell centred on \mu_0. Now compute a test statistic from the data and ask: how surprising is a value this extreme, if H_0 really holds?

A value near the centre is unremarkable — exactly what H_0 predicts.
A value far out in a tail would be a rare fluke under H_0.

If the observed statistic lands far enough into the tail, we judge the data too surprising to square with H_0, and we reject it in favour of H_1.
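The assume-then-be-surprised logic can be sketched in a few lines of Python. This is a minimal illustration, not a definitive procedure: it assumes a z-test with a known population standard deviation `sigma` and a 5% two-sided cutoff, neither of which is fixed by the text above, and the sample values and function names are hypothetical.

```python
import math
from statistics import NormalDist, mean

def z_statistic(sample, mu_0, sigma):
    """Distance of the sample mean from mu_0, measured in standard errors."""
    standard_error = sigma / math.sqrt(len(sample))
    return (mean(sample) - mu_0) / standard_error

def two_sided_decision(z, alpha=0.05):
    """Reject H_0 when |z| lies beyond the two-sided critical value."""
    # alpha = 0.05 is an assumed convention, giving a cutoff near 1.96.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return "reject H_0" if abs(z) > critical else "fail to reject H_0"

# Hypothetical data: eight measurements, H_0: mu = 100, assumed sigma = 4.
sample = [103.1, 104.9, 101.2, 106.0, 102.7, 105.3, 103.8, 104.2]
z = z_statistic(sample, mu_0=100.0, sigma=4.0)
print(round(z, 2), two_sided_decision(z))
```

Here the sample mean sits well over two standard errors above \mu_0, so under these assumptions the sketch reports a rejection. A statistic close to zero would return "fail to reject H_0" instead.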

How surprising is the data?

The reference picture is the null sampling distribution: the bell we would see if H_0 were true, drawn in z-units (standard deviations of the sampling distribution, that is, standard errors, away from \mu_0). Place the observed statistic on that axis and see where it falls. Near the centre it is business as usual; far into a tail it is a value H_0 rarely produces, the kind of surprise that argues against H_0.
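One way to put a number on "how surprising" is the probability, under H_0, of landing at least as far from the centre as the observed statistic. The sketch below assumes the statistic is already in z-units and that the null distribution is standard normal. The function name is hypothetical, and this two-sided tail probability is what is commonly called a p-value.

```python
from statistics import NormalDist

def two_sided_tail_probability(z):
    """P(|Z| >= |z|) for a standard normal Z, i.e. computed assuming H_0."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Move the observed statistic from the centre out into the tail.
for z in (0.3, 1.0, 2.0, 3.0):
    print(f"z = {z:>4}: tail probability = {two_sided_tail_probability(z):.4f}")
```

The probability shrinks quickly as the statistic moves outward: a value near the centre is something H_0 produces most of the time, while a value three standard errors out occurs in well under 1% of samples if H_0 holds.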

We never prove H_0

A test has only two verdicts: reject H_0, or fail to reject H_0. The second is not a proof of H_0 — exactly as "not guilty" is not the same as "innocent". It means only that the data were not surprising enough to convict. Absence of evidence against H_0 is not evidence that H_0 is true.
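A small simulation makes the point concrete. In the sketch below H_0 is false by construction (the true mean is 101 while H_0 claims 100), yet with only ten observations per sample the test usually fails to reject. Every number here is a hypothetical choice for illustration, as are the z-test with known `sigma` and the 5% cutoff.

```python
import math
import random
from statistics import NormalDist, mean

def fail_to_reject_rate(true_mu, mu_0, sigma, n, trials=2000, alpha=0.05, seed=0):
    """Fraction of simulated samples in which a two-sided z-test keeps H_0."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sigma / math.sqrt(n)
    kept = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        z = (mean(sample) - mu_0) / standard_error
        if abs(z) <= critical:
            kept += 1
    return kept / trials

# H_0 says mu = 100, but the data are generated with mu = 101.
rate = fail_to_reject_rate(true_mu=101.0, mu_0=100.0, sigma=4.0, n=10)
print(f"failed to reject a false H_0 in {rate:.0%} of samples")
```

In the large majority of these simulated samples the verdict is "fail to reject", even though H_0 is wrong every time. The verdict reflects how little the data could say, not that the null is true.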

H_0 (null) is the "no effect" default; H_1 (alternative) is the claim of an effect.
Assume H_0, then measure how surprising the data are under it.
Surprising data (a statistic far in the tail) ⇒ reject H_0.
"Fail to reject" ≠ "prove H_0" — we never establish the null, only fail to overturn it.