Hypothesis Testing
A hypothesis test is a courtroom for a claim. We put a default position on
trial and ask whether the data give us enough reason to overturn it. The two competing claims
are:
- the null hypothesis H_0: "no effect", the status quo, the thing we assume true until shown otherwise;
- the alternative hypothesis H_1: the claim of a real effect, what we suspect instead.
For a population mean, a typical pair is
H_0: \mu = \mu_0 \qquad\text{versus}\qquad H_1: \mu \ne \mu_0.
Like a defendant, H_0 is presumed innocent. The burden of proof is
on the data.
The logic: assume, then be surprised
The move at the heart of every test is this. Assume H_0 is true. Under that
assumption, the sampling distribution of the mean is known, at least
approximately: it is a bell centred on \mu_0. Now compute a
test statistic from the data and ask: how surprising is a value this
extreme, if H_0 really holds?
- A value near the centre is unremarkable — exactly what H_0 predicts.
- A value far out in a tail would be a rare fluke under H_0.
If the observed statistic lands far enough into the tail, we judge the data
too surprising to square with H_0, and we
reject it in favour of H_1.
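As a minimal sketch of this logic, the Python below computes a z statistic for a sample mean and the two-sided probability of a value at least that extreme under H_0. It assumes a known population standard deviation and a normal sampling distribution, neither of which is specified above. The function name `z_test`, the sample values, and the 0.05 cutoff for "far enough into the tail" are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_test(sample, mu0, sigma):
    """Two-sided z-test of H_0: mu = mu0, assuming sigma is known."""
    n = len(sample)
    se = sigma / sqrt(n)            # standard error of the sample mean
    z = (mean(sample) - mu0) / se   # distance from mu0 in z-units
    # Probability, under H_0, of a statistic at least this far from the centre
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical data: does the mean differ from mu0 = 50?
sample = [52.1, 49.8, 53.4, 51.2, 50.9, 52.7, 51.5, 50.3]
z, p = z_test(sample, mu0=50.0, sigma=2.0)
alpha = 0.05  # illustrative cutoff for "too surprising"
print(f"z = {z:.2f}, p = {p:.4f}")
print("reject H_0" if p < alpha else "fail to reject H_0")
```

A small tail probability means the statistic sits far out in a tail of the null distribution. Where exactly to draw the line is a choice made before looking at the data.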
How surprising is the data?
The null sampling distribution is the bell we would see if
H_0 were true, drawn in z-units (standard errors, that is, standard
deviations of the sampling distribution, away from \mu_0). Place the observed
statistic on that axis and see where it falls. Near the centre it is business as usual;
pushed far into a tail it becomes a value H_0 rarely produces, the kind of surprise that
argues against H_0.
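To put rough numbers on "how surprising", the sketch below tabulates the probability, under H_0, of landing at least as far from the centre as a given z value. It assumes the null distribution is exactly standard normal in z-units. The helper name `two_sided_tail` and the chosen z values are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # null distribution in z-units: mean 0, sd 1

def two_sided_tail(z):
    """P(|Z| >= |z|) when H_0 is true."""
    return 2 * (1 - std_normal.cdf(abs(z)))

# Walk the observed statistic outward from the centre into the tail
for z in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(f"z = {z:>4}: tail probability = {two_sided_tail(z):.4f}")
```

The probability shrinks quickly as the statistic moves outward: values near the centre are common under H_0, while values a few z-units out are rare.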
We never prove H_0
A test has only two verdicts: reject H_0, or
fail to reject H_0. The second is not a
proof of H_0 — exactly as "not guilty" is not the same as "innocent".
It means only that the data were not surprising enough to convict. Absence of evidence against
H_0 is not evidence that H_0 is true.
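A small simulation can make this concrete. In the sketch below H_0 is false by construction (the true mean is 50.5, not 50), yet with only ten observations per sample the test usually fails to reject it. All numbers, the 0.05 cutoff, and the known-sigma z-test are illustrative assumptions, not part of the text above.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

rng = random.Random(0)  # fixed seed so the run is repeatable
mu0, true_mu, sigma, n = 50.0, 50.5, 2.0, 10  # H_0: mu = 50 is false here
alpha, trials = 0.05, 2000                    # illustrative cutoff and repeat count

failures = 0
for _ in range(trials):
    sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    if p >= alpha:  # data not surprising enough: fail to reject
        failures += 1

fail_rate = failures / trials
print(f"failed to reject a false H_0 in {fail_rate:.0%} of trials")
```

Most runs end in "fail to reject" even though H_0 is wrong, which is why that verdict cannot be read as proof of H_0.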
- H_0 (null) is the "no effect" default; H_1 (alternative) is the claim of an effect.
- Assume H_0, then measure how surprising the data are under it.
- Surprising data (a statistic far in the tail) ⇒ reject H_0.
- "Fail to reject" ≠ "prove H_0" — we never establish the null, only fail to overturn it.