Sampling and Bias

A sample is only useful if it looks like the population it came from. We call such a sample representative: its mix of values mirrors the whole. The danger is a sample that systematically over- or under-counts some part of the population — then the statistic we compute points in the wrong direction, no matter how carefully we measure.

Randomness is the safeguard

The reliable way to get a representative sample is to choose it at random: give every member of the population an equal chance of being picked. Randomness doesn't guarantee a perfect sample, but it removes any hidden tilt. On average the sample mean \bar{x} sits at the population mean \mu, and the only error left is ordinary chance, which shrinks as the sample grows (for a simple random sample of size n, roughly in proportion to 1/\sqrt{n}).
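To make this concrete, here is a minimal simulation sketch. The population (10,000 bell-shaped values around 50), the seed, and the helper names `sample_means` and `summarize` are all assumptions chosen for illustration. The sketch draws many simple random samples at three sizes and records where \bar{x} lands on average and how much it varies from sample to sample.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values, roughly bell-shaped around 50.
population = [rng.gauss(50, 10) for _ in range(10_000)]
mu = statistics.fmean(population)  # the true population mean


def sample_means(n, repeats=500):
    """Draw `repeats` simple random samples of size n and return their means."""
    return [statistics.fmean(rng.sample(population, n)) for _ in range(repeats)]


def summarize(n):
    """Return (average of the sample means, spread of the sample means)."""
    means = sample_means(n)
    centre = statistics.fmean(means)   # where x-bar sits on average
    spread = statistics.pstdev(means)  # typical chance error of one sample
    return centre, spread


results = {n: summarize(n) for n in (10, 100, 1000)}
for n, (centre, spread) in results.items():
    print(f"n={n:>4}  average x-bar={centre:6.2f}  spread={spread:5.2f}  (mu={mu:.2f})")
```

In a run like this, the average \bar{x} should stay close to \mu at every sample size, while the spread should drop as n grows. That is the pattern described above: no tilt, only chance error that shrinks.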

Bias: a tilt no sample size can fix

Bias is a systematic error — a built-in tilt in how the sample is collected, so it consistently misses the target in the same direction. Two common culprits:

Selection bias — some members are more likely to be chosen than others (for example, surveying only people at a gym about exercise habits).
Non-response bias — the people who decline differ from those who answer, quietly skewing the result.
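Both culprits can be imitated in a few lines. The sketch below is illustrative only: the population of weekly exercise hours, the rule that the chance of being found at the gym grows with hours exercised, and the response rule `h / 10` are all made-up assumptions, not data from a real survey.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population: weekly exercise hours for 10,000 people (mean about 3).
population = [rng.expovariate(1 / 3) for _ in range(10_000)]
mu = statistics.fmean(population)

# Fair selection: every person is equally likely to be picked.
fair = rng.sample(population, 500)

# Selection bias: surveying at the gym, so the chance of being picked
# grows with hours exercised (assumed weighting; drawn with replacement).
gym = rng.choices(population, weights=population, k=500)

# Non-response bias: invitations are fair, but people who exercise more
# are assumed to be more likely to answer.
invited = rng.sample(population, 2000)
answered = [h for h in invited if rng.random() < min(1.0, h / 10)]

fair_mean = statistics.fmean(fair)
gym_mean = statistics.fmean(gym)
answered_mean = statistics.fmean(answered)

print(f"population mean mu     = {mu:.2f}")
print(f"fair sample mean       = {fair_mean:.2f}")
print(f"gym sample mean        = {gym_mean:.2f}")
print(f"responders-only mean   = {answered_mean:.2f} ({len(answered)} of {len(invited)} answered)")
```

Under these assumptions, the fair sample should land near \mu, while the gym sample and the responders-only sample should both overstate it. Note that the second biased sample started from a fair invitation list: the tilt came entirely from who chose to answer.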

The crucial point: bias is not cured by collecting more data. A bigger biased sample just pins down the wrong answer more precisely. Only fixing the method — making the selection fair — removes it.
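A short simulation can illustrate why more data does not help. This is a sketch under stated assumptions: the population is synthetic, and the biased method is modelled as one that can only ever reach the upper half of the population. The names `high_end` and `biased_run` are hypothetical.

```python
import random
import statistics

rng = random.Random(2)  # fixed seed so the run is repeatable

# Hypothetical population: 20,000 values, roughly bell-shaped around 50.
population = [rng.gauss(50, 10) for _ in range(20_000)]
mu = statistics.fmean(population)

# Biased method: only the top half of the population can ever be selected.
cutoff = statistics.median(population)
high_end = [x for x in population if x >= cutoff]


def biased_run(n, repeats=200):
    """Return (average, spread) of sample means drawn only from the high end."""
    means = [statistics.fmean(rng.sample(high_end, n)) for _ in range(repeats)]
    return statistics.fmean(means), statistics.pstdev(means)


report = {n: biased_run(n) for n in (10, 100, 1000)}
for n, (centre, spread) in report.items():
    print(f"n={n:>4}  average x-bar={centre:6.2f}  spread={spread:5.2f}  "
          f"offset from mu={centre - mu:+.2f}")
```

As n grows, the spread should shrink, so each biased sample agrees more closely with the others. The offset from \mu should stay about the same size throughout, because it comes from the selection method rather than from chance.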

Random vs biased, side by side

Compare a random sample, whose points are spread evenly across the population, with a biased one that draws only from the high end. Watch the sample mean \bar{x}: the random sample keeps it near the true mean \mu, while the biased sample drags it well off target, and gathering more of the same biased points would never bring it back.

A good sample is representative — it mirrors the population.
Random sampling gives every member a fair chance, so \bar{x} centres on \mu.
Bias is a systematic tilt in how the sample is collected (selection, non-response).
A bigger biased sample is still biased — more data sharpens the wrong answer, it doesn't fix it.