Statistics
Statistics is the art of learning from data. The world hands us messy,
partial, noisy numbers — exam scores, rainfall, click-through rates, the heights of a thousand
people — and statistics is the disciplined set of tools for turning that mess into honest
conclusions: a single number that summarises, a curve that describes, and a
verdict that says how much we should believe them.
It splits naturally into two halves. Descriptive statistics compresses a pile
of data into a few telling numbers and pictures. Inferential statistics goes
further, and more dangerously: it uses a small sample to make claims about a whole
population we never fully see — and, crucially, it keeps track of how wrong those
claims might be.
The big idea: signal through noise
One thread runs through the whole subject. Every measurement is part signal (the thing
we want to know) and part noise (the accidents of which particular data we happened to
collect). Statistics is the machinery for separating the two — and for being honest about the
noise that remains. Get a single number from data, and the very next question is always the
same: how much would it have changed if I'd collected different data?
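To make that question concrete, here is a minimal sketch in Python. It assumes a made-up population of 10,000 heights (the numbers 170 and 10 are illustrative, not real data) and a hypothetical helper called sample_means. It draws many different samples of 50 people and shows how much the "single number", here the sample mean, moves from one sample to the next.

```python
import random
import statistics


def sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical population: 10,000 heights in cm (illustrative values only).
pop_rng = random.Random(42)
population = [pop_rng.gauss(170, 10) for _ in range(10_000)]

# Collect "different data" 1,000 times and record the mean each time.
means = sample_means(population, sample_size=50, n_samples=1_000)
spread = statistics.stdev(means)

print(f"population mean (signal): {statistics.fmean(population):.2f}")
print(f"sample means range from {min(means):.2f} to {max(means):.2f}")
print(f"spread of the sample means (noise): {spread:.2f}")
```

Each sample mean lands near the population mean but not exactly on it. The spread of those means is one way to put a number on the noise, an idea Stage 3 develops as the standard error.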
The shape of the journey
The course climbs in seven stages, each building on the last.
- Stage 1 — Describing data. Centre and spread: the mean, median, variance,
standard deviation, quartiles and the histogram.
- Stage 2 — Distributions. The idealised shape data settles into — above all
the normal bell curve, its 68-95-99.7 rule of thumb, and the z-score.
- Stage 3 — Sampling. From a sample to a population: bias, the sampling
distribution, standard error, and the central limit theorem.
- Stage 4 — Estimation. Turning a sample into a guess with a margin: point
estimates and confidence intervals.
- Stage 5 — Hypothesis testing. Weighing a claim against the data: null and
alternative, the p-value, the t-test,
and the two ways to be wrong.
- Stage 6 — Relationships. How two variables move together: scatter plots,
correlation, and the line of best fit.
- Stage 7 — Bayesian inference. The other way to reason under uncertainty —
updating a belief as evidence arrives.
Stage 1 — Describing data
- Data and Variables
- The Mean
- The Median
- The Mode
- Range and Spread
- Variance
- Standard Deviation
- Quartiles and the IQR
- Histograms
- Shape, Skew, and Outliers
Stage 2 — Distributions
- What Is a Distribution?
- The Normal Distribution
- The Empirical Rule
- z-Scores
Stage 3 — Sampling
- Population and Sample
- Sampling and Bias
- The Sampling Distribution of the Mean
- Standard Error
- The Central Limit Theorem
Stage 4 — Estimation
- Point Estimates
- Confidence Intervals
- A Confidence Interval for a Mean
Stage 5 — Hypothesis testing
- Hypothesis Testing
- The p-Value
- Significance and the t-Test
- Type I and Type II Errors
Stage 6 — Relationships
- Scatter Plots
- Correlation
- The Regression Line
- Interpreting Slope and Intercept
Stage 7 — Bayesian inference
The classical course above asks "how surprising is this data, if the claim were true?" The
Bayesian view flips the question to "how should this data update what I believe?" Both
are statistics; the second is a short arc you can take now or save for last.
- Bayes' Theorem
- Likelihood and MLE
- The Covariance Matrix
- The Multivariate Gaussian
- MAP Estimation
Let's get started
We begin where all statistics begins — with the raw material. Before any average or curve, you
have to know what kind of thing you are measuring.
Let's get started → Data and Variables