Statistics

Statistics is the art of learning from data. The world hands us messy, partial, noisy numbers — exam scores, rainfall, click-through rates, the heights of a thousand people — and statistics is the disciplined set of tools for turning that mess into honest conclusions: a single number that summarises, a curve that describes, and a verdict that says how much we should believe it.

It splits naturally into two halves. Descriptive statistics compresses a pile of data into a few telling numbers and pictures. Inferential statistics goes further, and is more dangerous: it uses a small sample to make claims about a whole population we never fully see and, crucially, it keeps track of how wrong those claims might be.

The big idea: signal through noise

One thread runs through the whole subject. Every measurement is part signal (the thing we want to know) and part noise (the accidents of which particular data we happened to collect). Statistics is the machinery for separating the two — and for being honest about the noise that remains. Get a single number from data, and the very next question is always the same: how much would it have changed if I'd collected different data?
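
To make that question concrete, here is a minimal sketch in Python that collects "different data" many times over and watches the sample mean move. Everything in it is an assumption chosen for illustration: the true mean of 170, the noise level of 10, the sample sizes, and the helper name sample_means are hypothetical, and the noise is simply taken to be normally distributed.

```python
import random
import statistics


def sample_means(true_mean, noise_sd, sample_size, n_repeats, seed=0):
    """Collect n_repeats independent samples and return the mean of each one."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = []
    for _ in range(n_repeats):
        # Each measurement = signal (true_mean) + noise (random normal wobble).
        sample = [rng.gauss(true_mean, noise_sd) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


# Hypothetical set-up: the signal is 170, the noise has standard deviation 10.
small = sample_means(true_mean=170.0, noise_sd=10.0, sample_size=10, n_repeats=500)
large = sample_means(true_mean=170.0, noise_sd=10.0, sample_size=1000, n_repeats=500)

# How much does "the" mean change from one collected sample to the next?
spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)
print(f"spread of the mean with 10 data points:   {spread_small:.2f}")
print(f"spread of the mean with 1000 data points: {spread_large:.2f}")
```

Under these assumptions the mean of a 10-point sample should wobble by roughly 3 units from one sample to the next, and the mean of a 1000-point sample by roughly a tenth of that. The signal is the same in both cases; only the leftover noise shrinks. Stage 3 gives that wobble a name, the standard error.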

The shape of the journey

The course climbs in seven stages, each building on the last.

Stage 1 — Describing data. Centre and spread: the mean, median, variance, standard deviation, quartiles and the histogram.
Stage 2 — Distributions. The idealised shape data settles into — above all the normal bell curve, its rule of thumb, and the z-score.
Stage 3 — Sampling. From a sample to a population: bias, the sampling distribution, standard error, and the central limit theorem.
Stage 4 — Estimation. Turning a sample into a guess with a margin: point estimates and confidence intervals.
Stage 5 — Hypothesis testing. Weighing a claim against the data: null and alternative, the p-value, the t-test, and the two ways to be wrong.
Stage 6 — Relationships. How two variables move together: scatter plots, correlation, and the line of best fit.
Stage 7 — Bayesian inference. The other way to reason under uncertainty — updating a belief as evidence arrives.

Stage 1 — Describing data

Data and Variables
The Mean
The Median
The Mode
Range and Spread
Variance
Standard Deviation
Quartiles and the IQR
Histograms
Shape, Skew, and Outliers

Stage 2 — Distributions

What Is a Distribution?
The Normal Distribution
The Empirical Rule
z-Scores

Stage 3 — Sampling

Population and Sample
Sampling and Bias
The Sampling Distribution of the Mean
Standard Error
The Central Limit Theorem

Stage 4 — Estimation

Point Estimates
Confidence Intervals
A Confidence Interval for a Mean

Stage 5 — Hypothesis testing

Hypothesis Testing
The p-Value
Significance and the t-Test
Type I and Type II Errors

Stage 6 — Relationships

Scatter Plots
Correlation
The Regression Line
Interpreting Slope and Intercept

Stage 7 — Bayesian inference

The classical course above asks "how surprising is this data, if the claim were true?" The Bayesian view flips the question to "how should this data update what I believe?" Both are statistics; the second is a short arc you can take now or save for last.
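
As a taste of the flipped question, here is a small sketch of a belief being updated one observation at a time. The set-up is entirely hypothetical: a coin that is either fair or biased towards heads with probability 0.8, a 50/50 starting belief, and four made-up flips. The function name bayes_update is an illustrative choice, not part of any library.

```python
def bayes_update(prior, likelihood):
    """Return the posterior belief: prior times likelihood, rescaled to sum to 1."""
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalised.values())  # total probability of the observed data
    return {h: weight / evidence for h, weight in unnormalised.items()}


# Hypothetical set-up: two competing claims about one coin.
belief = {"fair": 0.5, "biased": 0.5}   # starting belief (the prior)
p_heads = {"fair": 0.5, "biased": 0.8}  # chance of heads under each claim

for flip in "HHTH":  # made-up evidence, arriving one flip at a time
    likelihood = {
        h: p_heads[h] if flip == "H" else 1 - p_heads[h] for h in belief
    }
    belief = bayes_update(belief, likelihood)
    print(flip, {h: round(p, 3) for h, p in belief.items()})
```

Each head nudges the belief towards "biased" and the single tail nudges it back, so after these four flips the belief leans towards "biased" without being certain of it. The lessons in this stage unpack why the update takes exactly this form.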

Bayes' Theorem
Likelihood and MLE
The Covariance Matrix
The Multivariate Gaussian
MAP Estimation

Let's get started

We begin where all statistics begins — with the raw material. Before any average or curve, you have to know what kind of thing you are measuring.

Let's get started → Data and Variables