Machine Learning

Ordinary programming means writing down the rules yourself: if this, do that. But how do you write the rules for recognizing a cat, or a spam email, or a friend's handwriting? Nobody can write them all down by hand. Machine learning flips the problem on its head: instead of coding the rules, you show the computer thousands of examples and let it learn the rules for itself.

That single shift powers almost everything that feels magical about modern computing: voice assistants, recommendations, translation, self-driving cars and the large language models you might be reading this with. Underneath the magic, though, it's all built from ideas you can actually understand — lines, slopes, distances and dot products.

The maths it's made of

Machine learning is where your maths comes alive. A data point is a vector; a linear model's prediction is a dot product; a layer of a neural network is a matrix times a vector, followed by a simple nonlinear function; and "learning" itself is just rolling downhill on an error surface using the slope, that is, the derivative. If you've worked through linear algebra and a little calculus, you already hold most of the keys.
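To make those ideas concrete, here is a minimal sketch in plain Python. Every number in it (the feature values, the weights, the toy error curve and the learning rate) is made up for illustration, and the helper names dot, matvec and slope are hypothetical, not part of any library.

```python
# A data point is a vector: three made-up feature values.
x = [1.0, 2.0, 3.0]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


# A linear model's prediction is a dot product of weights and features.
w = [0.5, -1.0, 2.0]          # made-up weights
prediction = dot(w, x)        # 0.5*1 - 1*2 + 2*3


def matvec(W, v):
    """Matrix times vector: one dot product per row of W."""
    return [dot(row, v) for row in W]


# The linear part of a network layer is a matrix times a vector.
# (A real layer would usually also add a bias and apply a nonlinearity.)
W = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0],
]
layer_output = matvec(W, x)


# "Learning" as rolling downhill: minimise the toy error (t - 3)^2,
# whose slope (derivative) is 2 * (t - 3).
def slope(t):
    return 2.0 * (t - 3.0)


t = 0.0               # starting guess
learning_rate = 0.1   # step size, chosen by hand
for _ in range(100):
    t -= learning_rate * slope(t)   # step against the slope

print(prediction, layer_output, round(t, 6))
```

Each step moves t a little way against the slope, so it settles near 3, the bottom of this particular error curve. Real models do the same thing with many weights at once, which is where the vector and matrix forms earn their keep.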

The shape of the journey

This course climbs in seven stages, each building on the last.

Stage A — Foundations. What learning from data even means: features, labels, training and the all-important idea of generalizing to new data.
Stage B — Linear regression. The "hello world" of ML: fit a line, measure its error, and roll downhill with gradient descent to make it better.
Stage C — Classification. From predicting numbers to predicting categories: the sigmoid, logistic regression, decision boundaries and nearest neighbours.
Stage D — Trees & ensembles. A completely different, rule-based style of model: decision trees and the forests built from them.
Stage E — Fitting & evaluation. The central craft of ML: overfitting, the bias–variance tradeoff, regularization, and how to score a model honestly.
Stage F — Neural networks. Stack simple neurons into deep networks, push data forward, and train them by backpropagation.
Stage G — Unsupervised learning. Finding structure with no labels at all: clustering and dimensionality reduction.

Stage A — Foundations

What Is Machine Learning?
Supervised vs Unsupervised
Features and Labels
The Feature Vector
The Training Loop
The Dataset: Train and Test
Generalization

Stage B — Linear regression

Fitting a Line
The Hypothesis Function
The Cost Function
Visualizing the Cost
Gradient Descent
The Learning Rate
Multiple Features
The Normal Equation
Feature Scaling

Stage C — Classification

Classification
The Sigmoid Function
Logistic Regression
The Decision Boundary
Cross-Entropy Loss
k-Nearest Neighbours
Multiclass Classification

Stage D — Trees & ensembles

Decision Trees
Entropy and Information Gain
Overfitting a Tree
Random Forests

Stage E — Fitting & evaluation

Overfitting and Underfitting
The Bias–Variance Tradeoff
Regularization
Train, Validation, Test
Accuracy, Precision, Recall

Stage F — Neural networks

The Neuron
Activation Functions
A Layer of Neurons
Stacking Layers
Forward Propagation
The Loss Landscape
Backpropagation
Training a Network
Why Go Deep?

Stage G — Unsupervised learning

Clustering
k-Means
Dimensionality Reduction
Principal Component Analysis

Let's get started

We begin at the very beginning — what it actually means for a machine to "learn" anything at all, and why that's such a powerful idea.
