Fitting a Line

Linear regression is the "hello world" of machine learning: predict a number from features by fitting a straight line through the data. Simple as it is, it contains the whole skeleton of supervised learning — a model, an error to minimize, and a way to improve — so it's the perfect place to start.

With one feature x, the model is just a line, defined by a slope and an intercept. Once it's fitted, you feed in a new x and read off the predicted y. That's the payoff.
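As a minimal sketch of "feed in a new x and read off the predicted y", the line can be written as a small function. The name `predict` and the slope and intercept values below are assumptions chosen for illustration, not values from a real fit.

```python
def predict(x, slope, intercept):
    """Return the y value that the line assigns to x."""
    return slope * x + intercept

# Hypothetical fitted values, chosen only for illustration.
slope, intercept = 2.0, 1.0

# A new x goes in, a predicted y comes out: 2.0 * 3.0 + 1.0
y_new = predict(3.0, slope, intercept)
print(y_new)  # 7.0
```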

Fit it, then predict

Adjust the slope and intercept to fit the points, watching the total error fall. Then look at the green query marker: once your line is good, it predicts a sensible y for an x that isn't in the data at all — which is the entire reason we bothered to fit a line.
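The same experiment can be sketched in code. This assumes the total error is the sum of squared gaps between the line and the points, which is one common choice and may not be the exact measure used here. The data points, the function names, and the query value are all made up for illustration.

```python
# Toy data, made up for illustration; these points lie exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

def predict(x, slope, intercept):
    """Return the y value that the line assigns to x."""
    return slope * x + intercept

def total_error(slope, intercept):
    """Sum of squared gaps between predicted and actual y (an assumed error measure)."""
    return sum((predict(x, slope, intercept) - y) ** 2 for x, y in zip(xs, ys))

rough = total_error(1.0, 0.0)   # a poor first guess
better = total_error(2.0, 1.0)  # after adjusting slope and intercept

query_x = 10.0                  # an x that is not in the data
query_y = predict(query_x, 2.0, 1.0)

print(rough, better, query_y)   # 54.0 0.0 21.0
```

The error drops from 54.0 to 0.0 as the line moves onto the points, and the better line then gives a sensible y for an x it never saw.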

The plan from here

"Adjust until it fits" is fine by hand, but we need to make it precise and automatic. That takes three pieces: a model with tunable knobs, a cost that scores how wrong it is, and an algorithm — gradient descent — that turns the knobs to drive the cost down. We build them one at a time.
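As a rough preview of how the three pieces fit together, here is one possible sketch. It assumes mean squared error as the cost and a fixed learning rate of 0.05. The data, the names `model`, `cost`, and `gradient`, and the step count are illustrative choices rather than definitions from this text.

```python
# Toy data, made up for illustration; these points lie exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
n = len(xs)

def model(x, w, b):
    """Piece 1: the model, a line with two tunable knobs (slope w, intercept b)."""
    return w * x + b

def cost(w, b):
    """Piece 2: the cost, here the mean squared error over the data (an assumption)."""
    return sum((model(x, w, b) - y) ** 2 for x, y in zip(xs, ys)) / n

def gradient(w, b):
    """Partial derivatives of the mean squared error with respect to w and b."""
    dw = sum(2 * (model(x, w, b) - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (model(x, w, b) - y) for x, y in zip(xs, ys)) / n
    return dw, db

# Piece 3: gradient descent, which nudges the knobs downhill on the cost.
w, b = 0.0, 0.0
learning_rate = 0.05  # assumed step size, small enough to be stable on this data
start_cost = cost(w, b)

for _ in range(2000):
    dw, db = gradient(w, b)
    w -= learning_rate * dw
    b -= learning_rate * db

end_cost = cost(w, b)
print(round(w, 3), round(b, 3))  # close to slope 2 and intercept 1
```

On this toy data the loop recovers a slope near 2 and an intercept near 1, and the cost ends far below where it started. A learning rate that is too large would make the updates overshoot and diverge, so the value here is tuned to this small example.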