k-Nearest Neighbours

Here's a classifier with no training at all. To label a new point, k-nearest neighbours (k-NN) just looks at the k closest examples in the training data and takes a majority vote of their labels. "You are like your neighbours." That's the whole algorithm.

"Closest" means smallest distance in feature space — the length of the difference between the two feature vectors. So once data is vectors, k-NN is almost free to describe.

Ask the neighbours

Move the grey query point through the data and choose how many neighbours k get a vote. The k nearest examples light up, and the query is coloured by their majority. Notice how the prediction can flip as you cross between the two clouds — or as you change k.

Simple, but with a catch

k-NN is wonderfully intuitive and makes no assumptions about the shape of the boundary. But it's lazy in a costly way: every prediction must measure the distance to every training point, which is slow on big data. It's also highly sensitive to feature scales — a reason feature scaling matters — and to the choice of k: too small and it chases noise, too large and it blurs the classes together.