k-Nearest Neighbours
Here's a classifier with no training at all. To label a new point,
k-nearest neighbours (k-NN) just looks at the k
closest examples in the training data and takes a majority vote of their
labels. "You are like your neighbours." That's the whole algorithm.
"Closest" means smallest
distance
in feature space — the length of the difference between the two feature vectors. So once data is
vectors, k-NN is almost free to describe.
Ask the neighbours
Move the grey query point through the data and choose how many neighbours
k get a vote. The k nearest examples light
up, and the query is coloured by their majority. Notice how the prediction can flip as you cross
between the two clouds — or as you change k.
Simple, but with a catch
k-NN is wonderfully intuitive and makes no assumptions about the shape of the boundary. But it's
lazy in a costly way: every prediction must measure the distance to every training
point, which is slow on big data. It's also highly sensitive to feature scales — a reason
feature
scaling matters — and to the choice of k: too small and
it chases noise, too large and it blurs the classes together.