k-Means

k-means finds k clusters with a loop so simple it's almost magic. Drop k centre points anywhere, then repeat two steps until nothing moves:

  1. Assign. Colour each data point by its nearest centre.
  2. Update. Move each centre to the average position of the points assigned to it.

Assign, update, assign, update — each pass tightens the clusters, and within a handful of rounds the centres settle right into the heart of each clump.

Watch the centres settle

The two centres start in poor positions. Step the algorithm: points recolour to their nearest centre, then each centre slides to the middle of its colour. Repeat, and the centres glide into the two clusters and lock in place — the moment nothing changes, k-means has converged.

Simple, fast, imperfect

k-means is quick and scales to huge datasets, which is why it's the go-to clusterer. Its quirks: you must choose k up front, the result depends on where the centres start (so it's run several times and the best kept), and it assumes roughly round, similar-sized blobs — it struggles with long or tangled shapes. Even so, it's the first clustering tool to reach for, and a perfect example of how a trivial loop can uncover real structure.