When two classes of points can be separated by a straight line, there are usually
infinitely many lines that do the job. A
Step through the idea — the same points, then the maximum-margin boundary, then what actually determines it:
The margin is the strip of empty space on either side of the boundary, reaching out to the nearest points. An SVM chooses the boundary that makes this strip as wide as possible — a "maximum-margin classifier." A wide margin means the boundary is committed, robust, and generalises better to unseen data.
Here's the striking part: only the handful of points sitting right on the edge of the margin matter. Those are the support vectors — they alone "hold up" the boundary. Every other point could be moved (or deleted) without changing the answer at all. The whole model is pinned by just a few critical examples.
Real data isn't always perfectly separable, so a soft-margin SVM allows a few
points to sit inside the margin or on the wrong side, trading a little error for a wider, more robust
boundary. A parameter
What if no straight line can separate the classes — say one class forms a ring around the other? The SVM's secret weapon is the kernel trick: map the points into a higher-dimensional space where they do become linearly separable, find the flat boundary there, and it curves back into a nonlinear boundary in the original space.
The magic is that you never actually compute those high-dimensional coordinates. An SVM only ever
needs dot products between points, and a kernel function
Picture red points in a circle surrounded by blue points — no line separates them in the plane.
Now add a third coordinate