A decision tree classifies by asking a sequence of simple yes/no questions, like a flowchart: "Is feature 1 above 2? If yes, is feature 2 below 0?" Each question splits the data, and you follow the branches down to a leaf that gives the prediction.
Every question is a single axis-aligned cut — a straight slice across one feature — so the tree carves feature space into rectangular boxes, each labelled with a class. No dot products, no gradients; just "which box are you in?"
Slide the two cuts: a vertical split on feature 1 and a horizontal split on feature 2. They divide the plane into four boxes, each coloured by the majority class of the points inside it. Find cuts that put each class neatly in its own box — that's what training a tree does, automatically.
Decision trees are interpretable — you can read the rules straight off the
flowchart — and they happily mix numbers with categories, need no