Binary Search Trees
Binary search is
gloriously fast — but it demands a sorted array, and keeping an array sorted
as new items arrive is painful: to slot a value into the middle you must shuffle everything
after it along by one. So we get quick searching but slow inserting. Can we
have both?
Yes — if we stop storing the order in a straight row and start storing it in the
shape of a tree. A binary tree is a collection of
nodes, each holding a value and up to two children: a
left child and a right child. One node at the very top is
the root; a node with no children is a leaf. That's just the
container. A binary search tree (BST) adds one golden rule that turns the
container into a searching machine.
For every node in the tree:
- every value in its left subtree is smaller than the node's value;
- every value in its right subtree is larger than the node's value.
This must hold at each node, not just the root — so the whole tree is soaked in order. And
that order is exactly what lets us throw away half the tree at every step, just like binary
search does to an array.
Meet a binary search tree
Below is a BST built from the values
50, 30, 70, 20, 40, 60, 80. Check the golden rule anywhere you
like: at the root 50, everything to the left
(30, 20, 40) is smaller and everything to the right
(70, 60, 80) is larger; at 70, its left
child 60 is smaller and its right child 80
is larger. Now watch a search for 60 — step through and see how each
comparison lets us ignore an entire branch.
We reached 60 after visiting only three nodes —
50, 70, 60 —
never looking at 30, 20, 40 or 80 at all.
Each comparison told us which way to turn and let us discard everything down the other path.
Searching, recursively
Because a BST is made of smaller BSTs — each subtree is itself a valid BST — searching
is naturally recursive. To
look for a target starting at some node, do one of three things:
- if the node is empty (we walked off the bottom of the tree), the target
simply isn't here — stop;
- if the node's value equals the target, we've found it — stop;
- otherwise compare: if the target is smaller, search the
left child; if larger, search the right
child. Either way we recurse on a smaller subtree.
Here it is in TypeScript. We give each node a value and two child links (which are
null when there's no child), and the search logs every node it visits. Press
Run:
type BstNode = { value: number; left: BstNode | null; right: BstNode | null };
// A tiny pre-built tree: 50
// 30 70
// 20 40 60 80
const leaf = (v: number): BstNode => ({ value: v, left: null, right: null });
const root: BstNode = {
value: 50,
left: { value: 30, left: leaf(20), right: leaf(40) },
right: { value: 70, left: leaf(60), right: leaf(80) },
};
function search(node: BstNode | null, target: number): boolean {
if (node === null) { // walked off the tree
console.log(` not found — ${target} is not in the tree`);
return false;
}
console.log(` visiting ${node.value}`);
if (target === node.value) { // found it
console.log(` found ${target}!`);
return true;
}
if (target < node.value) {
return search(node.left, target); // smaller → go LEFT
} else {
return search(node.right, target); // larger → go RIGHT
}
}
console.log("Searching for 60:");
search(root, 60);
console.log("Searching for 25:");
search(root, 25);
Read the log: hunting for 60 visits
50 \to 70 \to 60 and stops; hunting for the absent
25 visits 50 \to 30 \to 20 and then
walks off an empty branch, reporting "not found".
Inserting, recursively
Inserting uses the same comparison logic — you search for where the value
ought to be, and when you fall off the bottom of the tree (reach an empty spot), that
empty spot is exactly where the new leaf belongs. This is why a BST gives us fast inserts as
well as fast searches: dropping in a new value is just one walk down from the root.
type BstNode = { value: number; left: BstNode | null; right: BstNode | null };
// Insert `value`, returning the (possibly new) root of this subtree.
function insert(node: BstNode | null, value: number): BstNode {
if (node === null) { // empty spot → the new leaf goes here
return { value, left: null, right: null };
}
if (value < node.value) {
node.left = insert(node.left, value); // smaller → build it into the LEFT
} else {
node.right = insert(node.right, value); // larger → build it into the RIGHT
}
return node;
}
// Build a tree by inserting values one at a time into an empty tree (null).
let root: BstNode | null = null;
for (const v of [50, 30, 70, 20, 40, 60, 80]) {
root = insert(root, v);
}
// Read it back in order (left subtree, node, right subtree) — a BST trick:
function inOrder(node: BstNode | null): void {
if (node === null) return;
inOrder(node.left);
console.log(node.value);
inOrder(node.right);
}
console.log("The tree, read left-node-right:");
inOrder(root);
The inOrder walk visits a node's whole left subtree first, then the node,
then its whole right subtree. But the BST rule says everything on the left is smaller
and everything on the right is larger — so this order is guaranteed to hand back the values
smallest to largest, no matter what order they were inserted. A BST is,
secretly, a sorted list you can also insert into quickly. (This is the heart of an algorithm
called tree sort: insert everything, then read it back in order.)
Why it's fast: about \log_2 n steps
Each comparison in a search sends us down to exactly one child and lets us ignore the other
child's entire subtree. If the tree is balanced — roughly the same depth on
both sides — then each step throws away about half the remaining nodes, precisely as
binary search halves an array. So a balanced BST of n nodes is only
about \log_2 n levels deep, and search and insert each cost roughly
\log_2 n steps.
\text{balanced BST: search} \approx \log_2 n \quad\text{steps}
The numbers are the same jaw-dropping ones as binary search: a balanced tree of
1000 values is found in about 10 steps
(because 2^{10} = 1024); a million values, about
20 steps. Doubling the data adds just one extra
level. That is the whole reason BSTs (and their balanced cousins, and database indexes built
on the same idea) are everywhere.
The lovely \log_2 n speed relies entirely on the tree staying
reasonably balanced — and it isn't guaranteed. Watch what happens if you
insert values that are already sorted, say
10, 20, 30, 40, 50: every new value is larger than the last, so it
always goes right. The tree never branches — it collapses into a lopsided chain:
This is a degenerate BST, and it's really just a
linked list in disguise. There's no "other half" to throw away at each step,
so searching it means walking down every node — back to a slow n
comparisons, not \log_2 n. The ordering rule is still perfectly
obeyed; it's the shape that ruined the speed.
The fix is to keep the tree balanced. Inserting values in a shuffled order usually keeps a
tree roughly balanced by luck; and for a cast-iron guarantee, computer scientists invented
self-balancing trees (such as AVL and red–black trees) that quietly
rearrange themselves after each insert to stay shallow. So: a BST is only fast if it stays
bushy, not stringy.