Big-O notation
You have written two programs that both solve the same problem, and both give the right answer.
One finishes in a blink; the other grinds for a minute. Which is "better"? You already know that
the honest answer isn't a number of seconds — that just measures whose laptop is newer. The
useful question is the one we met when we studied
: as the input
gets bigger, how fast does the work grow?
Big-O notation is the language we use to answer that precisely. It is a compact
way to write down the growth rate of an algorithm — how the number of steps scales with
the input size n — while deliberately throwing away the details that
stop mattering once n is large. In this one page you'll learn exactly
what gets thrown away and why, meet the handful of growth classes that cover almost everything you
will ever analyse, and see where the "worst case" convention comes from.
The two rules: drop constants, keep the biggest term
Suppose you count the steps an algorithm takes on an input of size n
and get an exact formula, say
T(n) = 3n + 5.
Big-O simplifies this to O(n) by applying two rules, in this order:
- Keep only the dominant term. The term that grows fastest is the only one
that survives, because for large n it dwarfs the rest. In
3n + 5 the 3n term races away while the
+5 stays put, so the +5 is discarded.
- Drop the constant multiplier. A leading factor like the
3 doesn't change the shape of the growth — a line is a line
whether it climbs at 3 per step or 1 — so
it is dropped too.
What remains, n, we wrap in a big O(\;) and
read aloud as "order n". A few more examples of the same machine at
work:
exact step count Big-O why
3n + 5 O(n) biggest term is n; drop 3 and +5
100n O(n) still just a line; drop the 100
n²/2 + 8n + 20 O(n²) n² beats n beats a constant
7 O(1) no n at all — constant work
6n log n + 4n O(n log n) n log n grows faster than n
2ⁿ + n⁴ O(2ⁿ) an exponential beats any power of n
The point of all this discarding is not laziness — it's fairness. The
3 and the +5 depend on your language, your
compiler and your machine; the fact that the growth is linear does not. Big-O keeps the
part that is a property of the algorithm itself.
Put numbers to it. At n = 10, the formula
3n + 5 gives 35, and the
+5 is a chunky 14\% of that. At
n = 1000 it gives 3005, and the
+5 is 0.17\%. At a million it is a rounding
error. The constant never gets bigger, but the input does — so any fixed add-on becomes
invisible. That "vanishing as n \to \infty" is precisely what Big-O
formalises: 3n + 5 and n are the same
order because their ratio settles to a constant as n grows.
The growth classes you'll meet
Almost every algorithm you analyse at this level falls into one of six growth classes. Here they
are from best (flattest) to worst (steepest), each with a mental picture and a real example:
- O(1) — constant. The work is the same no matter
how big the input is. Example: reading
list[0], or looking a value up in a
hash table. The dream.
- O(\log n) — logarithmic. Each step throws away a
fixed fraction (usually half) of what's left, so doubling n adds just
one step. Example: binary search on a sorted list.
- O(n) — linear. Work grows in lock-step with the
input; double the data, double the work. Example: a single pass through a list to find
the largest value, or linear search.
- O(n \log n) — linearithmic. A touch worse than
linear but still excellent, and the best you can do for a general-purpose sort.
Example: merge sort, quicksort (on average).
- O(n^2) — quadratic. Typically a loop inside a
loop. Double the data and the work goes up fourfold. Example: bubble sort, or
comparing every pair of items in a list.
- O(2^n) — exponential. Adding just one to
the input doubles the work. Becomes hopeless astonishingly quickly. Example:
trying every subset of a set, or naive recursive Fibonacci.
To feel how violently these separate, look at the step counts for a single, fairly small input of
n = 64:
order steps at n = 64 feels like
O(1) 1 instant
O(log n) 6 instant
O(n) 64 instant
O(n log n) 384 instant
O(n²) 4,096 a flicker
O(2ⁿ) 18,446,744,073,709,551,616 longer than the age of the universe
That last row is not a typo. An O(2^n) algorithm on an input of size
64 would still be running when the Sun burns out. This is why the
growth class, not the constant, is the thing worth arguing about.
See the growth
The chart plots number of steps against input size n for the common
orders. Notice how, over on the far left for tiny n, the curves are all
bunched together — but as you look rightwards \log n barely lifts off
the floor, n climbs steadily, n \log n pulls
a little above it, and n^2 rockets off the top of the frame almost
immediately.
The whole subject in one sentence: for small inputs the lines are close together, but for
large inputs the growth rate decides the winner by a mile. Choose the algorithm with the
flatter curve and you win as n grows.
Count the operations yourself
Let's make the difference between O(n) and
O(n^2) concrete by counting real operations. Both functions
below answer questions about the same list of n numbers.
hasDuplicateSlow compares every pair of items (a loop inside a loop —
O(n^2)); largest makes a single pass
(O(n)). Each keeps a tally of how many comparisons it performs. Press
Run and compare the counts.
function largest(list: number[]): number {
let ops = 0;
let best = list[0];
for (let i = 1; i < list.length; i++) {
ops++; // one comparison per item -> O(n)
if (list[i] > best) best = list[i];
}
return ops;
}
function hasDuplicateSlow(list: number[]): number {
let ops = 0;
for (let i = 0; i < list.length; i++) {
for (let j = i + 1; j < list.length; j++) {
ops++; // one comparison per PAIR -> O(n^2)
if (list[i] === list[j]) { /* found a duplicate */ }
}
}
return ops;
}
for (const n of [10, 20, 40, 80]) {
const list = Array.from({ length: n }, (_, i) => i + 1);
console.log(`n = ${n}: O(n) did ${largest(list)} ops, O(n^2) did ${hasDuplicateSlow(list)} ops`);
}
Look at what happens each time n doubles: the
O(n) count roughly doubles too, but the
O(n^2) count roughly quadruples. That multiplying-by-four when
you multiply the input by two is the fingerprint of quadratic growth — and it is exactly what the
n^2 in the Big-O predicts. Try adding 160 to the list of
sizes and watch the gap widen.
Best, worst, and average case
The same algorithm can do different amounts of work on different inputs of the same size.
Think about linear search for a target in a list of n items:
- Best case — the target is the very first item, so we stop after
1 comparison. That's O(1).
- Worst case — the target is last, or missing entirely, so we check all
n items. That's O(n).
- Average case — over all the places the target could be, we look at about
half the list on average, n/2 comparisons. Drop the constant and
that is also O(n).
So which one do we quote? Almost always the worst case. There are two good
reasons. First, it's a guarantee: "this will never be slower than
O(n)" is a promise you can build on, whereas "it might be fast if you
get lucky" is not. Second, the best case is often uselessly optimistic — nearly every search has
an O(1) best case, so it can't tell two algorithms apart. When
engineers design systems that must not fall over, they plan for the bad day, not the lucky one.
- An unqualified Big-O — "linear search is O(n)" — means the
worst case unless stated otherwise.
- The worst case gives an upper-bound guarantee on running time: the
algorithm will never take longer than that order.
- The average case can differ and is sometimes quoted separately (e.g. quicksort is
O(n \log n) on average but O(n^2) in the
worst case).
Big-O for space, not just time
Everything so far has counted steps — that's time complexity. But the
very same notation describes how much extra memory an algorithm needs as
n grows — its space complexity. Summing a list by
keeping one running total uses a fixed amount of extra memory however long the list is, so it is
O(1) space. Making a reversed copy of a list needs room for
n more items, so it is O(n) space. Time and
space are often traded against each other — a faster algorithm may "spend" memory to buy speed —
so it pays to state which one you mean.
Big-O is a statement about growth as n gets large —
it is not a prediction of exact running time, and it deliberately hides constants and
all the small-n behaviour. So it is perfectly possible for an
O(n^2) algorithm to beat an
O(n \log n) one for small inputs, where the hidden constants and
set-up overhead dominate. A tidy little insertion sort (O(n^2)) often
outruns merge sort (O(n \log n)) on a list of a dozen items, which is
why real sorting libraries switch to a simple sort for tiny chunks.
Two more traps to sidestep. First, O(n^2) beating
O(n \log n) at n = 10 is not a
contradiction — Big-O only claims the n \log n curve wins once
n is big enough. Second, don't confuse the notation with the
case: writing O(n) tells you the growth shape, while
"worst case" tells you which input you measured. An algorithm can be quoted as
O(1) best case and O(n) worst case — both
are Big-O, just for different scenarios.