Big-O notation

You have written two programs that both solve the same problem, and both give the right answer. One finishes in a blink; the other grinds for a minute. Which is "better"? You already know that the honest answer isn't a number of seconds — that just measures whose laptop is newer. The useful question is the one we met when we studied : as the input gets bigger, how fast does the work grow?

Big-O notation is the language we use to answer that precisely. It is a compact way to write down the growth rate of an algorithm — how the number of steps scales with the input size n — while deliberately throwing away the details that stop mattering once n is large. In this one page you'll learn exactly what gets thrown away and why, meet the handful of growth classes that cover almost everything you will ever analyse, and see where the "worst case" convention comes from.

The two rules: drop constants, keep the biggest term

Suppose you count the steps an algorithm takes on an input of size n and get an exact formula, say

T(n) = 3n + 5.

Big-O simplifies this to O(n) by applying two rules, in this order:

Keep only the dominant term. The term that grows fastest is the only one that survives, because for large n it dwarfs the rest. In 3n + 5 the 3n term races away while the +5 stays put, so the +5 is discarded.
Drop the constant multiplier. A leading factor like the 3 doesn't change the shape of the growth — a line is a line whether it climbs at 3 per step or 1 — so it is dropped too.

What remains, n, we wrap in a big O(\;) and read aloud as "order n". A few more examples of the same machine at work:

exact step count Big-O why 3n + 5 O(n) biggest term is n; drop 3 and +5 100n O(n) still just a line; drop the 100 n²/2 + 8n + 20 O(n²) n² beats n beats a constant 7 O(1) no n at all — constant work 6n log n + 4n O(n log n) n log n grows faster than n 2ⁿ + n⁴ O(2ⁿ) an exponential beats any power of n

The point of all this discarding is not laziness — it's fairness. The 3 and the +5 depend on your language, your compiler and your machine; the fact that the growth is linear does not. Big-O keeps the part that is a property of the algorithm itself.

Put numbers to it. At n = 10, the formula 3n + 5 gives 35, and the +5 is a chunky 14\% of that. At n = 1000 it gives 3005, and the +5 is 0.17\%. At a million it is a rounding error. The constant never gets bigger, but the input does — so any fixed add-on becomes invisible. That "vanishing as n \to \infty" is precisely what Big-O formalises: 3n + 5 and n are the same order because their ratio settles to a constant as n grows.

The growth classes you'll meet

Almost every algorithm you analyse at this level falls into one of six growth classes. Here they are from best (flattest) to worst (steepest), each with a mental picture and a real example:

O(1) — constant. The work is the same no matter how big the input is. Example: reading list[0], or looking a value up in a hash table. The dream.
O(\log n) — logarithmic. Each step throws away a fixed fraction (usually half) of what's left, so doubling n adds just one step. Example: binary search on a sorted list.
O(n) — linear. Work grows in lock-step with the input; double the data, double the work. Example: a single pass through a list to find the largest value, or linear search.
O(n \log n) — linearithmic. A touch worse than linear but still excellent, and the best you can do for a general-purpose sort. Example: merge sort, quicksort (on average).
O(n^2) — quadratic. Typically a loop inside a loop. Double the data and the work goes up fourfold. Example: bubble sort, or comparing every pair of items in a list.
O(2^n) — exponential. Adding just one to the input doubles the work. Becomes hopeless astonishingly quickly. Example: trying every subset of a set, or naive recursive Fibonacci.

To feel how violently these separate, look at the step counts for a single, fairly small input of n = 64:

order steps at n = 64 feels like O(1) 1 instant O(log n) 6 instant O(n) 64 instant O(n log n) 384 instant O(n²) 4,096 a flicker O(2ⁿ) 18,446,744,073,709,551,616 longer than the age of the universe

That last row is not a typo. An O(2^n) algorithm on an input of size 64 would still be running when the Sun burns out. This is why the growth class, not the constant, is the thing worth arguing about.

See the growth

The chart plots number of steps against input size n for the common orders. Notice how, over on the far left for tiny n, the curves are all bunched together — but as you look rightwards \log n barely lifts off the floor, n climbs steadily, n \log n pulls a little above it, and n^2 rockets off the top of the frame almost immediately.

The whole subject in one sentence: for small inputs the lines are close together, but for large inputs the growth rate decides the winner by a mile. Choose the algorithm with the flatter curve and you win as n grows.

Count the operations yourself

Let's make the difference between O(n) and O(n^2) concrete by counting real operations. Both functions below answer questions about the same list of n numbers. hasDuplicateSlow compares every pair of items (a loop inside a loop — O(n^2)); largest makes a single pass (O(n)). Each keeps a tally of how many comparisons it performs. Press Run and compare the counts.

Look at what happens each time n doubles: the O(n) count roughly doubles too, but the O(n^2) count roughly quadruples. That multiplying-by-four when you multiply the input by two is the fingerprint of quadratic growth — and it is exactly what the n^2 in the Big-O predicts. Try adding 160 to the list of sizes and watch the gap widen.

Best, worst, and average case

The same algorithm can do different amounts of work on different inputs of the same size. Think about linear search for a target in a list of n items:

Best case — the target is the very first item, so we stop after 1 comparison. That's O(1).
Worst case — the target is last, or missing entirely, so we check all n items. That's O(n).
Average case — over all the places the target could be, we look at about half the list on average, n/2 comparisons. Drop the constant and that is also O(n).

So which one do we quote? Almost always the worst case. There are two good reasons. First, it's a guarantee: "this will never be slower than O(n)" is a promise you can build on, whereas "it might be fast if you get lucky" is not. Second, the best case is often uselessly optimistic — nearly every search has an O(1) best case, so it can't tell two algorithms apart. When engineers design systems that must not fall over, they plan for the bad day, not the lucky one.

An unqualified Big-O — "linear search is O(n)" — means the worst case unless stated otherwise.
The worst case gives an upper-bound guarantee on running time: the algorithm will never take longer than that order.
The average case can differ and is sometimes quoted separately (e.g. quicksort is O(n \log n) on average but O(n^2) in the worst case).

Big-O for space, not just time

Everything so far has counted steps — that's time complexity. But the very same notation describes how much extra memory an algorithm needs as n grows — its space complexity. Summing a list by keeping one running total uses a fixed amount of extra memory however long the list is, so it is O(1) space. Making a reversed copy of a list needs room for n more items, so it is O(n) space. Time and space are often traded against each other — a faster algorithm may "spend" memory to buy speed — so it pays to state which one you mean.

Big-O is a statement about growth as n gets large — it is not a prediction of exact running time, and it deliberately hides constants and all the small-n behaviour. So it is perfectly possible for an O(n^2) algorithm to beat an O(n \log n) one for small inputs, where the hidden constants and set-up overhead dominate. A tidy little insertion sort (O(n^2)) often outruns merge sort (O(n \log n)) on a list of a dozen items, which is why real sorting libraries switch to a simple sort for tiny chunks.

Two more traps to sidestep. First, O(n^2) beating O(n \log n) at n = 10 is not a contradiction — Big-O only claims the n \log n curve wins once n is big enough. Second, don't confuse the notation with the case: writing O(n) tells you the growth shape, while "worst case" tells you which input you measured. An algorithm can be quoted as O(1) best case and O(n) worst case — both are Big-O, just for different scenarios.