Imagine you are handed a flat row of tokens — 2 + 3 * 4 — and asked to rebuild the
parse tree that a
It is the same tree either way. But "predict from the top" and "assemble from the bottom" lead to two genuinely different families of parsers — LL (recursive descent) and LR (shift-reduce) — with different strengths, and this page is about the difference.
A tidy way to remember them: think of a jigsaw. Top-down is working from the picture on the box, deciding "this must be a corner of the sky" before you have the pieces. Bottom-up is ignoring the box and just snapping together any two pieces that fit, trusting that the whole picture will emerge.
A top-down parser reads the grammar as a set of predictions. Sitting on a nonterminal, it asks: "given the next token or two of lookahead, which of this nonterminal's productions could possibly start here?" If the grammar is nice enough that one token of lookahead always answers that question uniquely, the grammar is LL(1) — the first L is "read left-to-right", the second is "build a leftmost derivation", and the 1 is "one token of lookahead".
The machinery that makes the prediction is the pair of sets FIRST and FOLLOW. Informally:
When those sets don't overlap, prediction is unambiguous and no backtracking is ever needed — the parser commits to a rule and never regrets it.
Top-down parsing expands the leftmost nonterminal at every step: it fully commits to the left branch of the tree before touching anything to its right. That sequence of expansions, starting from the start symbol, is a leftmost derivation of the input — which is exactly what the second L in LL names. (Bottom-up parsers, we'll see, trace a rightmost derivation, but backwards.)
The beautiful thing about LL parsing is that you can write it by hand, with no tables at all. The recipe — called recursive descent — is almost mechanical: turn each nonterminal of the grammar into a function, and let those functions call each other exactly the way the grammar rules refer to each other. The nesting of the calls mirrors the shape of the tree.
Take this grammar for arithmetic, laid out so that precedence and associativity are baked into its
shape — Terms (products) are grouped inside Exprs (sums), so
* binds tighter than +:
Each line becomes a function. parseExpr parses one Term, then loops while it
sees + or -; parseTerm parses one Factor, then
loops on * or /; parseFactor reads a number, or — if it sees an
open bracket — calls all the way back up to parseExpr for the sub-expression. Here is a
real, complete recursive-descent parser: a tokenizer, the three functions producing an AST,
and a tiny evaluator. Press Run:
Notice what the output proves. 2 + 3 * 4 gives 14, not 20 — the
* ends up below the + in the tree because parseTerm
greedily swallows 3 * 4 before parseExpr ever adds. And
(2 + 3) * 4 gives 20, because the brackets force
parseFactor to recurse. Left-associativity of - falls out too:
20 - 4 - 2 is 14, i.e. (20 - 4) - 2, because the
while loop keeps folding new terms onto the left. Precedence and associativity
live entirely in the structure of the functions.
You might wonder why the grammar above used that Term (('+'|'-') Term)* loop instead of
the more natural-looking
This rule is left-recursive: the very first thing Expr can expand to is
Expr again. Translate it naively into recursive descent and the first line of
parseExpr would be "call parseExpr" — with no token consumed first. The
function calls itself forever and the stack overflows. It has no lookahead to distinguish the two
alternatives, because both begin by trying to parse an Expr.
The fix is left-recursion elimination: rewrite the rule so recursion happens on the
right, after a token has been consumed. The mechanical transformation turns
A -> A α | β into a β followed by a repetition of α — which
is exactly the Term (('+'|'-') Term)* loop we used, expressed with a while.
Every recursive-descent (LL) grammar must be de-left-recursed first; it is one of the standard
limitations of the top-down approach.
Expr -> Expr '+' Term makes parseExpr call itself with no progress,
spinning into infinite recursion until the stack blows. LL grammars must have left recursion
removed first (rewrite it as a loop). This is not a problem for bottom-up LR parsers,
which handle left recursion happily.
Now flip the direction. A bottom-up parser keeps a stack and makes only two kinds of move:
The parser alternates shifting and reducing until the whole input has been consumed and the stack holds exactly the start symbol. Because every reduce recognises a completed piece and combines it, the tree grows from the leaves up. The sequence of reductions, read backwards, is a rightmost derivation — hence the R in LR.
Here is a shift-reduce parse of 2 + 3 * 4 against a small expression grammar
(E -> E + E, E -> E * E, E -> num, with *
given higher precedence so the parser knows when to hold off reducing). Read the
action column: the parser shifts a token, or reduces the handle on top of the stack.
| Stack | Remaining input | Action |
|---|---|---|
| 2 + 3 * 4 | shift 2 |
2 | + 3 * 4 | reduce E -> num |
E | + 3 * 4 | shift + |
E + | 3 * 4 | shift 3 |
E + 3 | * 4 | reduce E -> num |
E + E | * 4 | shift * (don't reduce yet — * binds tighter) |
E + E * | 4 | shift 4 |
E + E * 4 | | reduce E -> num |
E + E * E | | reduce E -> E * E |
E + E | | reduce E -> E + E |
E | | accept |
The pivotal moment is line six. The stack holds E + E and the parser could reduce
the sum right there — but the next token is *, which binds tighter, so it
shifts instead, waiting to build 3 * 4 first. That decision — reduce now,
or shift and reduce later? — is exactly what an LR parser's generated table encodes. Get it right and
2 + 3 * 4 reduces to the same tree the recursive-descent parser built, with
* nested below +.
A top-down parser must decide which rule to use before it has seen the rule's contents — it commits based on lookahead alone. A bottom-up parser gets to wait until it has already shifted the entire right-hand side onto the stack before it has to commit to reducing. It decides with the whole handle in view, not a prediction. That extra patience is precisely why LR parsers recognise a strictly larger class of grammars than LL parsers — every LL(1) grammar is LR(1), but not vice versa, and LR copes with left recursion out of the box.
In practice, LR parsers are almost never written by hand. Their tables — the states that say "in this situation, shift; in that one, reduce by rule 7" — are large and mechanical, so tools generate them. The classic ones are yacc and its modern cousins bison; the flavours you'll hear named — LR(0), SLR, LALR, canonical LR(1) — differ only in how cleverly they use lookahead to resolve shift/reduce decisions. LALR (what yacc/bison produce) is the sweet spot: nearly the power of full LR(1) with far smaller tables.
If LL is the one you can write by hand, why does every serious parser generator (yacc, bison, and for a long time the grammars behind C, C++, and countless others) reach for LR? Two reasons. First, power: LR accepts a strictly bigger class of grammars, so you can write your language's grammar in its most natural left-recursive form and hand it straight to the tool — no contortions. Second, automation: the table is generated from the grammar, so the grammar stays the single source of truth. The price you pay is error messages. When an LR parse fails it fails deep inside an opaque state machine, so "syntax error near line 42" is about all it can easily say — whereas a hand-written recursive-descent parser knows exactly which function it was in and can produce a friendly, specific message. That is why several modern compilers (notably some C/C++ and Rust front ends) have swung back to hand-written recursive descent: they trade a little grammar power for dramatically better diagnostics.