Top-Down and Bottom-Up Parsing

Imagine you are handed a flat row of tokens — 2 + 3 * 4 — and asked to rebuild the parse tree that a context-free grammar says it should have. There are two honest ways to do it, and they run in opposite directions.

Top-down. Start at the root — the grammar's start symbol — and keep guessing which rule to apply, growing the tree downward until its leaves happen to spell out your tokens. You predict the structure before you have seen most of the input.
Bottom-up. Start at the leaves — the tokens themselves — and glue small finished pieces together into bigger ones, growing the tree upward until a single root remains. You recognise structure only after you have collected its parts.

It is the same tree either way. But "predict from the top" and "assemble from the bottom" lead to two genuinely different families of parsers — LL (recursive descent) and LR (shift-reduce) — with different strengths, and this page is about the difference.

A tidy way to remember them: think of a jigsaw. Top-down is working from the picture on the box, deciding "this must be a corner of the sky" before you have the pieces. Bottom-up is ignoring the box and just snapping together any two pieces that fit, trusting that the whole picture will emerge.

Top-down: predict the tree from the root

A top-down parser reads the grammar as a set of predictions. Sitting on a nonterminal, it asks: "given the next token or two of lookahead, which of this nonterminal's productions could possibly start here?" If the grammar is nice enough that one token of lookahead always answers that question uniquely, the grammar is LL(1) — the first L is "read left-to-right", the second is "build a leftmost derivation", and the 1 is "one token of lookahead".

The machinery that makes the prediction is the pair of sets FIRST and FOLLOW. Informally:

FIRST(rule) is the set of tokens that a rule can begin with. To choose between two productions, compare the next input token against each production's FIRST set — the one it belongs to is the one to pick.
FOLLOW(nonterminal) is the set of tokens that can legally appear right after that nonterminal. You need it to decide when a nonterminal that can vanish (produce the empty string) should vanish, because the next token belongs to whatever comes next.

When those sets don't overlap, prediction is unambiguous and no backtracking is ever needed — the parser commits to a rule and never regrets it.
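To make FIRST and FOLLOW concrete, here is a minimal sketch that computes both by fixed-point iteration and then uses them to pick a production from one token of lookahead. The grammar encoding, the ExprTail nonterminal, and the helper names (addAll, firstOfSeq, predict) are assumptions made for illustration, not part of any particular tool.

```typescript
// Assumed encoding: nonterminal -> list of productions; [] is the empty production.
type Grammar = Record<string, string[][]>;
type Sets = Record<string, string[]>;

const EPS = "ε"; // marks "can produce the empty string"
const END = "$"; // end-of-input marker

// Expr -> Term ExprTail ; ExprTail -> '+' Term ExprTail | ε ; Term -> num | '(' Expr ')'
const grammar: Grammar = {
  Expr: [["Term", "ExprTail"]],
  ExprTail: [["+", "Term", "ExprTail"], []],
  Term: [["num"], ["(", "Expr", ")"]],
};
const nonterminals = Object.keys(grammar);

// Treat target as a set: add new items, report whether anything changed.
function addAll(target: string[], items: string[]): boolean {
  let changed = false;
  for (const x of items) {
    if (target.indexOf(x) < 0) { target.push(x); changed = true; }
  }
  return changed;
}

// FIRST of a symbol sequence, using the FIRST sets computed so far.
function firstOfSeq(seq: string[], first: Sets): string[] {
  const out: string[] = [];
  for (const sym of seq) {
    if (!(sym in grammar)) { addAll(out, [sym]); return out; } // terminal
    addAll(out, first[sym].filter((t) => t !== EPS));
    if (first[sym].indexOf(EPS) < 0) return out; // sym cannot vanish
  }
  out.push(EPS); // empty sequence, or every symbol can vanish
  return out;
}

function computeFirst(): Sets {
  const first: Sets = {};
  for (const nt of nonterminals) first[nt] = [];
  for (let changed = true; changed; ) {
    changed = false;
    for (const nt of nonterminals)
      for (const prod of grammar[nt])
        if (addAll(first[nt], firstOfSeq(prod, first))) changed = true;
  }
  return first;
}

function computeFollow(first: Sets, start: string): Sets {
  const follow: Sets = {};
  for (const nt of nonterminals) follow[nt] = [];
  follow[start].push(END);
  for (let changed = true; changed; ) {
    changed = false;
    for (const nt of nonterminals)
      for (const prod of grammar[nt])
        for (let i = 0; i < prod.length; i++) {
          const sym = prod[i];
          if (!(sym in grammar)) continue;
          const rest = firstOfSeq(prod.slice(i + 1), first);
          if (addAll(follow[sym], rest.filter((t) => t !== EPS))) changed = true;
          // If everything after sym can vanish, whatever follows nt follows sym.
          if (rest.indexOf(EPS) >= 0 && addAll(follow[sym], follow[nt])) changed = true;
        }
  }
  return follow;
}

const first = computeFirst();
const follow = computeFollow(first, "Expr");

// Which production of nt applies when the next token is tok? (-1 = syntax error)
function predict(nt: string, tok: string): number {
  for (let i = 0; i < grammar[nt].length; i++) {
    const f = firstOfSeq(grammar[nt][i], first);
    if (f.indexOf(tok) >= 0) return i;
    if (f.indexOf(EPS) >= 0 && follow[nt].indexOf(tok) >= 0) return i;
  }
  return -1;
}

console.log(predict("ExprTail", "+")); // 0: keep going with '+' Term ExprTail
console.log(predict("ExprTail", ")")); // 1: vanish, because ')' is in FOLLOW(ExprTail)
```

The second call is the case FOLLOW exists for: the empty production is chosen only because the next token belongs to whatever comes after ExprTail.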

Top-down parsing expands the leftmost nonterminal at every step: it fully commits to the left branch of the tree before touching anything to its right. That sequence of expansions, starting from the start symbol, is a leftmost derivation of the input — which is exactly what the second L in LL names. (Bottom-up parsers, we'll see, trace a rightmost derivation, but backwards.)

Recursive descent: one function per nonterminal

The beautiful thing about LL parsing is that you can write it by hand, with no tables at all. The recipe — called recursive descent — is almost mechanical: turn each nonterminal of the grammar into a function, and let those functions call each other exactly the way the grammar rules refer to each other. The nesting of the calls mirrors the shape of the tree.

Take this grammar for arithmetic, laid out so that precedence and associativity are baked into its shape — Terms (products) are grouped inside Exprs (sums), so * binds tighter than +:

Expr   -> Term (('+' | '-') Term)*
Term   -> Factor (('*' | '/') Factor)*
Factor -> number | '(' Expr ')'

Each line becomes a function. parseExpr parses one Term, then loops while it sees + or -; parseTerm parses one Factor, then loops on * or /; parseFactor reads a number or, if it sees an open bracket, calls all the way back up to parseExpr for the sub-expression. Here is a complete recursive-descent parser: a tokenizer, the three functions producing an AST, and a tiny evaluator.

// ---- AST node shapes ----
type Node =
  | { kind: "num"; value: number }
  | { kind: "op"; op: string; left: Node; right: Node };

// ---- 1) Tokenizer: source string -> array of tokens ----
function tokenize(src: string): string[] {
  const tokens: string[] = [];
  let i = 0;
  while (i < src.length) {
    const c = src[i];
    if (c === " ") { i++; continue; }
    if (c >= "0" && c <= "9") {
      // read a whole (multi-digit) number
      let n = "";
      while (i < src.length && src[i] >= "0" && src[i] <= "9") { n += src[i]; i++; }
      tokens.push(n);
      continue;
    }
    tokens.push(c); // + - * / ( )
    i++;
  }
  return tokens;
}

// ---- 2) Recursive-descent parser: ONE function per nonterminal ----
let toks: string[] = [];
let pos = 0;
const peek = () => toks[pos];   // one token of lookahead
const next = () => toks[pos++]; // consume and return

// Expr -> Term (('+' | '-') Term)*
function parseExpr(): Node {
  let left = parseTerm();
  while (peek() === "+" || peek() === "-") {
    const op = next();
    const right = parseTerm(); // + and - are LEFT-associative
    left = { kind: "op", op, left, right };
  }
  return left;
}

// Term -> Factor (('*' | '/') Factor)*
function parseTerm(): Node {
  let left = parseFactor();
  while (peek() === "*" || peek() === "/") {
    const op = next();
    const right = parseFactor(); // binds TIGHTER than + / -
    left = { kind: "op", op, left, right };
  }
  return left;
}

// Factor -> number | '(' Expr ')'
function parseFactor(): Node {
  const t = peek();
  if (t === "(") {
    next();                    // eat '('
    const inner = parseExpr(); // recurse all the way back to the top
    next();                    // eat ')'
    return inner;
  }
  next(); // it's a number
  return { kind: "num", value: Number(t) };
}

function parse(src: string): Node {
  toks = tokenize(src);
  pos = 0;
  return parseExpr();
}

// ---- 3) Tiny evaluator that walks the AST ----
function evaluate(n: Node): number {
  if (n.kind === "num") return n.value;
  const l = evaluate(n.left), r = evaluate(n.right);
  if (n.op === "+") return l + r;
  if (n.op === "-") return l - r;
  if (n.op === "*") return l * r;
  return l / r;
}

// ---- Try it: precedence and brackets both handled by the grammar's shape ----
for (const src of ["2 + 3 * 4", "(2 + 3) * 4", "2 * 3 + 4 * 5", "20 - 4 - 2"]) {
  console.log(src, "=", evaluate(parse(src)));
}

Notice what the output proves. 2 + 3 * 4 gives 14, not 20 — the * ends up below the + in the tree because parseTerm greedily swallows 3 * 4 before parseExpr ever adds. And (2 + 3) * 4 gives 20, because the brackets force parseFactor to recurse. Left-associativity of - falls out too: 20 - 4 - 2 is 14, i.e. (20 - 4) - 2, because the while loop keeps folding new terms onto the left. Precedence and associativity live entirely in the structure of the functions.

The catch: left recursion

You might wonder why the grammar above used that Term (('+'|'-') Term)* loop instead of the more natural-looking

Expr -> Expr '+' Term | Term

This rule is left-recursive: the very first thing Expr can expand to is Expr again. Translate it naively into recursive descent and the first line of parseExpr would be "call parseExpr", with no token consumed first. The function calls itself forever and the stack overflows. Lookahead cannot rescue it: both alternatives start with whatever a Term can start with, so the next token never tells them apart.
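The failure is easy to reproduce. The sketch below translates the left-recursive rule literally; the name parseExprNaive and the MAX_CALLS safety cap are additions for this demo (without the cap the program would simply run until the stack overflows).

```typescript
const tokens = ["2", "+", "3"];
let pos = 0;            // index of the next unread token
let calls = 0;
const MAX_CALLS = 1000; // safety cap added for this demo only

// Expr -> Expr '+' Term | Term, translated literally into a function.
function parseExprNaive(): void {
  calls++;
  if (calls > MAX_CALLS) {
    throw new Error(
      "gave up after " + MAX_CALLS + " calls; next token is still '" +
      tokens[pos] + "' at pos " + pos
    );
  }
  parseExprNaive(); // the rule starts with Expr, so recurse before reading anything
  pos++;            // would consume '+', but this line is never reached
}

let message = "";
try {
  parseExprNaive();
} catch (e) {
  message = (e as Error).message;
}
console.log(message); // gave up after 1000 calls; next token is still '2' at pos 0
```

A thousand calls deep, pos has not moved: every call hands the same unread input to another copy of itself.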

The fix is left-recursion elimination: rewrite the rule so recursion happens on the right, after a token has been consumed. The mechanical transformation turns A -> A α | β into a β followed by a repetition of α — which is exactly the Term (('+'|'-') Term)* loop we used, expressed with a while. Every recursive-descent (LL) grammar must be de-left-recursed first; it is one of the standard limitations of the top-down approach.
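The transformation is mechanical enough to write as a function. This is a sketch under two assumptions: it handles only immediate left recursion (a rule that names itself as its own first symbol), and it invents the helper nonterminal's name by appending a quote mark. The function name eliminateLeftRecursion is hypothetical.

```typescript
// A production is a list of symbols; [] is the empty production (epsilon).
type Production = string[];

// Rewrite   A  -> A α1 | A α2 | ... | β1 | β2 | ...
// as        A  -> β1 A' | β2 A' | ...
//           A' -> α1 A' | α2 A' | ... | ε
function eliminateLeftRecursion(
  name: string,
  prods: Production[]
): Record<string, Production[]> {
  const tailName = name + "'"; // assumed naming scheme for the new nonterminal
  const alphas = prods.filter((p) => p[0] === name).map((p) => p.slice(1));
  const betas = prods.filter((p) => p[0] !== name);
  if (alphas.length === 0) return { [name]: prods }; // not left-recursive

  const tailProds: Production[] = alphas.map((a) => a.concat(tailName));
  tailProds.push([]); // the ε alternative that ends the repetition
  return {
    [name]: betas.map((b) => b.concat(tailName)),
    [tailName]: tailProds,
  };
}

const result = eliminateLeftRecursion("Expr", [["Expr", "+", "Term"], ["Term"]]);
for (const nt of Object.keys(result)) {
  const alts = result[nt].map((p) => (p.length ? p.join(" ") : "ε"));
  console.log(nt + " -> " + alts.join(" | "));
}
// Expr -> Term Expr'
// Expr' -> + Term Expr' | ε
```

Expr' is the "zero or more" tail: in a hand-written parser it is usually written as a while loop rather than as a separate recursive function.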

Left recursion is fatal to naive recursive descent. A rule like Expr -> Expr '+' Term makes parseExpr call itself with no progress, spinning into infinite recursion until the stack blows. LL grammars must have left recursion removed first (rewrite it as a loop). This is not a problem for bottom-up LR parsers, which handle left recursion happily.
Don't mix up LL and LR. LL = top-down, builds a leftmost derivation, one function per nonterminal, chokes on left recursion. LR = bottom-up, builds a rightmost derivation in reverse, shift-reduce on a stack, generated by a tool. Same first letter, opposite machines: the "L" and "R" in LR are "left-to-right scan" and "rightmost derivation".

Bottom-up: shift and reduce

Now flip the direction. A bottom-up parser keeps a stack and makes only two kinds of move:

Shift. Take the next input token and push it onto the stack. (You are gathering raw material.)
Reduce. When the top of the stack matches the right-hand side of some grammar rule — a chunk called a handle — pop that chunk and push the nonterminal it produces. (You have just recognised one finished sub-tree.)

The parser alternates shifting and reducing until the whole input has been consumed and the stack holds exactly the start symbol. Because every reduce recognises a completed piece and combines it, the tree grows from the leaves up. The sequence of reductions, read backwards, is a rightmost derivation — hence the R in LR.

Here is a shift-reduce parse of 2 + 3 * 4 against a small expression grammar (E -> E + E, E -> E * E, E -> num, with * given higher precedence so the parser knows when to hold off reducing). Read the action column: the parser shifts a token, or reduces the handle on top of the stack.

Stack	Remaining input	Action
	`2 + 3 * 4`	shift `2`
`2`	`+ 3 * 4`	reduce `E -> num`
`E`	`+ 3 * 4`	shift `+`
`E +`	`3 * 4`	shift `3`
`E + 3`	`* 4`	reduce `E -> num`
`E + E`	`* 4`	shift `*` (don't reduce yet; `*` binds tighter)
`E + E *`	`4`	shift `4`
`E + E * 4`		reduce `E -> num`
`E + E * E`		reduce `E -> E * E`
`E + E`		reduce `E -> E + E`
`E`		accept

The pivotal moment is row six of the table. The stack holds E + E and the parser could reduce the sum right there, but the next token is *, which binds tighter, so it shifts instead, waiting to build 3 * 4 first. That decision (reduce now, or shift and reduce later?) is exactly what an LR parser's generated table encodes. Get it right and 2 + 3 * 4 reduces to the same tree the recursive-descent parser built, with * nested below +.
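The trace can be reproduced with a few lines of code. This is a sketch, not a real LR parser: a hard-coded precedence comparison stands in for the generated table, and the names PREC and shiftReduce are made up for the example. It assumes every operator is left-associative.

```typescript
// Higher number = binds tighter.
const PREC: Record<string, number> = { "+": 1, "*": 2 };

// Grammar: E -> E + E | E * E | num. Returns the sequence of actions taken.
function shiftReduce(tokens: string[]): string[] {
  const stack: string[] = [];
  const trace: string[] = [];
  let pos = 0;
  while (true) {
    const n = stack.length;
    const top = stack[n - 1];
    const look = tokens[pos]; // undefined once the input is used up
    const hasBinaryHandle =
      n >= 3 && top === "E" && stack[n - 2] in PREC && stack[n - 3] === "E";

    if (top !== undefined && /^\d+$/.test(top)) {
      stack[n - 1] = "E"; // handle is a single number
      trace.push("reduce E -> num");
    } else if (
      hasBinaryHandle &&
      (look === undefined || PREC[stack[n - 2]] >= PREC[look])
    ) {
      // Reduce only if the operator on the stack binds at least as tightly
      // as the one coming up; otherwise fall through and shift.
      const op = stack[n - 2];
      stack.splice(n - 3, 3, "E");
      trace.push("reduce E -> E " + op + " E");
    } else if (look !== undefined) {
      stack.push(look);
      pos++;
      trace.push("shift " + look);
    } else {
      break; // nothing to reduce, nothing to shift
    }
  }
  trace.push(stack.length === 1 && stack[0] === "E" ? "accept" : "error");
  return trace;
}

for (const action of shiftReduce(["2", "+", "3", "*", "4"])) {
  console.log(action);
}
```

Run on 2 + 3 * 4, it prints the same eleven actions as the table, including the shift of * in place of an early reduce of the sum.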

Why bottom-up accepts more grammars

A top-down parser must decide which rule to use before it has seen the rule's contents — it commits based on lookahead alone. A bottom-up parser gets to wait until it has already shifted the entire right-hand side onto the stack before it has to commit to reducing. It decides with the whole handle in view, not a prediction. That extra patience is precisely why LR parsers recognise a strictly larger class of grammars than LL parsers — every LL(1) grammar is LR(1), but not vice versa, and LR copes with left recursion out of the box.

Top-down (LL, recursive descent). Builds the tree root-to-leaves by predicting productions from lookahead (FIRST/FOLLOW). One function per nonterminal; writable by hand; traces a leftmost derivation. Cannot handle left recursion — it must be eliminated first.
Bottom-up (LR, shift-reduce). Builds the tree leaves-to-root by shifting tokens onto a stack and reducing handles by grammar rules. Traces a rightmost derivation in reverse. Table-driven and usually machine-generated; accepts a strictly larger class of grammars, left recursion included.
Same tree, opposite directions. Both recover the parse tree a context-free grammar assigns to the input — they just assemble it from opposite ends.

In practice, LR parsers are almost never written by hand. Their tables, the states that say "in this situation, shift; in that one, reduce by rule 7", are large and mechanical, so tools generate them. The classic ones are yacc and its modern cousin bison; the flavours you'll hear named (LR(0), SLR, LALR, canonical LR(1)) differ mainly in how much lookahead information they use to resolve shift/reduce decisions. LALR (what yacc and, by default, bison produce) is the sweet spot: nearly the power of full LR(1) with far smaller tables.

If LL is the one you can write by hand, why do the classic parser generators (yacc, bison, and for a long time the tools behind the grammars for C, C++, and countless others) reach for LR? Two reasons. First, power: LR accepts a strictly bigger class of grammars, so you can write your language's grammar in its most natural left-recursive form and hand it straight to the tool, with no contortions. Second, automation: the table is generated from the grammar, so the grammar stays the single source of truth. The price you pay is error messages. When an LR parse fails it fails deep inside an opaque state machine, so "syntax error near line 42" is about all it can easily say, whereas a hand-written recursive-descent parser knows exactly which function it was in and can produce a friendly, specific message. That is why several modern compilers (notably some C/C++ and Rust front ends) use hand-written recursive descent, in some cases after moving away from a generated parser: they trade a little grammar power for dramatically better diagnostics.
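To illustrate the diagnostics point, here is a sketch of the same three-function parser with position-aware error messages. It evaluates directly instead of building an AST to stay short; the names Tok, tokenizeWithOffsets, evaluateStrict, and fail are made up for this example, and the message wording is only one possibility.

```typescript
// A token plus its offset in the source, so errors can point at a position.
type Tok = { text: string; at: number };

function tokenizeWithOffsets(src: string): Tok[] {
  const out: Tok[] = [];
  const re = /\d+|\S/g; // a run of digits, or any single non-space character
  let m: RegExpExecArray | null;
  while ((m = re.exec(src)) !== null) out.push({ text: m[0], at: m.index });
  return out;
}

function evaluateStrict(src: string): number {
  const toks = tokenizeWithOffsets(src);
  let pos = 0;
  const peek = () => toks[pos];
  const isOp = (a: string, b: string) =>
    pos < toks.length && (peek().text === a || peek().text === b);

  // Each function knows what it was looking for, so it can say so.
  function fail(wanted: string): never {
    const t = peek();
    const found = t ? "'" + t.text + "' at offset " + t.at : "end of input";
    throw new Error("expected " + wanted + " but found " + found);
  }

  function expr(): number {
    let v = term();
    while (isOp("+", "-")) {
      const op = toks[pos++].text;
      const r = term();
      v = op === "+" ? v + r : v - r;
    }
    return v;
  }

  function term(): number {
    let v = factor();
    while (isOp("*", "/")) {
      const op = toks[pos++].text;
      const r = factor();
      v = op === "*" ? v * r : v / r;
    }
    return v;
  }

  function factor(): number {
    const t = peek();
    if (t && t.text === "(") {
      pos++;
      const v = expr();
      if (pos >= toks.length || peek().text !== ")") {
        fail("')' to close the '(' at offset " + t.at);
      }
      pos++;
      return v;
    }
    if (t && /^\d+$/.test(t.text)) {
      pos++;
      return Number(t.text);
    }
    return fail("a number or '('");
  }

  const value = expr();
  if (pos < toks.length) fail("an operator or end of input");
  return value;
}

for (const src of ["2 + 3 * 4", "2 + * 3", "(2 + 3"]) {
  try {
    console.log(src, "=", evaluateStrict(src));
  } catch (e) {
    console.log(src, "->", (e as Error).message);
  }
}
// 2 + 3 * 4 = 14
// 2 + * 3 -> expected a number or '(' but found '*' at offset 4
// (2 + 3 -> expected ')' to close the '(' at offset 0 but found end of input
```

Each message comes from the function that was running when the input went wrong, which is the context a table-driven parser has to work much harder to recover.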