Building an Interpreter

Here is a secret that feels like magic until you see it up close: a language is just a program that runs other programs. When you type 2 * (3 + 4) into a calculator, or let x = 6 in x*x + 1 into a scripting REPL, some ordinary code reads your text, works out what it means, and hands back an answer. That code is an interpreter — and the simplest kind, a tree-walking interpreter, is small enough to build in about thirty lines.

This page is the capstone that ties four ideas together: syntax and semantics, trees, tree traversals, and recursion. By the end you will have written a real evaluator — a single recursive function that walks a program's tree and computes its value.

The pipeline: from text to value

Running a program is a short assembly line. Your source code starts as one long string of characters and ends as a value. In between are three classic stages:

Tokenise (lex). Chop the raw text into tokens — the words and symbols of the language. "3 + 4" becomes [3] [+] [4]. Whitespace disappears.
Parse. Arrange the flat token stream into a tree that captures the structure — an Abstract Syntax Tree (AST). Precedence and brackets decide the shape: 2 * (3 + 4) puts * at the root with + nested inside.
Evaluate. Walk that tree and compute the answer. This is our focus.

The parser's job (text → tree) and the evaluator's job (tree → value) are genuinely separate steps, and keeping them apart is what makes each one simple. Once the parser has done its work, the evaluator never sees a single character of source text again — it only ever sees the tidy tree.

An interpreter walks the tree and produces the answer directly, right now. A compiler walks the same tree but instead emits new code — machine instructions, or bytecode — which some other machine runs later. Same front end (tokenise, parse), different back end. Python and Ruby lean on tree-walking interpreters; C and Rust compile. Many modern languages do both: interpret first for a quick start, then compile the hot paths. Everything on this page is the interpreter side.

The tree is the program

Because the AST is a tree, every node is one of a handful of shapes, and each shape knows how to combine its children. Leaves carry raw data — a number literal like 6, or a variable name like x. Internal nodes carry operations — a + node has a left child and a right child; a let node binds a name, then evaluates a body.

Here is the AST for the body of our sample program, x*x + 1. To find its value we visit the nodes in post-order: evaluate a node's children first, then combine. The numbered badges show the order the evaluator touches each node when x = 6.

The evaluator, in one recursive function

Now the payoff. We model an expression as a union type — a value that is exactly one of our node shapes — and write evaluate, a function that recurses over the tree. It pattern-matches on the node kind: a literal returns its value, a variable is looked up in the environment, an operation evaluates both children then combines, and a let evaluates its bound value, extends the environment, and evaluates its body. Press Run:

// An expression is exactly one of four shapes — the nodes of our tree. type Expr = | { kind: "lit"; value: number } // a number literal (a leaf) | { kind: "var"; name: string } // a variable reference (a leaf) | { kind: "op"; op: "+" | "-" | "*"; left: Expr; right: Expr } // a binary operation | { kind: "let"; name: string; value: Expr; body: Expr }; // let name = value in body // The environment maps variable names to their current values. type Env = Record<string, number>; // evaluate: walk the tree and return a number. Post-order — children FIRST, then combine. function evaluate(e: Expr, env: Env): number { switch (e.kind) { case "lit": // a leaf: hand back its value return e.value; case "var": // a leaf: look the name up in the environment return env[e.name]; case "op": { // evaluate BOTH children, THEN combine them const l = evaluate(e.left, env); const r = evaluate(e.right, env); if (e.op === "+") return l + r; if (e.op === "-") return l - r; return l * r; } case "let": { // evaluate the bound value, extend env, eval the body const v = evaluate(e.value, env); const inner: Env = { ...env, [e.name]: v }; // a NEW env with x bound return evaluate(e.body, inner); } } } // AST 1: 2 * (3 + 4) const a: Expr = { kind: "op", op: "*", left: { kind: "lit", value: 2 }, right: { kind: "op", op: "+", left: { kind: "lit", value: 3 }, right: { kind: "lit", value: 4 } }, }; console.log("2 * (3 + 4) =", evaluate(a, {})); // 14 // AST 2: let x = 6 in x*x + 1 (a let-binding and two variable look-ups) const b: Expr = { kind: "let", name: "x", value: { kind: "lit", value: 6 }, body: { kind: "op", op: "+", left: { kind: "op", op: "*", left: { kind: "var", name: "x" }, right: { kind: "var", name: "x" } }, right: { kind: "lit", value: 1 }, }, }; console.log("let x = 6 in x*x + 1 =", evaluate(b, {})); // 37

That is the entire idea. evaluate calls itself on each child, so the shape of the recursion mirrors the shape of the tree exactly — a post-order traversal where the "combine" step at each node is the meaning (the semantics) of that operation. The environment is threaded down through the calls; a let hands its children a richer environment with one more name bound.

Growing the language: adding `if`

Adding a feature to a tree-walker is delightfully local: invent a new node shape, then add one case to evaluate. Here we drop the environment for brevity and add a comparison (<, yielding 1 or 0) plus an if-expression that evaluates its condition and then walks only the chosen branch. Press Run:

type Expr = | { kind: "lit"; value: number } | { kind: "op"; op: "+" | "*" | "<"; left: Expr; right: Expr } | { kind: "if"; cond: Expr; then: Expr; else: Expr }; function evaluate(e: Expr): number { switch (e.kind) { case "lit": return e.value; case "op": { const l = evaluate(e.left), r = evaluate(e.right); if (e.op === "+") return l + r; if (e.op === "*") return l * r; return l < r ? 1 : 0; // "<" gives 1 (true) or 0 (false) } case "if": // Evaluate the condition, then walk ONLY the branch we picked. return evaluate(e.cond) !== 0 ? evaluate(e.then) : evaluate(e.else); } } // if (3 < 5) then 10*10 else 0 const prog: Expr = { kind: "if", cond: { kind: "op", op: "<", left: { kind: "lit", value: 3 }, right: { kind: "lit", value: 5 } }, then: { kind: "op", op: "*", left: { kind: "lit", value: 10 }, right: { kind: "lit", value: 10 } }, else: { kind: "lit", value: 0 }, }; console.log("if (3 < 5) then 10*10 else 0 =>", evaluate(prog)); // 100

Notice the if case does not evaluate both branches — it picks one. That is a real semantic choice: control flow is exactly "which subtree do I walk?" You have now built a language with arithmetic, comparison, variables, scope, and branching, and it is still tiny.

A program is stored as an Abstract Syntax Tree: leaves are literals and variables, internal nodes are operations.
Evaluation is a single recursive function that pattern-matches on each node's kind and returns a value.
It visits nodes in post-order — a node's children are evaluated before the node combines them — so the recursion mirrors the tree.
An environment threaded through the calls carries the values of variables; a binding form (let) extends it for its body.

Because that is literally all it does: it takes a walk through the tree. There are faster designs — compile the tree to bytecode and run that on a virtual machine, or JIT-compile hot code to native instructions — and production language runtimes use them. But every one of those still begins by tokenising and parsing into a tree, and the tree-walker is the clearest way to see what a program means. It is the interpreter you write first, and the one you reach for when you embed a little scripting language inside a bigger app.

Evaluate children BEFORE you combine. For an op node you must evaluate both subtrees first, then apply the operator — that is what "post-order" means. Trying to combine before the children are computed is the classic tree-walking bug.
Thread the environment; don't lose the scope. A let must evaluate its body in an environment extended with the new binding. If you forget to pass the richer environment down, a later var look-up fails — the variable "doesn't exist", even though you just bound it.
Don't confuse the parser's job with the evaluator's. Turning text into a tree (parsing) and turning a tree into a value (evaluating) are separate stages. The evaluator never touches source characters, precedence, or brackets — the parser already resolved all of that into the tree's shape.
An if should not evaluate both branches. Evaluate the condition, then walk only the taken branch — otherwise side effects (or infinite recursion) in the untaken branch would fire when they shouldn't.

The pipeline: from text to value

The tree is the program

The evaluator, in one recursive function

Growing the language: adding if

Growing the language: adding `if`