The Lambda Calculus

Imagine the smallest programming language that could possibly exist. No numbers, no loops, no strings, no if — not even a keyword. Just functions: making them, and calling them. That is the lambda calculus, invented by the logician Alonzo Church in the 1930s — a decade before the first electronic computers existed. Astonishingly, this pocket-sized language is Turing-complete: anything any computer can compute, it can compute too. Functions, all the way down.

Every functional language you might meet — Haskell, Lisp, the arrow functions in your JavaScript — is a dressed-up lambda calculus underneath. If you understand these three little rules, you understand the theoretical heart of pure functions and all of functional programming.

The whole grammar: three ways to write a term

A lambda term is built from exactly three forms — and that is the entire language. Read them out loud once and you have met the syntax:

A variable: just a name, like x or y. It stands for a value that will be supplied later.
An abstraction: \lambda x.\, M — this is a function. Read it "a function of x whose body is M." The \lambda introduces the parameter; the dot separates it from the body. So \lambda x.\, x is the identity function.
An application: M\ N — call the function M with the argument N. Just write them side by side, function first.

That is it. Everything else — numbers, booleans, arithmetic, recursion — gets built out of these three forms. A term like (\lambda x.\, x)\, y is an application of the identity abstraction to the variable y.

Two reading conventions save a mountain of brackets. Application associates to the left: f\ a\ b means (f\ a)\ b. And a \lambda's body stretches as far right as it can, so \lambda x.\, \lambda y.\, x is one function returning another — a curried two-argument function.

Computation is substitution: β-reduction

There is exactly one rule that makes the language run, called β-reduction (beta-reduction). When a function meets an argument, you substitute the argument for the parameter throughout the body:

(\lambda x.\, M)\ N \;\rightarrow\; M[x := N]

Read M[x := N] as "the body M, with every free x replaced by N." That single act of copying an argument into place is computation here — no assignment, no memory, no machine state. A term with no application of a function left to perform is in normal form: it is the answer.

Let us run one by hand. Take (\lambda x.\, \lambda y.\, x)\ a\ b — a function that ignores its second argument and returns its first. Applying left-first:

First we fed a in for x, leaving \lambda y.\, a; then we fed b in for y — but the body never mentions y, so b simply vanishes, and we are left with a. We just built and evaluated the constant / "first-of-two" function using nothing but the three forms. (Reducing until nothing is left to reduce is exactly the sort of self-similar, recursion-flavoured unwinding you have seen before.)

Free vs bound variables

Substitution has to know which variables it is allowed to touch. A variable is bound when it sits inside the body of a \lambda that names it — the \lambda "captures" it, like the way quantifiers bind a variable in logic (\forall x owns the xs that follow). A variable with no enclosing \lambda for it is free — it refers to something outside the term.

In \lambda x.\, x\ y, the x is bound (the \lambda x binds it) but y is free (nothing binds it). The name of a bound variable is irrelevant — \lambda x.\, x and \lambda z.\, z are the same function. Renaming a bound variable is called α-conversion (alpha-conversion), and it is always allowed as long as you do not accidentally clash with a name already in use.

Church's story goes that he originally wanted a "hat" over the parameter, like \hat{x}, to mark it as the bound one. That was hard for typesetters, so the hat drifted in front of the variable and became an uppercase Lambda, then a lowercase \lambda. A more cynical mathematician quipped it was chosen "for no reason at all." Either way, the Greek letter stuck, and today \lambda is shorthand for "anonymous function" in languages from Python to C#.

The single sharpest trap in the whole subject is variable capture. Substitution is not blind find-and-replace. Consider reducing (\lambda x.\, \lambda y.\, x)\ y. If you naively substitute x := y you get \lambda y.\, y — the identity function. But that is wrong: the free y we passed in got "captured" by the inner \lambda y and silently changed meaning!

The fix is α-renaming: before substituting, rename the inner bound variable to something fresh, say \lambda y'.\, x, then substitute to get \lambda y'.\, y — a constant function returning the outer y, which is correct. Rule of thumb: never substitute a term containing a free variable into a scope that binds that same name without renaming first.

A second, gentler trap: do not imagine the language has numbers or booleans "built in" and hidden somewhere. It genuinely has only the three forms. Every 0, 1, \text{true} and + below is encoded as a function.

Encoding data: Church numerals and booleans

If all you have is functions, how do you count? Church's trick: a number n is the act of applying a function f exactly n times to a starting value x. That is the Church numeral:

\overline{0} = \lambda f.\lambda x.\, x \qquad \overline{1} = \lambda f.\lambda x.\, f\,x \qquad \overline{2} = \lambda f.\lambda x.\, f\,(f\,x) \qquad \overline{n} = \lambda f.\lambda x.\, f^{\,n}\,x

So \overline{3} "means" apply f three times. Booleans get the same treatment — a boolean is a chooser that takes two options and returns one:

\text{TRUE} = \lambda a.\lambda b.\, a \qquad \text{FALSE} = \lambda a.\lambda b.\, b

Notice \text{TRUE} is exactly the "first-of-two" function we reduced earlier, and \text{FALSE} = \overline{0}! Data and code are the same stuff. We can even define arithmetic: the successor \text{succ}\ n squeezes in one extra application of f. Below, we model exactly these encodings as real higher-order functions and decode them back to ordinary integers — press Run:

// A Church numeral is a function that, given f and x, applies f to x "n" times. type Church = <T>(f: (v: T) => T) => (x: T) => T; const zero: Church = (f) => (x) => x; // apply f 0 times const one: Church = (f) => (x) => f(x); // apply f 1 time const two: Church = (f) => (x) => f(f(x)); // apply f 2 times // succ n = one more application of f than n does. const succ = (n: Church): Church => (f) => (x) => f(n(f)(x)); // add m n = do m's applications, then n's. const add = (m: Church, n: Church): Church => (f) => (x) => m(f)(n(f)(x)); // Decode: feed the numeral the "+1" function and a starting 0. const toInt = (n: Church): number => n((v: number) => v + 1)(0); console.log("zero =", toInt(zero)); // 0 console.log("one =", toInt(one)); // 1 console.log("two =", toInt(two)); // 2 console.log("three =", toInt(succ(two))); // 3 console.log("2+3 =", toInt(add(two, succ(two)))); // 5 // Church booleans pick the first or second option. const TRUE = <T>(a: T) => (b: T): T => a; const FALSE = <T>(a: T) => (b: T): T => b; console.log("TRUE picks ->", TRUE("yes")("no")); // yes console.log("FALSE picks ->", FALSE("yes")("no")); // no

Every value there is a plain function, yet toInt reads back honest integers. That is the magic of Church encoding: a language of only functions can carry all the data we want, purely by how those functions behave.

Turing-completeness needs unbounded repetition, but the lambda calculus has no while. The way out is a self-applying term called the Y combinator, Y = \lambda f.\, (\lambda x.\, f\,(x\,x))\,(\lambda x.\, f\,(x\,x)). Feed it a function and it hands that function a copy of itself to call — manufacturing recursion out of thin air, with no named definitions at all. It looks like a magic spell, and in a sense it is: it is the fixed-point that lets a nameless function refer to itself. That single term is why three tiny rules are enough to compute anything.