The Lambda Calculus
Imagine the smallest programming language that could possibly exist. No numbers, no loops, no
strings, no if — not even a keyword. Just functions: making them, and
calling them. That is the lambda calculus, invented by the logician
Alonzo Church in the 1930s — a decade
before the first electronic computers existed. Astonishingly, this pocket-sized language is
Turing-complete: anything any computer can compute, it can compute too. Functions,
all the way down.
Every functional language you might meet — Haskell, Lisp, the arrow functions in your JavaScript — is
a dressed-up lambda calculus underneath. If you understand these three little rules, you understand
the theoretical heart of
pure
functions and all of functional programming.
The whole grammar: three ways to write a term
A lambda term is built from exactly three forms — and that is the entire
language. Read them out loud once and you have met the syntax:
-
A variable: just a name, like x or
y. It stands for a value that will be supplied later.
-
An abstraction: \lambda x.\, M — this is a
function. Read it "a function of x whose body is
M." The \lambda introduces the parameter; the
dot separates it from the body. So \lambda x.\, x is the identity
function.
-
An application: M\ N — call the function
M with the argument N. Just write them side
by side, function first.
That is it. Everything else — numbers, booleans, arithmetic, recursion — gets built out of
these three forms. A term like (\lambda x.\, x)\, y is an application of
the identity abstraction to the variable y.
Two reading conventions save a mountain of brackets. Application associates to the left:
f\ a\ b means (f\ a)\ b. And a
\lambda's body stretches as far right as it can, so
\lambda x.\, \lambda y.\, x is one function returning another — a curried
two-argument function.
Computation is substitution: β-reduction
There is exactly one rule that makes the language run, called
β-reduction (beta-reduction). When a function meets an argument, you substitute the
argument for the parameter throughout the body:
(\lambda x.\, M)\ N \;\rightarrow\; M[x := N]
Read M[x := N] as "the body M, with every free
x replaced by N." That single act of copying an
argument into place is computation here — no assignment, no memory, no machine state. A term
with no application of a function left to perform is in normal form: it is the
answer.
Let us run one by hand. Take (\lambda x.\, \lambda y.\, x)\ a\ b — a
function that ignores its second argument and returns its first. Applying left-first:
First we fed a in for x, leaving
\lambda y.\, a; then we fed b in for
y — but the body never mentions y, so
b simply vanishes, and we are left with a. We
just built and evaluated the constant / "first-of-two" function using nothing but
the three forms. (Reducing until nothing is left to reduce is exactly the sort of self-similar,
recursion-flavoured unwinding
you have seen before.)
Free vs bound variables
Substitution has to know which variables it is allowed to touch. A variable is
bound when it sits inside the body of a \lambda that
names it — the \lambda "captures" it, like the way
quantifiers
bind a variable in logic (\forall x owns the xs
that follow). A variable with no enclosing \lambda for it is
free — it refers to something outside the term.
In \lambda x.\, x\ y, the x is
bound (the \lambda x binds it) but y
is free (nothing binds it). The name of a bound variable is irrelevant —
\lambda x.\, x and \lambda z.\, z are the same
function. Renaming a bound variable is called α-conversion (alpha-conversion), and
it is always allowed as long as you do not accidentally clash with a name already in use.
Church's story goes that he originally wanted a "hat" over the parameter, like
\hat{x}, to mark it as the bound one. That was hard for typesetters, so
the hat drifted in front of the variable and became an uppercase Lambda, then a lowercase
\lambda. A more cynical mathematician quipped it was chosen "for no
reason at all." Either way, the Greek letter stuck, and today \lambda is
shorthand for "anonymous function" in languages from Python to C#.
The single sharpest trap in the whole subject is variable capture. Substitution is
not blind find-and-replace. Consider reducing
(\lambda x.\, \lambda y.\, x)\ y. If you naively substitute
x := y you get \lambda y.\, y — the
identity function. But that is wrong: the free
y we passed in got "captured" by the inner
\lambda y and silently changed meaning!
The fix is α-renaming: before substituting, rename the inner bound variable to
something fresh, say \lambda y'.\, x, then substitute to get
\lambda y'.\, y — a constant function returning the outer
y, which is correct. Rule of thumb: never substitute a term
containing a free variable into a scope that binds that same name without renaming first.
A second, gentler trap: do not imagine the language has numbers or booleans "built in" and hidden
somewhere. It genuinely has only the three forms. Every 0,
1, \text{true} and +
below is encoded as a function.
Encoding data: Church numerals and booleans
If all you have is functions, how do you count? Church's trick: a number
n is the act of applying a function
f exactly n times to a starting value
x. That is the Church numeral:
\overline{0} = \lambda f.\lambda x.\, x \qquad
\overline{1} = \lambda f.\lambda x.\, f\,x \qquad
\overline{2} = \lambda f.\lambda x.\, f\,(f\,x) \qquad
\overline{n} = \lambda f.\lambda x.\, f^{\,n}\,x
So \overline{3} "means" apply f three times.
Booleans get the same treatment — a boolean is a chooser that takes two options and returns
one:
\text{TRUE} = \lambda a.\lambda b.\, a \qquad
\text{FALSE} = \lambda a.\lambda b.\, b
Notice \text{TRUE} is exactly the "first-of-two" function we
reduced earlier, and \text{FALSE} = \overline{0}! Data and code are the
same stuff. We can even define arithmetic: the successor
\text{succ}\ n squeezes in one extra application of
f. Below, we model exactly these encodings as real higher-order functions
and decode them back to ordinary integers — press Run:
// A Church numeral is a function that, given f and x, applies f to x "n" times.
type Church = <T>(f: (v: T) => T) => (x: T) => T;
const zero: Church = (f) => (x) => x; // apply f 0 times
const one: Church = (f) => (x) => f(x); // apply f 1 time
const two: Church = (f) => (x) => f(f(x)); // apply f 2 times
// succ n = one more application of f than n does.
const succ = (n: Church): Church => (f) => (x) => f(n(f)(x));
// add m n = do m's applications, then n's.
const add = (m: Church, n: Church): Church => (f) => (x) => m(f)(n(f)(x));
// Decode: feed the numeral the "+1" function and a starting 0.
const toInt = (n: Church): number => n((v: number) => v + 1)(0);
console.log("zero =", toInt(zero)); // 0
console.log("one =", toInt(one)); // 1
console.log("two =", toInt(two)); // 2
console.log("three =", toInt(succ(two))); // 3
console.log("2+3 =", toInt(add(two, succ(two)))); // 5
// Church booleans pick the first or second option.
const TRUE = <T>(a: T) => (b: T): T => a;
const FALSE = <T>(a: T) => (b: T): T => b;
console.log("TRUE picks ->", TRUE("yes")("no")); // yes
console.log("FALSE picks ->", FALSE("yes")("no")); // no
Every value there is a plain function, yet toInt reads back honest integers. That is the
magic of Church encoding: a language of only functions can carry all the data we want, purely
by how those functions behave.
Turing-completeness needs unbounded repetition, but the lambda calculus has no while.
The way out is a self-applying term called the Y combinator,
Y = \lambda f.\, (\lambda x.\, f\,(x\,x))\,(\lambda x.\, f\,(x\,x)). Feed
it a function and it hands that function a copy of itself to call — manufacturing
recursion out of thin air, with no named definitions at all. It looks like a magic spell, and in a
sense it is: it is the fixed-point that lets a nameless function refer to itself. That single term
is why three tiny rules are enough to compute anything.