Type Inference

You write this and never mention a single type:

const double = (x) => x + x;

Yet a compiler like TypeScript, Haskell or ML will tell you, with total confidence, that double is a function from number to number. Nobody typed the word "number". So where did that knowledge come from? The compiler deduced it — from the one clue available: + works on numbers, so x must be a number, so the result is a number too. That detective work is type inference, and this page is about how the trick is actually done.

The classic recipe is Hindley–Milner (HM), the engine underneath ML, Haskell, OCaml, F#, Rust's local inference and much of TypeScript. Its idea is beautifully mechanical: read the code, collect equations between types, then solve them. No guessing, no "usually a number" — a small piece of algebra that always lands on the most general answer.

Three moves: fresh variables, constraints, unification

HM runs the same three moves every time:

A substitution is just a mapping like \{\alpha \mapsto \texttt{Int},\ \beta \mapsto \texttt{Bool}\}. It is, quite literally, a function from type variables to types — which is why the maths of relations and functions is lurking right under the hood.

A worked example, step by step

Let's infer the type of double by hand, exactly the way HM would. We treat + as an operation on \texttt{Int} ( (+) : \texttt{Int} \to \texttt{Int} \to \texttt{Int}).

  1. Give the parameter a fresh variable: x has type \alpha. The whole function therefore has type \alpha \to \beta for some result type \beta.
  2. Look at the body x + x. Since + demands \texttt{Int} operands, each use of x generates the constraint \alpha = \texttt{Int}.
  3. The result of + is an \texttt{Int}, and that is what the function returns, so \beta = \texttt{Int}.
  4. Unify. The substitution \{\alpha \mapsto \texttt{Int},\ \beta \mapsto \texttt{Int}\} satisfies every equation. Apply it to \alpha \to \beta and read off:
\texttt{double} : \texttt{Int} \to \texttt{Int}

No type annotation was ever written, yet the type is pinned down completely — and it is fully static, decided at compile time, before the program runs even once.

Unification, in code

Unification is the heart of the whole thing, and it is small enough to write in one sitting. Represent a type as either a type variable { v: "a" } or a constructor { con: "Arrow", args: [...] } — where Arrow holds the argument and result types, and Int/Bool are constructors with no arguments. unify walks two types together, growing a substitution \sigma as it discovers what each variable must be. Press Run:

type Type = { v: string } | { con: string; args: Type[] }; const V = (v: string): Type => ({ v }); const C = (con: string, args: Type[] = []): Type => ({ con, args }); const Int = C("Int"), Bool = C("Bool"); const Arrow = (a: Type, b: Type): Type => C("Arrow", [a, b]); // Follow a chain of variable bindings to the real type it stands for. function walk(t: Type, sub: Map<string, Type>): Type { while ("v" in t && sub.has(t.v)) t = sub.get(t.v)!; return t; } // Does variable `name` appear inside type `t`? (the OCCURS CHECK) function occurs(name: string, t: Type, sub: Map<string, Type>): boolean { t = walk(t, sub); if ("v" in t) return t.v === name; return t.args.some((a) => occurs(name, a, sub)); } // Make x and y equal by extending the substitution `sub` in place. function unify(x: Type, y: Type, sub: Map<string, Type>): boolean { x = walk(x, sub); y = walk(y, sub); if ("v" in x) { // x is a variable if ("v" in y && y.v === x.v) return true; if (occurs(x.v, y, sub)) return false; // would build an infinite type! sub.set(x.v, y); // bind x := y return true; } if ("v" in y) return unify(y, x, sub); // symmetric case // both are constructors: names and arity must match, then unify args pairwise if (x.con !== y.con || x.args.length !== y.args.length) return false; return x.args.every((a, i) => unify(a, y.args[i], sub)); } function show(t: Type, sub: Map<string, Type>): string { t = walk(t, sub); if ("v" in t) return t.v; if (t.con === "Arrow") return "(" + show(t.args[0], sub) + " -> " + show(t.args[1], sub) + ")"; return t.con + (t.args.length ? "(" + t.args.map((a) => show(a, sub)).join(", ") + ")" : ""); } // Example 1: unify (a -> Int) with (Bool -> b) const s1 = new Map<string, Type>(); const ok1 = unify(Arrow(V("a"), Int), Arrow(Bool, V("b")), s1); console.log("example 1 unifies:", ok1); console.log(" a =", show(V("a"), s1), " b =", show(V("b"), s1)); // Example 2 (occurs check): unify a with (a -> Int) const s2 = new Map<string, Type>(); const ok2 = unify(V("a"), Arrow(V("a"), Int), s2); console.log("example 2 unifies:", ok2, "(rejected: infinite type a = a -> Int)");

Example 1 succeeds with \{\alpha \mapsto \texttt{Bool},\ \beta \mapsto \texttt{Int}\}: line up \alpha \to \texttt{Int} against \texttt{Bool} \to \beta, match argument to argument and result to result, done. Example 2 is refused by the occurs check — more on that below.

The most general answer: principal types

Unification never invents a needlessly specific answer. When it binds a variable, it binds it to the least that the constraints demand and leaves everything else free. That gives HM its signature property: the type it finds is the principal type — the single most general type from which every other valid type is a special case.

Take the identity function (x) => x. Nothing is ever done to x, so no constraint forces \alpha to be anything. It stays a free variable, and the inferred type is:

\texttt{id} : \alpha \to \alpha

This is polymorphism falling straight out of the algebra: id works at \texttt{Int} \to \texttt{Int}, at \texttt{Bool} \to \texttt{Bool}, at any type at all, because the free \alpha can be specialised to whatever each caller needs. Binding a type to a let and letting each use pick its own specialisation is called let-polymorphism, or generalisation — the last flourish of the HM system.

The algorithm has two parents who never worked together. Roger Hindley (a logician) showed in 1969 that terms in the typed lambda calculus have a principal type. Robin Milner, building the ML language in 1978, independently invented the very same inference procedure — his famous Algorithm W — and proved it sound with the slogan "well-typed programs cannot go wrong." The two discoveries were later recognised as the same idea, so the system carries both names. (Milner went on to win the Turing Award.)

Two traps snare almost everyone meeting inference for the first time: