Type Inference

You write this and never mention a single type:

const double = (x) => x + x;

Yet a compiler like TypeScript, Haskell or ML will tell you, with total confidence, that double is a function from number to number. Nobody typed the word "number". So where did that knowledge come from? The compiler deduced it — from the one clue available: + works on numbers, so x must be a number, so the result is a number too. That detective work is type inference, and this page is about how the trick is actually done.

The classic recipe is Hindley–Milner (HM), the engine underneath ML, Haskell, OCaml, F#, Rust's local inference and much of TypeScript. Its idea is beautifully mechanical: read the code, collect equations between types, then solve them. No guessing, no "usually a number" — a small piece of algebra that always lands on the most general answer.

Three moves: fresh variables, constraints, unification

HM runs the same three moves every time:

Assign fresh type variables. Every unknown gets a placeholder — call them \alpha, \beta, \gamma, \dots. "I don't know x's type yet, so let it be \alpha."
Gather constraints. Each way a value is used forces an equation between types. Using x with + forces \alpha = \texttt{Int}. Applying a function f to an argument forces f's type to be an arrow \text{(arg type)} \to \text{(result type)}.
Solve by unification. Unification makes two type expressions equal by finding a substitution — an assignment of concrete types to variables — that turns one into the other. Solve all the equations at once and read the answer off the variables.

A substitution is just a mapping like \{\alpha \mapsto \texttt{Int},\ \beta \mapsto \texttt{Bool}\}. It is, quite literally, a function from type variables to types — which is why the maths of relations and functions is lurking right under the hood.

A worked example, step by step

Let's infer the type of double by hand, exactly the way HM would. We treat + as an operation on \texttt{Int} ( (+) : \texttt{Int} \to \texttt{Int} \to \texttt{Int}).

Give the parameter a fresh variable: x has type \alpha. The whole function therefore has type \alpha \to \beta for some result type \beta.
Look at the body x + x. Since + demands \texttt{Int} operands, each use of x generates the constraint \alpha = \texttt{Int}.
The result of + is an \texttt{Int}, and that is what the function returns, so \beta = \texttt{Int}.
Unify. The substitution \{\alpha \mapsto \texttt{Int},\ \beta \mapsto \texttt{Int}\} satisfies every equation. Apply it to \alpha \to \beta and read off:

\texttt{double} : \texttt{Int} \to \texttt{Int}

No type annotation was ever written, yet the type is pinned down completely — and it is fully static, decided at compile time, before the program runs even once.

Unification, in code

Unification is the heart of the whole thing, and it is small enough to write in one sitting. Represent a type as either a type variable { v: "a" } or a constructor { con: "Arrow", args: [...] } — where Arrow holds the argument and result types, and Int/Bool are constructors with no arguments. unify walks two types together, growing a substitution \sigma as it discovers what each variable must be. Press Run:

type Type = { v: string } | { con: string; args: Type[] }; const V = (v: string): Type => ({ v }); const C = (con: string, args: Type[] = []): Type => ({ con, args }); const Int = C("Int"), Bool = C("Bool"); const Arrow = (a: Type, b: Type): Type => C("Arrow", [a, b]); // Follow a chain of variable bindings to the real type it stands for. function walk(t: Type, sub: Map<string, Type>): Type { while ("v" in t && sub.has(t.v)) t = sub.get(t.v)!; return t; } // Does variable `name` appear inside type `t`? (the OCCURS CHECK) function occurs(name: string, t: Type, sub: Map<string, Type>): boolean { t = walk(t, sub); if ("v" in t) return t.v === name; return t.args.some((a) => occurs(name, a, sub)); } // Make x and y equal by extending the substitution `sub` in place. function unify(x: Type, y: Type, sub: Map<string, Type>): boolean { x = walk(x, sub); y = walk(y, sub); if ("v" in x) { // x is a variable if ("v" in y && y.v === x.v) return true; if (occurs(x.v, y, sub)) return false; // would build an infinite type! sub.set(x.v, y); // bind x := y return true; } if ("v" in y) return unify(y, x, sub); // symmetric case // both are constructors: names and arity must match, then unify args pairwise if (x.con !== y.con || x.args.length !== y.args.length) return false; return x.args.every((a, i) => unify(a, y.args[i], sub)); } function show(t: Type, sub: Map<string, Type>): string { t = walk(t, sub); if ("v" in t) return t.v; if (t.con === "Arrow") return "(" + show(t.args[0], sub) + " -> " + show(t.args[1], sub) + ")"; return t.con + (t.args.length ? "(" + t.args.map((a) => show(a, sub)).join(", ") + ")" : ""); } // Example 1: unify (a -> Int) with (Bool -> b) const s1 = new Map<string, Type>(); const ok1 = unify(Arrow(V("a"), Int), Arrow(Bool, V("b")), s1); console.log("example 1 unifies:", ok1); console.log(" a =", show(V("a"), s1), " b =", show(V("b"), s1)); // Example 2 (occurs check): unify a with (a -> Int) const s2 = new Map<string, Type>(); const ok2 = unify(V("a"), Arrow(V("a"), Int), s2); console.log("example 2 unifies:", ok2, "(rejected: infinite type a = a -> Int)");

Example 1 succeeds with \{\alpha \mapsto \texttt{Bool},\ \beta \mapsto \texttt{Int}\}: line up \alpha \to \texttt{Int} against \texttt{Bool} \to \beta, match argument to argument and result to result, done. Example 2 is refused by the occurs check — more on that below.

The most general answer: principal types

Unification never invents a needlessly specific answer. When it binds a variable, it binds it to the least that the constraints demand and leaves everything else free. That gives HM its signature property: the type it finds is the principal type — the single most general type from which every other valid type is a special case.

Take the identity function (x) => x. Nothing is ever done to x, so no constraint forces \alpha to be anything. It stays a free variable, and the inferred type is:

\texttt{id} : \alpha \to \alpha

This is polymorphism falling straight out of the algebra: id works at \texttt{Int} \to \texttt{Int}, at \texttt{Bool} \to \texttt{Bool}, at any type at all, because the free \alpha can be specialised to whatever each caller needs. Binding a type to a let and letting each use pick its own specialisation is called let-polymorphism, or generalisation — the last flourish of the HM system.

Every well-typed expression in the HM system has a principal type: one most general type that all its other valid types are instances of.
That principal type is found by unification, and unification finds the most general unifier — the substitution that commits to the least it possibly can.
The result needs no type annotations from the programmer, yet is entirely static.

The algorithm has two parents who never worked together. Roger Hindley (a logician) showed in 1969 that terms in the typed lambda calculus have a principal type. Robin Milner, building the ML language in 1978, independently invented the very same inference procedure — his famous Algorithm W — and proved it sound with the slogan "well-typed programs cannot go wrong." The two discoveries were later recognised as the same idea, so the system carries both names. (Milner went on to win the Turing Award.)

Two traps snare almost everyone meeting inference for the first time:

Unification needs the occurs check. Suppose you try to unify \alpha with \alpha \to \texttt{Int}. Setting \alpha = \alpha \to \texttt{Int} means \alpha contains a copy of itself, forever — an infinite type. Without the check that "a variable may not equal a term that already contains it", the solver would chase its own tail and loop. The occurs check is the guard that turns those cases into an honest type error instead of a hang. (This is exactly the self-application (f) => f(f), which HM rightly rejects.)
Inference is NOT dynamic typing. "The compiler figured out the types" does not mean "the types are checked at run time." It is the opposite: every type is nailed down statically, at compile time. The programmer simply didn't have to write them. A program that fails to unify is rejected before it ever runs — languages like Haskell and ML are as strictly, statically typed as they come.