Logical Clocks and Time

A user in London clicks "post," and a moment later a user in Sydney clicks "reply." Two machines, two clicks, one obvious question: which came first? On a single computer this is trivial — read the clock. Across a network of machines it is genuinely hard, and the reason is unsettling: there is no such thing as a single, shared "now."

Every machine has its own quartz crystal ticking away in its clock, and no two crystals tick at exactly the same rate. They drift — a cheap one by seconds per day. We patch this with protocols like NTP that nudge clocks toward a reference, but the correction is never perfect and never instantaneous: a message announcing "the time is now 12:00:00" is already stale by the time it crosses the wire. So two events on two machines might carry timestamps that disagree about their own order — machine A stamps an event 12{:}00{:}05 that truly happened after a machine-B event stamped 12{:}00{:}07, simply because A's clock runs fast. Physical time lies.

The escape is to stop asking "what time did it happen?" and start asking a weaker but answerable question: "could this event have influenced that one?" That is what logical clocks capture. They throw away wall-clock seconds entirely and track only the one thing distributed systems really need — the order of cause and effect.

Happens-before: ordering by causality, not by clock

In 1978 Leslie Lamport gave the field its founding definition. Forget timestamps; define an ordering purely from what a program does. Write a \rightarrow b, read "a happens-before b," and build it from exactly three rules:

These three rules stitch together a chain of causality that can hop between processes: a send links to its receive, program order links events within a process, and transitivity follows the chain as far as it goes. If a \rightarrow b, then a could possibly have affected b — information had a path to travel from one to the other.

Crucially, \rightarrow is a partial order, not a total one. Recall from relations as sets of ordered pairs that a relation is just a subset of pairs, and a partial order is one that is irreflexive (here nothing happens-before itself) and transitive — but need not connect every pair. Some pairs of events are simply unrelated: neither a \rightarrow b nor b \rightarrow a holds. We call such events concurrent, written a \parallel b. Concurrent does not mean "at the same instant" — it means causally independent: no chain of messages and program steps connects them either way, so neither could have influenced the other.

See it on a timeline

Here are three processes — P_1, P_2, P_3 — as vertical lifelines, with time flowing downward. Dots are events; slanted arrows are messages carrying causality from one lifeline to another. Each event is tagged with its Lamport timestamp (the integer we build in the next card). Step through it and watch the numbers grow along every causal chain.

Along the causal chain a \rightarrow b \rightarrow c \rightarrow e \rightarrow f the timestamps climb 1, 2, 3, 4, 5 — never flat, never backwards. But event d on P_3 also carries a 1, the same as a on P_1, even though the two events have nothing to do with each other. Hold that thought — it is the crack that the "Watch out!" box below pries open.

Lamport clocks: one integer per process

Lamport's clock is almost embarrassingly simple. Each process keeps a single integer counter L, starting at 0, and follows two rules:

The \max is the whole trick: receiving a message drags your clock up past the sender's, guaranteeing the receive is stamped later than the send. This buys the clock condition:

We can still squeeze a total order out of these integers when we need one (say, to agree on a single global sequence): break ties by process id. Define a \Rightarrow b iff L(a) < L(b), or L(a) = L(b) and a's process id is smaller. That tie-break is arbitrary but consistent with happens-before — it never contradicts a real causal arrow, it just invents an order for the concurrent pairs that had none. Handy for building things like distributed mutual exclusion, where every process must agree on some order of requests.

Run a Lamport clock yourself

The program below replays a small event log across three processes and computes every Lamport timestamp by hand, using only the two rules above. Sends are matched to receives by a message tag. Press Run ▶ and read the stamps down each process — then compare them to the timeline figure.

// A tiny Lamport-clock simulator over a fixed event log. // Each entry is one event, in the real order it occurred. // local : a private step on a process // send : process P emits a message with tag `msg` // recv : process P receives the message with tag `msg` type Kind = "local" | "send" | "recv"; interface Event { proc: string; kind: Kind; name: string; msg?: string; } const log: Event[] = [ { proc: "P1", kind: "local", name: "a" }, { proc: "P1", kind: "send", name: "b", msg: "m1" }, { proc: "P2", kind: "recv", name: "c", msg: "m1" }, { proc: "P3", kind: "local", name: "d" }, { proc: "P2", kind: "send", name: "e", msg: "m2" }, { proc: "P3", kind: "recv", name: "f", msg: "m2" }, ]; const clock: Record<string, number> = { P1: 0, P2: 0, P3: 0 }; const sent: Record<string, number> = {}; // msg tag -> timestamp when sent for (const ev of log) { if (ev.kind === "recv") { const t = sent[ev.msg as string]; // sender's stamp clock[ev.proc] = Math.max(clock[ev.proc], t) + 1; } else { clock[ev.proc] = clock[ev.proc] + 1; // local or send: just tick if (ev.kind === "send") sent[ev.msg as string] = clock[ev.proc]; } const note = ev.kind === "recv" ? " (max then +1)" : ""; console.log(ev.proc + " " + ev.name + " (" + ev.kind + ") -> L = " + clock[ev.proc] + note); }

Notice event d on P_3 lands at L = 1, tied with a on P_1. The single integer simply cannot tell you those two are unrelated. To capture concurrency exactly, we need more than one number.

Vector clocks: capturing causality exactly

A Lamport clock loses information: from two timestamps alone you can tell that a \rightarrow b forces L(a) < L(b), but you can't run the implication the other way. The fix, due to Fidge and Mattern, is to give each process not one counter but a whole vector — one slot per process.

With n processes, process P_i keeps a vector V of n integers, all starting at 0. Slot V[j] means "the number of events on P_j that I know about." The rules:

Compare two vectors component-wise. Say V \le W when V[k] \le W[k] for every k, and V < W when additionally they differ in at least one slot. Then the whole theory snaps into place:

Take our timeline. Order the slots as (P_1, P_2, P_3). Event a on P_1 has vector (1, 0, 0). The lonely event d on P_3 has (0, 0, 1). Compare them: slot P_1 says a is ahead (1 > 0), slot P_3 says d is ahead (1 > 0). Neither dominates — incomparable — so a \parallel d. The vectors prove the concurrency that the tied Lamport 1s could only hint at.

Meanwhile event f on P_3 — the end of the chain a \rightarrow b \rightarrow c \rightarrow e \rightarrow f — carries (1, 2, 2): it has heard of one event on P_1, two on P_2, and two on P_3 (including itself). Since (1,0,0) \le (1,2,2) and they differ, a \rightarrow f — read straight off the vectors.

Two clocks, two jobs

Both schemes track causality without any physical clock at all; they differ in what they preserve and what they cost.

They often don't — not by wall-clock time, anyway. Systems like Amazon's Dynamo attach a vector clock to each stored value. When a read finds two versions whose vectors are incomparable, the system knows the writes were genuinely concurrent — two clients updated the same key without seeing each other — and it cannot silently pick a winner. So it keeps both versions and hands the conflict up to the application (or a merge function) to resolve, exactly the way Git surfaces a merge conflict instead of guessing. Google's Spanner takes the opposite bet: it spends real money on GPS receivers and atomic clocks (its "TrueTime" API) to shrink clock uncertainty to a few milliseconds and then waits out that uncertainty before committing. Buy better physical clocks, or track logical causality — every distributed database picks a point on that spectrum.

The subtle mistake

This is the single most common error with logical clocks, so burn it in: with Lamport clocks, L(a) < L(b) tells you nothing on its own about whether a \rightarrow b. The clock condition only runs one way — causality forces increasing stamps, but increasing stamps can arise between two events with no causal link at all.

Concretely: in our timeline, event d on P_3 has L(d) = 1 and event c on P_2 has L(c) = 3. So L(d) < L(c) — yet d \not\rightarrow c: no chain of messages connects them, they are concurrent. If you sort events by Lamport timestamp and conclude "d therefore caused / preceded c," you have invented a causal link that does not exist. To actually decide happens-before, you need vector clocks, where incomparable vectors flag the concurrency that a lone integer hides.

The healthy mental model: a Lamport timestamp is a one-way witness. It confirms causality when you already know it exists (as a monotone number along a chain), but it can never establish causality or rule it out. Vector clocks are the two-way witness — and that extra power is exactly why they cost n integers instead of one.