Replication and Consistency

A single database on a single machine is a comfortable place to live. It has one copy of the truth, and if you wrap your changes in a transaction, that copy moves from one consistent state to the next, atomically, in isolation. Lovely. Now unplug that machine. Your entire service is gone — every read, every write, until someone drives to the data centre. And even while it was running, a user in Sydney was waiting 200 milliseconds for every request to cross the Pacific to your one server in Virginia.

The cure for both problems is the same: keep more than one copy. This is replication — storing the same data on several machines (called replicas) — and it buys you four things at once:

Fault tolerance — one replica dies, the others carry on; no data lost.
Availability — the system keeps answering even while a machine is down or being upgraded.
Read scalability — ten replicas can serve ten times the read traffic, because any of them can answer a read.
Low latency — put a replica near your users, and their reads travel metres instead of oceans.

There is no free lunch, of course. The instant you have two copies of a value and someone writes a new one, the copies disagree until the change propagates. The whole art of this page is a single question: how much disagreement will you tolerate, and what does the reader see while the copies catch up? That question is answered by choosing a consistency model.

The consistency spectrum

Consistency models form a spectrum, from strict-and-slow to loose-and-fast. The two anchors at the ends are the ones to know cold.

At the strict end is strong consistency, whose gold standard is linearizability: the system behaves as if there were a single copy of the data. Every read returns the value of the most recent write, and once a write is acknowledged, no subsequent reader anywhere can ever see the old value again. Reads and writes appear to take effect instantaneously, in a single global order that everybody agrees on. This is exactly the single-node illusion we are trying to preserve across many machines — and it is expensive, because the replicas must coordinate on every operation before answering.

At the loose end is eventual consistency: if writes stop, all replicas will eventually converge to the same value — but in the meantime they may hand out different, stale answers. A write to one replica trickles out to the others in the background; a read that lands on a replica the update has not yet reached sees the old value. The promise is only about the destination, not the journey: given enough quiet time, everyone agrees. This is cheap and fast, because a replica can answer immediately without asking anyone.

Strong (linearizable) — the system acts like one copy; every read sees the latest acknowledged write, in one global order. Correct, but requires coordination on every operation.
Eventual — replicas may temporarily disagree, but converge once writes stop. Fast and available, but a reader can see stale data.

Between the extremes live useful half-measures — session guarantees that fix the most jarring surprises without paying full price:

Read-your-writes — after you write something, your own later reads are guaranteed to see it (even if other users briefly do not). No more posting a comment and watching it vanish on refresh.
Monotonic reads — once you have seen a value, you never see an older one on a later read. Time never appears to run backwards for a single reader.

Which one do you actually need?

The honest answer is: it depends on the data, and the same application usually wants both in different places. Picture two numbers on a social app.

The like counter on a post is a perfect fit for eventual consistency. If you see 1{,}000 likes and your friend momentarily sees 1{,}001, nobody is harmed; a second later both screens read the same number. In exchange, the counter is enormously fast and stays up even when machines fail. Trading a flicker of disagreement for speed and availability is a bargain here.

A bank balance is the opposite. If the ledger says \$100 and you withdraw \$100, no other replica may still answer "\$100 available" and let a second withdrawal through. Money demands strong consistency: every read must reflect the latest committed write, because a stale read here creates money out of nothing. This is the single-node ACID guarantee — atomic, isolated, durable state transitions — and replication must not quietly break it.

So the choice is not "which is better" but "what is the cost of a stale read for this piece of data?" A wrong like-count costs nothing; a wrong balance costs a lawsuit.

The name scares people into thinking eventual consistency is a euphemism for "sometimes we lose your data" or "the numbers are just wrong." Neither is true. Every acknowledged write is durably stored and will reach every replica; nothing is discarded. The only thing "eventual" describes is timing — there is a brief window where different replicas are at different points in the same, agreed-upon history, so a read might land on one that is a beat behind. Once the writes stop, that window closes and everyone converges to the same value. It is stale-for-a-moment, not lost and not wrong. (Well-designed systems even bound that moment and reconcile conflicting concurrent writes with rules like last-writer-wins or mergeable data types, so the convergence is deterministic, not a coin flip.)

Quorums: buying overlap with arithmetic

Here is the elegant trick that lets you tune where you sit on the spectrum, without coordinating on every operation. Suppose you have N replicas of a value. Instead of writing to all of them or reading from just one, you pick two numbers:

W — the write quorum: a write is not acknowledged until at least W replicas have stored it.
R — the read quorum: a read asks R replicas and takes the newest value among the answers.

Now the magic. If we insist that

W + R > N,

then the set of replicas a write touched and the set a read touches must overlap in at least one replica — there is simply no room for them to be disjoint. And that one shared replica has the latest write, so the read is guaranteed to see it. This is the pigeonhole principle wearing a database hat: two sets whose sizes add up to more than the whole cannot avoid each other.

The figure below makes the overlap literal. Five replicas sit in a ring. A write lands on W = 3 of them; a later read contacts R = 3 of them. Because 3 + 3 = 6 > 5, they cannot dodge one another — they share a replica, and the read sees the write. Step through it:

A worked example, and the tuning dial

Take the common default N = 3 and try a few settings.

W = 2, R = 2. Then W + R = 4 > 3 = N. A write reaches 2 of the 3 replicas; a read asks 2 of the 3. Two pairs drawn from three replicas must share at least one — overlap guaranteed, so every read sees the latest acknowledged write. This is the classic strongly-consistent quorum, and it still tolerates one replica being down.
W = 1, R = 1. Then W + R = 2 \not> 3. The write might land only on replica A while the read asks only replica C — no overlap, so the read can return a stale value. Blazing fast and highly available (you only ever wait for one replica), but you have opted into eventual consistency.

So W and R are a dial. Crank R up (read from more replicas) for fresher reads at the cost of slower reads; crank W up for stronger write guarantees at the cost of slower writes. The condition W + R > N is the line between "reads may be stale" and "reads are guaranteed fresh."

There is a second, separate condition worth its own line. To stop two different writes from both being acknowledged at the same time (which would leave two conflicting "latest" values), we also want write quorums to overlap each other:

W > \tfrac{N}{2}.

If every write needs a strict majority, two writes cannot both succeed on disjoint sets — one of them must fail or wait, so there is always a single agreed latest value. With N = 3 that means W \ge 2; with N = 5, W \ge 3. Majority quorums are why so many systems default to odd N.

It is tempting to think "if I add more copies, my data gets more consistent." The opposite is closer to the truth. Consistency comes from the overlap W + R > N, not from the raw number of replicas. If you double N from 3 to 6 but keep W = R = 2, then W + R = 4 \not> 6 — you have lost your consistency guarantee, because a read and a write can now easily miss each other among the extra copies. More replicas give you more fault tolerance and more read throughput, but they force you to raise W and/or R to keep the same freshness. Replication is a set of dials that trade against each other, not a single "more is better" slider.

Compute the overlap yourself

The whole guarantee reduces to one question: does the set of replicas a write reached share any member with the set a read touches? If yes, the read sees the write; if no, it might be stale. The program below models a few (N, W, R) configurations as explicit sets of replica indices, checks both the arithmetic test W + R > N and the actual overlap, and reports whether a stale read is possible. Press Run ▶:

// A quorum system has N replicas (indices 0 .. N-1). A write is stored by // the replicas in `writeSet`; a later read contacts those in `readSet`. // The read is guaranteed to see the write ONLY IF the two sets overlap. function overlaps(writeSet: number[], readSet: number[]): boolean { return writeSet.some((replica) => readSet.includes(replica)); } interface Config { n: number; writeSet: number[]; readSet: number[]; } const configs: Config[] = [ { n: 3, writeSet: [0, 1], readSet: [1, 2] }, // N=3 W=2 R=2 -> 4 > 3 { n: 3, writeSet: [0], readSet: [2] }, // N=3 W=1 R=1 -> 2 <= 3, disjoint { n: 5, writeSet: [0, 1, 2], readSet: [2, 3, 4] }, // N=5 W=3 R=3 -> 6 > 5 { n: 5, writeSet: [0, 1], readSet: [3, 4] }, // N=5 W=2 R=2 -> 4 <= 5, disjoint ]; console.log("N W R W+R>N? sets overlap? result"); console.log("-------------------------------------------------------"); for (const c of configs) { const w = c.writeSet.length; const r = c.readSet.length; const arithmeticGuarantee = w + r > c.n; // the W + R > N test const reallyOverlaps = overlaps(c.writeSet, c.readSet); const verdict = reallyOverlaps ? "fresh read guaranteed" : "STALE READ POSSIBLE"; console.log( c.n + " " + w + " " + r + " " + (arithmeticGuarantee ? "yes " : "no ") + " " + (reallyOverlaps ? "yes " : "no ") + " " + verdict, ); } console.log(""); console.log("Whenever W + R > N is 'yes', the sets MUST overlap -> always fresh."); console.log("When it is 'no', overlap is not guaranteed -> a stale read can slip through.");

Notice the pattern in the output: every row where W + R > N holds also overlaps — that is the guarantee, not a coincidence. The rows where it fails are exactly the ones where a stale read can sneak in.