The CAP Theorem

You run an online shop with two data centres — one in London, one in New York — each holding a copy of every shopping cart, so customers everywhere get fast answers. This morning, a backhoe somewhere under the Atlantic severs a fibre link. London and New York can each still talk to their own customers, but they can no longer talk to each other. The network has split in two.

Now a customer in London adds a laptop to her cart. London records it. A moment later the same customer's phone, routed to New York, asks "what's in my cart?" New York never heard about the laptop — the link is down. You are cornered into a choice. Either New York answers with a stale cart (missing the laptop) to stay responsive, or it refuses to answer until it can sync, to avoid lying. There is no third door. This forced dilemma, at the heart of every distributed system, is what the CAP theorem is about.

Three guarantees, defined precisely

CAP names three properties a distributed data store might promise. The words look friendly, so people wave at them loosely — but each has a sharp technical meaning, and the theorem only makes sense once you pin them down.

The subtlety that trips everyone up: consistency and availability are in tension only while a partition is happening. When the network is healthy, a well-built system can be both consistent and available at once. The whole drama plays out in the window when nodes are alive but cut off from one another.

The forced choice, made concrete

Picture two replicas, N1 and N2, that normally copy every write to each other. A partition drops the link between them, and a client writes a new value — say, "balance = $50" — to N1. N1 stores it, but the update cannot reach N2. Now a second client asks N2 to read the balance. N2 is stuck between two promises it cannot both keep:

That is the theorem in miniature: during the partition, N2 must either serve possibly-stale data (sacrifice C) or stop serving (sacrifice A). For a bank, showing a wrong balance and letting an overdraft slip through is unacceptable — you would rather the ATM say "service unavailable." For a shopping cart or a "likes" counter, a momentarily stale answer is harmless and losing sales to downtime is not — you would rather stay available and reconcile later. The right sacrifice depends entirely on what the data is for.

CP systems and AP systems

Because the choice is forced, real systems declare in advance which side they will land on during a partition. This gives two broad families.

Neither family is "better" — CP trades uptime for correctness, AP trades correctness for uptime, and each is right for different data. Many real databases even let you pick per-operation: read with a strong quorum when you need C, or with a single replica when you need speed and A.

Why "give up P" is a trap

The popular slogan — "CAP: pick any two of three" — invites you to imagine a "CA" system that keeps consistency and availability by giving up partition tolerance. That reading is misleading, and it is worth killing off explicitly.

"Give up P" sounds like a design option, but a partition is not a feature you enable — it is something the network does to you, whether you like it or not. Cables get cut, switches reboot, packets get lost, a whole rack loses its uplink. You do not get to decree that messages never go missing. So on any real, multi-node network, P is mandatory: the system will face partitions, and it must survive them somehow.

That means "CA" is not a runtime choice you can lean on — it only describes a system that simply assumes partitions never occur (for example, a single-node database with no network between replicas at all — there are no two nodes to partition). The instant you have two nodes that must communicate, a partition is possible, and you are back to the real fork: C or A, during the partition. The honest way to read CAP is not "pick 2 of 3" but "P is a given; pick C or A for when it strikes."

CAP only speaks about the partition case, which leaves a big question unanswered: what tradeoff does a system make the rest of the time, when the network is perfectly healthy? The PACELC theorem (Daniel Abadi, 2012) extends CAP to cover both. Read it as:

if there is a Partition, choose between Availability and Consistency (that's plain CAP); Else — when things are running normally — choose between Latency and Consistency.

The second half is the insight CAP misses: even with a flawless network, guaranteeing strong consistency means every write must wait for enough replicas to acknowledge it, which costs latency. A system can choose to reply faster by relaxing consistency even when nothing is broken. So Cassandra is "PA/EL" (available under partition, low-latency otherwise) while a strongly-consistent store like a Paxos-based database is "PC/EC" (consistent always, at the cost of latency). PACELC captures the everyday tradeoff that CAP's partition-only lens leaves out.