The CAP Theorem

You run an online shop with two data centres — one in London, one in New York — each holding a copy of every shopping cart, so customers everywhere get fast answers. This morning, a backhoe somewhere under the Atlantic severs a fibre link. London and New York can each still talk to their own customers, but they can no longer talk to each other. The network has split in two.

Now a customer in London adds a laptop to her cart. London records it. A moment later the same customer's phone, routed to New York, asks "what's in my cart?" New York never heard about the laptop — the link is down. You are cornered into a choice. Either New York answers with a stale cart (missing the laptop) to stay responsive, or it refuses to answer until it can sync, to avoid lying. There is no third door. This forced dilemma, at the heart of every distributed system, is what the CAP theorem is about.

Three guarantees, defined precisely

CAP names three properties a distributed data store might promise. The words look friendly, so people wave at them loosely — but each has a sharp technical meaning, and the theorem only makes sense once you pin them down.

Consistency (C) — every read returns the result of the most recent completed write, as if there were a single up-to-date copy of the data. Formally this is linearizability: all clients see operations in one agreed order, and once a write is acknowledged, no one can ever read the old value again. (This is not the "C" of ACID — same letter, different promise.)
Availability (A) — every request sent to a non-failing node gets a (non-error, non-timeout) response. The node is not allowed to hang or reply "try again later" — it must actually answer, with some value.
Partition tolerance (P) — the system keeps operating even when the network drops or arbitrarily delays messages between nodes, splitting them into groups that cannot reach each other. A partition is exactly this loss of communication between otherwise-alive nodes.

The subtlety that trips everyone up: consistency and availability are in tension only while a partition is happening. When the network is healthy, a well-built system can be both consistent and available at once. The whole drama plays out in the window when nodes are alive but cut off from one another.

When a network partition occurs, a distributed system can guarantee consistency or availability, but not both.
In a real network, partitions will happen — so P is not optional. The practical decision is therefore not "pick 2 of 3" but "when (not if) a partition strikes, do we sacrifice C or A?"

The forced choice, made concrete

Picture two replicas, N1 and N2, that normally copy every write to each other. A partition drops the link between them, and a client writes a new value — say, "balance = $50" — to N1. N1 stores it, but the update cannot reach N2. Now a second client asks N2 to read the balance. N2 is stuck between two promises it cannot both keep:

Answer anyway (choose A) — N2 replies with the value it has, "$100", which is now stale. The system stayed available, but two clients just saw two different balances: consistency is broken.
Refuse to answer (choose C) — N2 says "I can't confirm I'm up to date, come back later" (an error or a timeout). No one is ever handed a wrong number, so consistency holds — but N2 stopped being available.

That is the theorem in miniature: during the partition, N2 must either serve possibly-stale data (sacrifice C) or stop serving (sacrifice A). For a bank, showing a wrong balance and letting an overdraft slip through is unacceptable — you would rather the ATM say "service unavailable." For a shopping cart or a "likes" counter, a momentarily stale answer is harmless and losing sales to downtime is not — you would rather stay available and reconcile later. The right sacrifice depends entirely on what the data is for.

CP systems and AP systems

Because the choice is forced, real systems declare in advance which side they will land on during a partition. This gives two broad families.

CP — consistency over availability. When the network splits, the system keeps its data consistent by refusing requests it cannot safely serve. A common design uses a majority quorum: only the side holding more than half the nodes may accept writes; the minority side rejects them rather than diverge. If you're stranded on the minority side, you get errors — consistency preserved, availability sacrificed. Examples: etcd, ZooKeeper, HBase, and consensus-based stores built on consensus algorithms such as Paxos or Raft.
AP — availability over consistency. Every reachable node keeps answering, even in the minority, accepting reads and writes on both sides of the split. The copies drift apart, and the system reconciles them later — merging versions once the partition heals, using techniques like vector clocks or last-write-wins. This is the Dynamo lineage: Amazon Dynamo, Cassandra, Riak. You always get an answer; it just might be temporarily out of date (eventual consistency).

Neither family is "better" — CP trades uptime for correctness, AP trades correctness for uptime, and each is right for different data. Many real databases even let you pick per-operation: read with a strong quorum when you need C, or with a single replica when you need speed and A.

Why "give up P" is a trap

The popular slogan — "CAP: pick any two of three" — invites you to imagine a "CA" system that keeps consistency and availability by giving up partition tolerance. That reading is misleading, and it is worth killing off explicitly.

"Give up P" sounds like a design option, but a partition is not a feature you enable — it is something the network does to you, whether you like it or not. Cables get cut, switches reboot, packets get lost, a whole rack loses its uplink. You do not get to decree that messages never go missing. So on any real, multi-node network, P is mandatory: the system will face partitions, and it must survive them somehow.

That means "CA" is not a runtime choice you can lean on — it only describes a system that simply assumes partitions never occur (for example, a single-node database with no network between replicas at all — there are no two nodes to partition). The instant you have two nodes that must communicate, a partition is possible, and you are back to the real fork: C or A, during the partition. The honest way to read CAP is not "pick 2 of 3" but "P is a given; pick C or A for when it strikes."

CAP only speaks about the partition case, which leaves a big question unanswered: what tradeoff does a system make the rest of the time, when the network is perfectly healthy? The PACELC theorem (Daniel Abadi, 2012) extends CAP to cover both. Read it as:

if there is a Partition, choose between Availability and Consistency (that's plain CAP); Else — when things are running normally — choose between Latency and Consistency.

The second half is the insight CAP misses: even with a flawless network, guaranteeing strong consistency means every write must wait for enough replicas to acknowledge it, which costs latency. A system can choose to reply faster by relaxing consistency even when nothing is broken. So Cassandra is "PA/EL" (available under partition, low-latency otherwise) while a strongly-consistent store like a Paxos-based database is "PC/EC" (consistent always, at the cost of latency). PACELC captures the everyday tradeoff that CAP's partition-only lens leaves out.