Distributed Systems

The internet, the cloud, and every large service you use are really many computers pretending to be one. Distributed systems is the study of how to make that illusion work when the machines are far apart, the network drops messages, clocks disagree, and any component can fail at any moment. Its famous, humbling lesson: some things you take for granted on one machine become provably impossible across many.

This branch covers system and failure models, logical clocks, remote procedure calls, replication and consistency, the CAP theorem, consensus (Paxos and Raft), fault tolerance, and the data-parallel model behind MapReduce. It builds on networks and operating systems.