Deadlock & Livelock

Picture a busy four-way intersection at rush hour where the traffic lights have failed. A car edges into the box heading north, blocking the car that wants to go east. That eastbound car, now sitting in the box, blocks the one going south. The southbound car blocks the westbound one — and the westbound car blocks our original northbound driver. Every car is waiting for the car in front to move, and the car in front is you, four hops around the circle. Horns blare, nobody has crashed, nobody is breaking a rule — and yet not a single car can advance. Gridlock.

That is a deadlock: a ring of threads in which each one holds a resource the next one needs, and every one of them is frozen, waiting for a neighbour that will never let go. It is the dark side of the very locks and mutexes we reach for to keep concurrent code correct. Making a thread wait for a lock is exactly what stops two threads trampling each other's data — and exactly what can wedge them solid.

This page is about the three ways concurrent progress can stall even when no single thread has a bug: deadlock (a frozen cycle), livelock (threads busily doing nothing), and starvation (one thread perpetually shoved to the back of the queue). They are cousins, they are easy to confuse, and telling them apart is half the battle. (The operating-systems module treats deadlock detection and the banker's algorithm in depth under deadlock; here we take the concurrency-programmer's angle and add livelock and starvation to the picture.)

The smallest deadlock you can write

You do not need a philosophers' banquet to deadlock. Two threads and two locks will do it. Both threads need both locks. Thread A takes m1 then reaches for m2; thread B takes m2 then reaches for m1. Read the timeline down the middle:

Thread A Thread B --------------------------- --------------------------- lock(m1) ✓ got m1 lock(m2) ✓ got m2 lock(m2) … waiting for m2 lock(m1) … waiting for m1 ⤷ A holds m1, wants m2 — B holds m2, wants m1 ⤷ each waits for the other → FROZEN FOREVER

Notice how little it takes. No crash, no corrupted data, no obviously wrong line of code — just two perfectly correct threads that happened to grab the same two locks in the opposite order at the wrong instant. The interleaving is a matter of luck, which is what makes deadlocks so vicious: the program runs fine ten thousand times and wedges on the ten-thousand-and-first, on a customer's machine, at 3 a.m.

When exactly can it happen? The four Coffman conditions

In 1971 Edward Coffman and colleagues pinned down precisely what a deadlock requires. The elegant part is that all four conditions must hold at the same time — knock out any single one and deadlock becomes structurally impossible. That single fact is the lever behind every prevention strategy.

A deadlock can occur only if all four of these hold simultaneously:

Read them back against the gridlocked intersection: the box holds one car at a time (mutual exclusion); each car sits in its spot while wanting the next spot (hold and wait); no car can be lifted out by a crane (no preemption); and each waits on the next, all the way around (circular wait). Remove any one — a traffic cop who waves a car back (preemption), or a rule that you may not enter the box unless your exit is clear (no hold-and-wait) — and the jam dissolves. The magic number is not "fix all four"; it is "break any one."

Seeing the cycle: the wait-for graph

The clearest way to spot a deadlock is to draw it. In a resource-allocation graph we use two shapes and two kinds of arrow:

Step through the graph below. Thread A holds m_1 and wants m_2; thread B holds m_2 and wants m_1. Watch the last arrow close the loop, and then watch the final step break it by imposing a lock order:

The four arrows form a directed cycle A \to m_2 \to B \to m_1 \to A. With one unit of each lock, that cycle is the deadlock — a perfect picture of the circular-wait condition. For single-instance locks, detecting deadlock reduces to a question you already know how to answer: does this directed graph contain a cycle?

Prevention: break one pillar

Because all four Coffman conditions are needed at once, prevention just means making one of them structurally impossible. Three practical techniques a concurrency programmer actually reaches for, each attacking a different pillar:

Lock ordering is the one you should reach for first: it is cheap, static, and provably kills the cycle. The other two trade a guarantee for flexibility — and, as we'll see, back-off has a trap of its own.

The canonical deadlock story is Dijkstra's Dining Philosophers. Five philosophers sit around a table; between each pair is a single fork. To eat, a philosopher needs both neighbouring forks. If every philosopher picks up their left fork at once, every fork is held, every philosopher waits for their right fork — a five-way circular wait. Classic deadlock.

Two famous fixes map straight onto the pillars above. The resource hierarchy solution numbers the forks and makes everyone pick up the lower-numbered fork first — that is lock ordering, and it breaks the cycle because at least one philosopher (the one straddling the highest and lowest fork) reaches for them in the opposite direction. The waiter solution introduces an arbitrator who only lets a philosopher pick up forks if both are free — that negates hold and wait. Same dinner, two different pillars knocked out.

Run it: deadlock, then fix it with lock ordering

Our sandbox runs on a single thread, so it can't really hang — which is perfect, because it means we can script the exact interleaving that deadlocks and detect it by inspecting who-holds-what, rather than freezing the page. The program below plays the bad schedule (A takes m1, B takes m2, then each reaches for the other's lock) and reports the frozen state. Then it re-runs with a global lock ordering — both threads always take m1 before m2 — and shows it completes. Press Run ▶:

// A tiny model of two locks and two threads. Each lock records which // thread holds it (or null). A thread that can't get a lock is "blocked // and still wanting" it — we record that instead of actually hanging. interface Lock { name: string; heldBy: string | null; } function makeLock(name: string): Lock { return { name, heldBy: null }; } // Try to acquire `lock` for `who`. Returns true on success. If it's held // by someone else, `who` is now blocked wanting it (recorded by caller). function tryAcquire(lock: Lock, who: string): boolean { if (lock.heldBy === null) { lock.heldBy = who; return true; } return lock.heldBy === who; // re-entrant success, else blocked } // ---- Attempt 1: opposite lock orders -> DEADLOCK ------------------- function deadlockRun(): void { const m1 = makeLock("m1"); const m2 = makeLock("m2"); console.log("Attempt 1 — A: m1 then m2 | B: m2 then m1"); tryAcquire(m1, "A"); // A grabs m1 tryAcquire(m2, "B"); // B grabs m2 console.log(" A holds " + m1.heldBy + ", B holds " + m2.heldBy); const aGetsM2 = tryAcquire(m2, "A"); // A wants m2 (held by B) const bGetsM1 = tryAcquire(m1, "B"); // B wants m1 (held by A) if (!aGetsM2 && !bGetsM1) { console.log(" DEADLOCK: A holds m1 waiting m2; B holds m2 waiting m1"); console.log(" (each waits for a lock the other will never release)"); } else { console.log(" completed"); } } // ---- Attempt 2: global lock ordering (m1 before m2) -> SUCCESS ----- // Both threads take locks in the SAME order. Model B waiting at the // first lock (holding nothing) until A has finished and released. function orderedRun(): void { const m1 = makeLock("m1"); const m2 = makeLock("m2"); console.log("Attempt 2 — BOTH: m1 then m2 (global order)"); // Thread A runs its whole critical section first. tryAcquire(m1, "A"); tryAcquire(m2, "A"); console.log(" A holds m1 and m2, does its work..."); m2.heldBy = null; m1.heldBy = null; // A releases in reverse order console.log(" A released both"); // B was blocked at m1 (holding NOTHING), so no cycle was possible. tryAcquire(m1, "B"); tryAcquire(m2, "B"); console.log(" B now holds m1 and m2, does its work..."); m2.heldBy = null; m1.heldBy = null; console.log(" B released both — COMPLETED, no deadlock"); } deadlockRun(); console.log(""); orderedRun();

Same two locks, same two threads — only the order in which they acquire changed. In attempt 1 the opposite orders form a cycle; in attempt 2 the shared order means whoever loses the race for m1 simply waits at the first step, holding nothing, so there is nothing to deadlock on.

Livelock: busy, and going nowhere

Deadlock's sneaky sibling looks the opposite from the outside and yet achieves the same nothing. Picture two people meeting in a narrow corridor. You step left to let them pass; they step to their right — the same side — so you are face to face again. You both apologise, step the other way, and once more you mirror each other. You are both moving, both being polite, both responding — and neither of you gets past. That is a livelock.

In code, livelock is the trap in the try-lock-and-back-off scheme above. Two threads each grab their first lock, fail to get the second, politely release and retry — and if they retry in lockstep, they collide again on the next round, and the next, forever. Nobody is blocked. The threads are running flat out, the CPU is pegged at 100%, and progress is exactly zero.

The cure is to break the symmetry that keeps the two threads in perfect step. Add randomised back-off: after failing, each thread waits a random little while before retrying. Now the two retries almost never line up, one thread slips through first, and the dance ends. (Ethernet's collision handling and countless network retry loops use exactly this "exponential back-off with jitter" idea for the same reason.)

It is tempting to diagnose "the program is stuck, must be a deadlock." But a deadlocked program is quiet: its threads are asleep, CPU usage near zero, and a debugger shows them all parked inside lock(). A livelocked program is loud: CPU pinned at 100%, threads racing through the same retry loop over and over, making zero forward progress. Same symptom to the user ("it hangs"), opposite fingerprint under the hood. If your "hung" program is burning a whole core, suspect livelock — and reach for jitter, not for lock ordering.

Starvation: everyone eats but you

The third failure is subtler still, because the system as a whole is making progress — just not for one unlucky thread. Starvation is when a thread is perpetually denied a resource it needs, while other threads keep sailing through. It isn't frozen and it isn't spinning uselessly; it is simply, endlessly, passed over.

The usual culprit is unfair scheduling. Imagine a lock that, whenever it is released, is handed to the highest-priority waiter. A steady stream of high-priority threads arrives, and a lone low-priority thread that has been waiting since the start never gets its turn — there is always someone "more important" ahead of it. Or a reader/writer lock that always favours readers: as long as one reader is always reading, a writer waiting to update can wait indefinitely.

The fix is fairness: guarantee bounded waiting so that once a thread asks for a resource, only a limited number of others can jump ahead of it before it is served. A first-come, first-served (FIFO) queue on the lock, or a fair lock that hands ownership to the longest waiter, turns "might wait forever" into "waits at most n-1 turns." Fairness costs a little throughput, but it is the price of never abandoning a thread.

A close relative of starvation is priority inversion. A low-priority thread grabs a lock; a high-priority thread then needs the same lock and blocks — so the "important" thread is now waiting on the "unimportant" one. Worse, if medium-priority threads keep preempting the low-priority holder, it never runs, never releases the lock, and the high-priority thread is stuck behind it indefinitely.

This famously struck NASA's 1997 Mars Pathfinder: a high-priority bus-management task kept missing its deadline because a low-priority task held a shared lock while being starved by medium-priority work, and the watchdog timer rebooted the spacecraft — repeatedly, millions of miles away. The fix, applied by radio uplink, was priority inheritance: while a high-priority thread waits on a lock, the holder temporarily inherits that high priority so it can finish and release quickly. A tidy patch to a very expensive lesson.

Three ways to make no progress

It's worth holding all three side by side, because the everyday symptom — "it's hung" — is the same, while the cause and the cure differ completely:

FailureAre the threads running?Who makes progress?Typical cure
Deadlock No — blocked on locks Nobody in the cycle Break a Coffman condition (e.g. lock ordering)
Livelock Yes — spinning at 100% CPU Nobody — they keep reacting Break the symmetry (randomised back-off / jitter)
Starvation Yes — the system runs Some do; one is perpetually denied Fairness / bounded waiting (FIFO, fair locks)

Deadlock is a property of a whole set of threads at once; livelock and starvation are about a thread (or two) failing to advance while, in starvation's case, the world moves on without them. Knowing which one you're looking at tells you which tool to grab.