Monitors & Condition Variables

Walk into a small doctor's surgery. There is one office, and the rule is simple: only one patient may be with the doctor at a time. When you arrive you sit in the waiting room until the office is free; then you go in. But sometimes you go in and the doctor says, "your blood test isn't back yet — go and wait in room B, I'll call you the moment it arrives." You leave the office (freeing it for the next patient!), sit in that particular waiting room, and doze until a nurse calls your name.

That surgery is a monitor. The one-patient-at-a-time office is mutual exclusion, enforced automatically. Room B — a queue you sleep in until a specific condition ("your results are back") becomes true — is a condition variable. Being called is a signal. This page is about that one idea: a monitor — shared data, the lock that guards it, and condition variables — bundled into a single construct so that synchronisation is built in rather than bolted on with raw semaphores.

What a monitor bundles together

With raw semaphores or a bare mutex, the lock and the data it protects are separate things you must remember to pair correctly. Forget one lock() on one code path and you have a race; forget one unlock() and you have a frozen program. A monitor removes the temptation to forget by wrapping three things into one object:

Shared data — the state being protected (a buffer, a counter, a bank balance), reachable only from inside the monitor's methods.
An implicit lock — every monitor method acquires it on entry and releases it on exit, automatically. At most one thread is ever active inside the monitor.
Condition variables — named wait queues, each tied to the monitor's lock, on which a thread can sleep until some other thread signals that a condition may now hold.

The payoff is that mutual exclusion is no longer your job. You do not write lock() / unlock(); you just declare that a method belongs to the monitor and the construct guarantees only one thread runs it at a time. In Java, that is literally the synchronized keyword; the object's built-in lock is the monitor.

    // A monitor, sketched. Every method implicitly acquires the SAME lock on
    // entry and releases it on exit, so the body is always a critical section.
    class Counter {
      private count = 0;               // shared data, reachable only through methods

      increment(): void {              // implicitly: lock; ... ; unlock
        this.count = this.count + 1;   // guaranteed exclusive, no race possible
      }

      read(): number {
        return this.count;
      }
    }

Condition variables: sleeping inside the monitor

Mutual exclusion alone is not enough. Often a thread gets into the monitor only to discover it can't proceed yet: a consumer finds the buffer empty; a producer finds it full. It must wait — but if it just spins holding the lock, no other thread can ever get in to change the thing it is waiting for. That is a deadlock of its own making. The condition variable is the way out. It offers three operations:

wait(): atomically release the monitor lock and put this thread to sleep on the condition's queue. When later woken, it re-acquires the lock before wait() returns. The atomic release-and-sleep is the whole point: there is no gap between letting go of the lock and joining the queue in which a signal could be missed.
signal() (a.k.a. notify()): wake one thread waiting on this condition. If none is waiting, it does nothing (the signal is not remembered).
broadcast() (a.k.a. notifyAll()): wake all threads waiting on this condition. Use it when more than one waiter might now be able to proceed, or when you're not sure which one should.

The crucial detail is that wait() gives up the lock. A sleeping waiter is not hogging the monitor — the office is free for the next patient, who is exactly the thread that will eventually make the waiter's condition true and signal it. In Java these are wait(), notify(), notifyAll() on any object; in POSIX C they are pthread_cond_wait, pthread_cond_signal, pthread_cond_broadcast paired with a pthread_mutex_t.
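The three operations can be pictured as plain queue manipulation. The sketch below is a single-threaded model, not a real synchronisation primitive: ConditionModel and the thread-name strings are hypothetical names introduced for illustration, "sleeping" is represented only by membership of the queue, and the first-in-first-out wake order is an assumption of the model (real libraries do not promise which waiter a signal picks).

```typescript
// A single-threaded MODEL of a condition variable's wait queue.
// ConditionModel is a hypothetical name; it tracks which threads are asleep
// and does not block anything.
class ConditionModel {
  private readonly waiters: string[] = [];

  // wait(): the caller gives up the monitor lock and joins the queue.
  wait(thread: string): void {
    this.waiters.push(thread);
  }

  // signal(): wake one waiter, if any. With no waiter the signal is simply
  // lost; nothing is recorded for later.
  signal(): string | undefined {
    return this.waiters.shift();
  }

  // broadcast(): wake every waiter; each must re-acquire the lock in turn.
  broadcast(): string[] {
    return this.waiters.splice(0, this.waiters.length);
  }

  sleeping(): readonly string[] {
    return this.waiters;
  }
}

const cv = new ConditionModel();
const lostSignal = cv.signal();     // nobody waiting: returns undefined
cv.wait("T1");
cv.wait("T2");
cv.wait("T3");
const wokenOne = cv.signal();       // wakes T1 only
const wokenRest = cv.broadcast();   // wakes T2 and T3
console.log(lostSignal, wokenOne, wokenRest);
```

The first signal() returns nothing because no thread was waiting yet, which is the "not remembered" behaviour described in the list.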

Watch one thread wait and get signalled

The timeline here models a monitor with one active slot (the office), an entry queue (threads waiting to acquire the monitor), and a condition-variable queue we'll call notReady (threads that called wait() and gave up the lock). Read each step as a snapshot in time: T1 enters and finds its condition false, so it calls wait(), which releases the monitor and lets T2 in; T2's signal then moves T1 back to the entry queue; finally, once T2 leaves, T1 re-acquires the lock and resumes.

Notice the two queues are different. The entry queue holds threads that want the lock and will take it as soon as it is free. The notReady queue holds threads that had the lock, called wait(), and gave it back; they are asleep and not competing for the lock at all. A signal moves a thread from the second queue to the first: it makes the thread runnable, not running.
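The same timeline can be replayed as a scripted, single-threaded trace. This is a sketch under stated assumptions: MonitorTrace and its method names are hypothetical, the model records only which queue each thread is in, and it follows the "runnable, not running" signalling described above.

```typescript
// Scripted replay of one wait/signal cycle. MonitorTrace is a hypothetical
// model: it records queue membership only and blocks nothing.
class MonitorTrace {
  active: string | null = null;         // the one thread inside the monitor
  readonly entryQueue: string[] = [];   // runnable, waiting for the lock
  readonly notReady: string[] = [];     // asleep after calling wait()
  readonly log: string[] = [];

  private snapshot(what: string): void {
    this.log.push(
      what + " | active=" + (this.active ?? "-") +
      " entry=[" + this.entryQueue.join(",") + "]" +
      " notReady=[" + this.notReady.join(",") + "]"
    );
  }

  // Hand the free monitor to the next thread in the entry queue, if any.
  private admitNext(): void {
    this.active = this.entryQueue.shift() ?? null;
  }

  enter(t: string): void {
    if (this.active === null) this.active = t;
    else this.entryQueue.push(t);
    this.snapshot(t + " enter");
  }

  wait(): void {                        // active thread sleeps, releasing the lock
    const t = this.active;
    if (t === null) throw new Error("wait() called outside the monitor");
    this.notReady.push(t);
    this.admitNext();
    this.snapshot(t + " wait()");
  }

  signal(): void {                      // waiter becomes runnable, not running
    const signaller = this.active;
    if (signaller === null) throw new Error("signal() called outside the monitor");
    const woken = this.notReady.shift();
    if (woken !== undefined) this.entryQueue.push(woken);
    this.snapshot(signaller + " signal()");
  }

  exit(): void {
    const t = this.active;
    if (t === null) throw new Error("exit() called outside the monitor");
    this.admitNext();
    this.snapshot(t + " exit");
  }
}

const trace = new MonitorTrace();
trace.enter("T1");   // T1 takes the office
trace.enter("T2");   // T2 queues at the door
trace.wait();        // T1 finds its condition false: sleeps, T2 gets in
trace.signal();      // T2 makes the condition true and signals
trace.exit();        // T2 leaves; T1 re-acquires the lock and resumes
trace.exit();        // T1 re-checks, finishes, leaves
console.log(trace.log.join("\n"));
```

After the signal() step T1 sits in the entry queue while T2 is still active, which is the distinction between the two queues in miniature.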

Hoare vs. Mesa: who holds the lock right after a signal?

When thread T2 signals and thread T1 was waiting, there is a genuine ambiguity: T2 is still running inside the monitor, and now T1 wants to be too — but only one may be active. Two classic answers exist, and which one your system uses changes how you must write your code.

Hoare semantics — signal() immediately hands the lock (and the CPU) straight to the woken waiter, which resumes at once. The signaller blocks until the waiter gives the monitor back. The waiter is therefore guaranteed the condition still holds — but the implementation needs extra bookkeeping and context switches.
Mesa semantics — signal() merely marks the waiter runnable; the signaller keeps the lock and keeps running. The woken thread rejoins the queue for the lock and only re-enters later. By then, other threads may have run and changed the state, so the condition may no longer hold.

Almost every real system — Java, pthreads, C#, Python — uses Mesa semantics, because it is simpler and faster to implement. And Mesa has one iron consequence that every undergraduate must burn into memory: a woken thread must re-check the condition itself, because being signalled is only a hint that the condition might be true, not a promise that it is.
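To make the difference concrete, here is a small scripted comparison. It is a sketch, not a model of any particular runtime: the function name, the thread names and the chosen schedule (a second consumer C2 already queued for the lock when the producer signals) are all assumptions made for illustration, and a first-in-first-out lock queue is only one of the schedules a Mesa-style system may produce.

```typescript
type Semantics = "hoare" | "mesa";

// Returns how many items the signalled consumer C1 finds when it resumes.
// Hypothetical schedule: C1 is asleep on notEmpty, C2 is already queued for
// the monitor lock, and producer P is inside the monitor.
function itemsSeenByWokenWaiter(semantics: Semantics): number {
  let items = 0;
  const entryQueue: string[] = ["C2"];

  items += 1;                           // P adds one item, then signals C1

  if (semantics === "hoare") {
    // The lock passes straight to C1; nothing can run in between.
    return items;
  }

  // Mesa: C1 only becomes runnable and joins the queue behind C2.
  entryQueue.push("C1");
  let seenByC1 = -1;
  for (const t of entryQueue) {         // P exits; queued threads run in order
    if (t === "C1") seenByC1 = items;   // what C1 observes on resuming
    else if (items > 0) items -= 1;     // C2 gets in first and takes the item
  }
  return seenByC1;
}

console.log("hoare:", itemsSeenByWokenWaiter("hoare"));
console.log("mesa: ", itemsSeenByWokenWaiter("mesa"));
```

Under the Hoare branch C1 sees the item it was signalled about; under the Mesa branch it sees an empty buffer, so it has to test the condition again before acting on it.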

The golden rule: `while`, never `if`

This single line is the most important thing on the page. Guard a condition-variable wait with a while loop, not an if:

    // RIGHT: re-checks the condition after every wakeup.
    while (!ready) {
      notReady.wait();   // when this returns, the loop RE-TESTS !ready
    }
    // ... here, ready is guaranteed true ...

    // WRONG: tests the condition only once, before ever sleeping.
    if (!ready) {
      notReady.wait();   // when this returns, we blindly assume ready is true
    }
    // ... here, ready MIGHT be false: bug! ...

There are two independent reasons the while is mandatory, and either one alone would be enough:

Mesa semantics: a signalled thread does not resume at once. It rejoins the queue for the lock behind the signaller, and before its wait() returns another thread may grab the lock first and consume the very thing it was signalled about. The state you were promised is gone. Re-check, and if it's not there, wait again.
Spurious wakeups: real thread libraries (POSIX explicitly permits this) may return from wait() occasionally with no matching signal at all, for efficiency reasons deep in the implementation. A while loop simply notices the condition is still false and goes back to sleep; an if marches on into corrupt state.

The rule is beautifully robust: while (!condition) cv.wait(); is correct under Hoare semantics, under Mesa semantics, and in the presence of spurious wakeups. There is no situation in which the if is safer, so the while is simply always right.

Beginners write if (bufferEmpty) notEmpty.wait(); because it reads like English and usually works, which is exactly what makes it dangerous. Under Mesa semantics (i.e. Java, pthreads, almost everything), a woken consumer can find the item already taken by another consumer that raced ahead of it; with an if, it never re-checks and happily "consumes" an item that isn't there, reading past the end of a buffer or driving a count negative. Add spurious wakeups and it can misfire even when every signal in your program is correct. The fix costs one keyword: always guard wait() with a while loop that re-tests the exact condition. Never if.

Run it: why `while` beats `if`

The sandbox is single-threaded, so we can script the exact schedule and force the bug to happen every time. Two consumers C1 and C2 both find the buffer empty and wait(). A producer adds one item and calls notifyAll(), so both consumers wake, but there is only one item to go round. The only difference between the two runs is what a woken consumer does before consuming: an if just barrels ahead; a while re-checks first:

    // Two consumers were parked on notEmpty; a producer added ONE item and
    // called notifyAll(), waking both. Under Mesa semantics each woken
    // consumer re-runs; the guard decides whether it re-checks first.
    type Guard = "if" | "while";

    function drain(guard: Guard): void {
      let items = 1; // producer just added one item and notifyAll()'d

      // C1 and C2 were both signalled and now run one after another.
      for (const c of ["C1", "C2"]) {
        const empty = items === 0;
        if (guard === "while" && empty) {
          // while-loop re-checks: nothing here, so go back to sleep.
          console.log(" " + c + " woke, re-checks: items=0 => wait() again (no consume)");
          continue;
        }
        // if-guard: never re-checks. while-guard: item still present.
        items = items - 1;
        const bug = items < 0 ? " <-- BUG: consumed an item that isn't there!" : "";
        console.log(" " + c + " consumes => items=" + items + bug);
      }

      const verdict = items < 0 ? "CORRUPT (buffer underflow)" : "ok";
      console.log(" final items = " + items + " [" + verdict + "]");
    }

    console.log("Guard the wait with an if (WRONG):");
    drain("if");
    console.log("");
    console.log("Guard the wait with a while (RIGHT):");
    drain("while");

The if run drives items to -1: the second consumer was signalled, never re-checked, and took an item that C1 had already consumed. The while run has C2 notice the buffer is empty again and go back to sleep — no corruption. Same schedule, one keyword's difference.

Producer–consumer, re-solved with a monitor

The classic bounded-buffer problem needs two conditions: producers wait when the buffer is full, consumers wait when it is empty. So the monitor carries two condition variables, notFull and notEmpty. Compare this with the raw semaphore solution, where you juggle a mutex plus two counting semaphores by hand and a single misordered wait/signal deadlocks the lot. The monitor version simply reads like the English description of the problem:

    // Bounded buffer as a monitor. Every method is implicitly mutually exclusive;
    // the two condition variables handle the two ways a thread can have to wait.
    class BoundedBuffer<T> {
      private readonly buf: T[] = [];
      private readonly capacity: number;
      private readonly notFull = new Condition();    // producers wait here
      private readonly notEmpty = new Condition();   // consumers wait here

      constructor(capacity: number) {
        this.capacity = capacity;
      }

      put(item: T): void {                             // implicitly locked
        while (this.buf.length === this.capacity) {    // WHILE, not if
          this.notFull.wait();                         // full: release lock & sleep
        }
        this.buf.push(item);
        this.notEmpty.signal();                        // a consumer may now proceed
      }

      take(): T {                                      // implicitly locked
        while (this.buf.length === 0) {                // WHILE, not if
          this.notEmpty.wait();                        // empty: release lock & sleep
        }
        const item = this.buf.shift() as T;
        this.notFull.signal();                         // a producer may now proceed
        return item;
      }
    }

Two while loops, two signals, and not a single manual lock() in sight. Each method says exactly what it means: wait until there is room, then add and wake a consumer; wait until there is an item, then remove and wake a producer. That readability — synchronisation folded into the data type — is the whole reason monitors exist.
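The monitor class relies on a Condition type and on implicit locking, neither of which TypeScript provides. One way to get a runnable approximation, assuming a single-threaded JavaScript runtime, is to let the event loop stand in for the monitor lock: code between two await points runs without interruption, and awaiting a promise plays the part of "release the lock and sleep". AsyncCondition, AsyncBoundedBuffer and demo are hypothetical names for this sketch, which is not a substitute for real multi-threaded synchronisation.

```typescript
// Promise-based sketch. Assumption: single-threaded JavaScript, where code
// between two awaits runs uninterrupted, so the event loop itself stands in
// for the monitor lock. AsyncCondition is a hypothetical helper.
class AsyncCondition {
  private readonly waiters: Array<() => void> = [];

  // Suspend the caller until signalled. Awaiting the returned promise yields
  // control, which is this model's equivalent of releasing the monitor lock.
  wait(): Promise<void> {
    return new Promise<void>((resolve) => {
      this.waiters.push(() => resolve());
    });
  }

  signal(): void {
    const wake = this.waiters.shift();
    if (wake !== undefined) wake();     // no waiter: the signal is lost
  }
}

class AsyncBoundedBuffer<T> {
  private readonly buf: T[] = [];
  private readonly capacity: number;
  private readonly notFull = new AsyncCondition();
  private readonly notEmpty = new AsyncCondition();

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  async put(item: T): Promise<void> {
    while (this.buf.length === this.capacity) {   // WHILE, not if
      await this.notFull.wait();
    }
    this.buf.push(item);
    this.notEmpty.signal();
  }

  async take(): Promise<T> {
    while (this.buf.length === 0) {               // WHILE, not if
      await this.notEmpty.wait();
    }
    const item = this.buf.shift() as T;
    this.notFull.signal();
    return item;
  }
}

// One consumer and one producer sharing a buffer of capacity 2.
async function demo(): Promise<number[]> {
  const buffer = new AsyncBoundedBuffer<number>(2);
  const received: number[] = [];
  const consumer = (async () => {
    for (let i = 0; i < 5; i++) received.push(await buffer.take());
  })();                                  // starts first, so it waits on notEmpty
  const producer = (async () => {
    for (let i = 1; i <= 5; i++) await buffer.put(i);
  })();
  await Promise.all([consumer, producer]);
  return received;
}

demo().then((received) => console.log(received.join(",")));
```

A woken waiter here resumes only on a later turn of the event loop, after the signaller has carried on, so the sketch behaves like Mesa signalling and needs the same while loops.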

The monitor idea and its hand-the-lock-over signalling were formalised by Tony Hoare in 1974 (the same Hoare of quicksort and, to his lasting regret, the null reference — "my billion-dollar mistake"). A few years later, engineers at Xerox PARC built monitors into their systems programming language Mesa and found Hoare's immediate hand-off expensive and awkward to implement, so they chose the "just make it runnable, let it re-check later" rule instead. Mesa's pragmatic choice won: it is what Java, C#, and pthreads all use today — which is precisely why "always loop on the condition" is drilled into every concurrency course. A language design decision from a 1970s photocopier company still shapes how you must write your wait() loops.

A condition variable has no memory. If you call signal() when no thread is currently in wait(), the signal simply evaporates, unlike a semaphore, whose count would remember it. So if checking and waiting were not done under one lock, this sequence could hang forever: a consumer checks the buffer and finds it empty; before it can go to sleep, the producer runs, adds an item and signals into the void; then the consumer goes to sleep, waiting for a signal that has already been sent. The cure is the very structure we've been building: hold the monitor lock while you both check the condition and wait, so no signal can slip through the gap between "I decided to wait" and "I am asleep." Because wait() releases the lock atomically, and the signaller holds that same lock while changing state and signalling, a wakeup can never be lost. This is exactly why wait() must be married to a lock; it's not an accident of the API.
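The "no memory" contrast with a semaphore can be shown with two tiny counting models run through the same schedule, in which the signal arrives before the wait. Both classes are hypothetical single-threaded models invented for this sketch: they count sleepers and wakeups and block nothing.

```typescript
// Hypothetical single-threaded models: they count sleepers, nothing blocks.
class MemorylessCondition {
  private waiting = 0;

  wait(): void {
    this.waiting += 1;
  }

  // Returns true if a waiter was woken, false if the signal evaporated.
  signal(): boolean {
    if (this.waiting === 0) return false;
    this.waiting -= 1;
    return true;
  }

  stillAsleep(): number {
    return this.waiting;
  }
}

class CountingSemaphore {
  private count = 0;
  private waiting = 0;

  signal(): void {
    if (this.waiting > 0) this.waiting -= 1;
    else this.count += 1;            // remembered for a later wait()
  }

  wait(): void {
    if (this.count > 0) this.count -= 1;
    else this.waiting += 1;
  }

  stillAsleep(): number {
    return this.waiting;
  }
}

// Same schedule for both: the signal arrives BEFORE the wait.
const cond = new MemorylessCondition();
const delivered = cond.signal();     // false: nobody was waiting
cond.wait();                         // sleeps with no signal left to wake it

const sem = new CountingSemaphore();
sem.signal();                        // count becomes 1
sem.wait();                          // uses up the count, does not sleep

console.log(delivered, cond.stillAsleep(), sem.stillAsleep());
```

The condition-variable model ends with one thread asleep and nothing left to wake it, while the semaphore model ends with none; holding the monitor lock across the check and the wait is what rules out the first outcome in practice.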