CPU Scheduling

Your laptop has, let's say, one CPU core, but right now it is "running" a browser, a music player, a clock, a file sync, and a dozen background helpers. They cannot all run at once — a single core can execute exactly one instruction stream at a time. So at any instant the operating system holds a crowd of processes that are all ready to run, and it must answer one small question, thousands of times a second:

Of all the jobs waiting, which one runs next?

The part of the OS that answers is the scheduler, and its answer is a scheduling algorithm. This one small decision, repeated endlessly, is what makes your machine feel snappy or sluggish, fair or unfair. This page is about how that decision is made.

The vocabulary of waiting

To compare schedulers we need to measure them, and to measure them we need four times. Picture each job as needing a burst of the CPU. For a single job:

Arrival time — the moment the job enters the ready queue and starts wanting the CPU.
Burst time — how long the job needs the CPU for, if it could run uninterrupted. (This is the "size" of the job.)
Completion time — the moment the job finally finishes.

From those three we derive the two numbers we actually care about:

Turnaround time = completion − arrival. The total time from arriving to finishing — running time plus all the time spent hanging around. T_{\text{turnaround}} = t_{\text{completion}} - t_{\text{arrival}}
Waiting time = turnaround − burst. Of that turnaround, how much was spent waiting rather than running. This is the time the job sat idle in the queue while other jobs used the CPU. T_{\text{wait}} = T_{\text{turnaround}} - t_{\text{burst}}

A good scheduler makes the average waiting time across all jobs small — nobody likes to wait. Different algorithms trade this against fairness and responsiveness in different ways. Let's meet them.

Four classic schedulers

FCFS — First-Come, First-Served. The simplest rule imaginable: run jobs in the order they arrive, each to completion, no interruptions. It is a plain queue, exactly like a line at a shop.
SJF — Shortest-Job-First. Of all the ready jobs, always run the one with the smallest burst time next. Get the quick jobs out of the way first.
Round-Robin (RR). Give each job a small fixed slice of CPU called a time quantum (say 4 ms). When its slice runs out, it goes to the back of the queue and the next job gets a turn. Everyone is served a little at a time, round and round.
Priority scheduling. Give each job a priority number; always run the highest priority ready job next. (SJF is really just priority scheduling where "priority" = "shortness".)

The first two run each job to completion once it starts (they are non-preemptive); the last two can preempt — snatch the CPU away from a running job to give it to someone else. To see what really separates them, we compute.

A worked example: FCFS vs SJF on the same jobs

Three jobs all arrive at time 0 (so arrival is 0 for everyone, and turnaround = completion). Their burst times:

Job Burst time ------------------ A 6 B 8 C 2

Order 1 — FCFS (run them A, B, C as they arrived)

Each job waits for everyone before it to finish. A runs first (waits 0), B waits for A (6), C waits for A and B (6 + 8 = 14):

Job Burst Start Completion Turnaround Waiting -------------------------------------------------------- A 6 0 6 6 0 B 8 6 14 14 6 C 2 14 16 16 14 -------------------------------------------------------- Average waiting = (0 + 6 + 14) / 3 = 20 / 3 ≈ 6.67 Average turnaround = (6 + 14 + 16) / 3 = 36 / 3 = 12.0

Poor little C only needed the CPU for 2 units, yet it waited 14 — stuck behind the two big jobs. That is the famous convoy effect: short jobs pile up behind a long one like cars stuck behind a slow truck.

Order 2 — SJF (run the shortest first: C, A, B)

Same jobs, same arrival, but now we serve them shortest-first — C (2), then A (6), then B (8):

Job Burst Start Completion Turnaround Waiting -------------------------------------------------------- C 2 0 2 2 0 A 6 2 8 8 2 B 8 8 16 16 8 -------------------------------------------------------- Average waiting = (0 + 2 + 8) / 3 = 10 / 3 ≈ 3.33 Average turnaround = (2 + 8 + 16) / 3 = 26 / 3 ≈ 8.67

Same jobs, same machine — but the average waiting time dropped from 6.67 to 3.33, cut in half, just by reordering. By clearing the small job first, everybody behind it starts sooner. This is no accident:

For a fixed set of jobs available at the same time, Shortest-Job-First gives the minimum possible average waiting time of any scheduler. No ordering does better.
The intuition: a job's burst adds to the wait of every job behind it, so you pay for a long job many times over. Doing short jobs first makes those multiplied delays as small as they can be.

Seeing it: a Gantt chart

A Gantt chart is the scheduler's timeline — a bar showing which job owns the CPU during each stretch of time, laid end to end. Here is the FCFS schedule above (A, then B, then C) drawn on a time axis. Step through it to watch each job take its turn:

Read left to right = forward in time. The width of each block is its burst time, and where a block starts is exactly when that job begins running — so the left edge of C sitting at time 14 is C's long, unfair wait, drawn to scale.

Compute it yourself

Here is the FCFS calculation as a tiny program. It takes a list of burst times (all arriving at time 0), lays them out in order, and prints each job's waiting and turnaround time plus the averages. Change the bursts array and press Run to see how the order and the sizes change the averages — try sorting them smallest-first to turn FCFS into SJF:

// FCFS scheduler: jobs all arrive at time 0, run in array order. const bursts: number[] = [6, 8, 2]; // burst time of each job const names: string[] = ["A", "B", "C"]; let clock = 0; // when the CPU becomes free let totalWait = 0; let totalTurn = 0; console.log("Job Burst Start Finish Turnaround Waiting"); console.log("-----------------------------------------------"); for (let i = 0; i < bursts.length; i++) { const start = clock; // this job waits until the CPU is free const finish = start + bursts[i]; // then runs to completion const waiting = start; // arrival is 0, so waiting = start time const turnaround = finish; // arrival is 0, so turnaround = finish time totalWait += waiting; totalTurn += turnaround; console.log( names[i].padEnd(5) + String(bursts[i]).padEnd(7) + String(start).padEnd(7) + String(finish).padEnd(8) + String(turnaround).padEnd(12) + String(waiting), ); clock = finish; // CPU is busy until this job finishes } const n = bursts.length; console.log("-----------------------------------------------"); console.log("Average waiting = " + (totalWait / n).toFixed(2)); console.log("Average turnaround = " + (totalTurn / n).toFixed(2));

Notice the two lines that do all the pedagogy: waiting = start and turnaround = finish. Because everything arrives at 0, a job's wait is simply how long the CPU was busy with the jobs ahead of it.

The trade-offs — why not always use SJF?

SJF wins on average waiting time, so why does no real OS use pure SJF? Because "best average" hides two problems, and the other algorithms fix them:

SJF risks starvation. If short jobs keep arriving, a long job may never reach the front — it is always shoved aside by something shorter. Its average was great; that one job's experience was catastrophic. (Real systems fix this with aging: a job's priority slowly rises the longer it waits.)
SJF needs to know the future. You cannot know a job's burst time before it runs. Real schedulers estimate it from recent history — an educated guess, not a certainty.
FCFS suffers the convoy effect. Dead simple and perfectly fair on arrival order, but one long job at the front makes everyone behind it crawl — terrible for responsiveness.
Round-Robin is fair and responsive. By slicing time into quanta, no job waits longer than one lap of the queue before getting a turn. Interactive jobs feel instant. The price is slightly worse average turnaround, because everyone is nibbled at rather than finished off.

There is no universal winner: batch systems that value throughput lean toward SJF; interactive systems that value responsiveness lean toward Round-Robin; and real operating systems blend several ideas (multi-level feedback queues) to get the best of each.

A very common slip: thinking that a job's waiting time includes its own burst. It does not. Waiting time counts only the time spent sitting in the queue not running — that's why we subtract the burst: T_{\text{wait}} = T_{\text{turnaround}} - t_{\text{burst}}. The very first job in FCFS (arriving at 0) has a waiting time of zero, even though it runs for its whole burst. It never waited for anyone; it just worked.

Round-Robin's fairness comes from its time quantum — so surely a smaller quantum is always better, giving everyone turns even sooner? No! Every time the CPU switches from one job to another it performs a context switch: saving one job's registers and loading the next's. That takes real time and does no useful work.

With a tiny quantum the CPU spends more of its life switching than computing — the overhead dominates and throughput collapses. With a huge quantum, Round-Robin just degenerates back into FCFS (each job finishes within its slice). The art is a quantum large enough that a typical burst finishes in a few slices, but small enough that interactive jobs still feel instant — usually a few milliseconds.