CPU Scheduling
Your laptop has, let's say, one CPU core, but right now it is "running" a browser, a music player, a
clock, a file sync, and a dozen background helpers. They cannot all run at once — a single
core can execute exactly one instruction stream at a time. So at any instant the
operating system holds a crowd of
processes
that are all ready to run, and it must answer one small question, thousands of times
a second:
Of all the jobs waiting, which one runs next?
The part of the OS that answers is the scheduler, and its answer is a
scheduling algorithm. This one small decision, repeated endlessly, is what makes
your machine feel snappy or sluggish, fair or unfair. This page is about how that decision is made.
The vocabulary of waiting
To compare schedulers we need to measure them, and to measure them we need four times. Picture each
job as needing a burst of the CPU. For a single job:
-
Arrival time — the moment the job enters the ready queue and starts wanting the
CPU.
-
Burst time — how long the job needs the CPU for, if it could run uninterrupted.
(This is the "size" of the job.)
-
Completion time — the moment the job finally finishes.
From those three we derive the two numbers we actually care about:
-
Turnaround time = completion − arrival. The total time from arriving to
finishing — running time plus all the time spent hanging around.
T_{\text{turnaround}} = t_{\text{completion}} - t_{\text{arrival}}
-
Waiting time = turnaround − burst. Of that turnaround, how much was spent
waiting rather than running. This is the time the job sat idle in the queue while other
jobs used the CPU.
T_{\text{wait}} = T_{\text{turnaround}} - t_{\text{burst}}
A good scheduler makes the average waiting time across all jobs small — nobody likes
to wait. Different algorithms trade this against fairness and responsiveness in different ways. Let's
meet them.
Four classic schedulers
-
FCFS — First-Come, First-Served. The simplest rule imaginable: run jobs in the
order they arrive, each to completion, no interruptions. It is a plain queue, exactly like a line
at a shop.
-
SJF — Shortest-Job-First. Of all the ready jobs, always run the one with the
smallest burst time next. Get the quick jobs out of the way first.
-
Round-Robin (RR). Give each job a small fixed slice of CPU called a time
quantum (say 4 ms). When its slice runs out, it goes to the back of the queue and the
next job gets a turn. Everyone is served a little at a time, round and round.
-
Priority scheduling. Give each job a priority number; always run the highest
priority ready job next. (SJF is really just priority scheduling where "priority" = "shortness".)
The first two run each job to completion once it starts (they are non-preemptive); the last
two can preempt — snatch the CPU away from a running job to give it to someone else. To see
what really separates them, we compute.
A worked example: FCFS vs SJF on the same jobs
Three jobs all arrive at time 0 (so arrival is 0
for everyone, and turnaround = completion). Their burst times:
Job Burst time
------------------
A 6
B 8
C 2
Order 1 — FCFS (run them A, B, C as they arrived)
Each job waits for everyone before it to finish. A runs first (waits 0), B waits for A (6), C waits
for A and B (6 + 8 = 14):
Job Burst Start Completion Turnaround Waiting
--------------------------------------------------------
A 6 0 6 6 0
B 8 6 14 14 6
C 2 14 16 16 14
--------------------------------------------------------
Average waiting = (0 + 6 + 14) / 3 = 20 / 3 ≈ 6.67
Average turnaround = (6 + 14 + 16) / 3 = 36 / 3 = 12.0
Poor little C only needed the CPU for 2 units, yet it waited 14 — stuck behind the two big jobs. That
is the famous convoy effect: short jobs pile up behind a long one like cars stuck
behind a slow truck.
Order 2 — SJF (run the shortest first: C, A, B)
Same jobs, same arrival, but now we serve them shortest-first — C (2), then A (6), then
B (8):
Job Burst Start Completion Turnaround Waiting
--------------------------------------------------------
C 2 0 2 2 0
A 6 2 8 8 2
B 8 8 16 16 8
--------------------------------------------------------
Average waiting = (0 + 2 + 8) / 3 = 10 / 3 ≈ 3.33
Average turnaround = (2 + 8 + 16) / 3 = 26 / 3 ≈ 8.67
Same jobs, same machine — but the average waiting time dropped from
6.67 to 3.33, cut in half, just by reordering.
By clearing the small job first, everybody behind it starts sooner. This is no accident:
-
For a fixed set of jobs available at the same time, Shortest-Job-First gives the minimum
possible average waiting time of any scheduler. No ordering does better.
-
The intuition: a job's burst adds to the wait of every job behind it, so you pay for a
long job many times over. Doing short jobs first makes those multiplied delays as small as they
can be.
Seeing it: a Gantt chart
A Gantt chart is the scheduler's timeline — a bar showing which job owns the CPU
during each stretch of time, laid end to end. Here is the FCFS schedule above (A, then B, then C)
drawn on a time axis. Step through it to watch each job take its turn:
Read left to right = forward in time. The width of each block is its burst time, and where a block
starts is exactly when that job begins running — so the left edge of C sitting at time 14 is
C's long, unfair wait, drawn to scale.
Compute it yourself
Here is the FCFS calculation as a tiny program. It takes a list of burst times (all arriving at time
0), lays them out in order, and prints each job's waiting and turnaround time plus the averages.
Change the bursts array and press Run to see how the order and the
sizes change the averages — try sorting them smallest-first to turn FCFS into SJF:
// FCFS scheduler: jobs all arrive at time 0, run in array order.
const bursts: number[] = [6, 8, 2]; // burst time of each job
const names: string[] = ["A", "B", "C"];
let clock = 0; // when the CPU becomes free
let totalWait = 0;
let totalTurn = 0;
console.log("Job Burst Start Finish Turnaround Waiting");
console.log("-----------------------------------------------");
for (let i = 0; i < bursts.length; i++) {
const start = clock; // this job waits until the CPU is free
const finish = start + bursts[i]; // then runs to completion
const waiting = start; // arrival is 0, so waiting = start time
const turnaround = finish; // arrival is 0, so turnaround = finish time
totalWait += waiting;
totalTurn += turnaround;
console.log(
names[i].padEnd(5) +
String(bursts[i]).padEnd(7) +
String(start).padEnd(7) +
String(finish).padEnd(8) +
String(turnaround).padEnd(12) +
String(waiting),
);
clock = finish; // CPU is busy until this job finishes
}
const n = bursts.length;
console.log("-----------------------------------------------");
console.log("Average waiting = " + (totalWait / n).toFixed(2));
console.log("Average turnaround = " + (totalTurn / n).toFixed(2));
Notice the two lines that do all the pedagogy: waiting = start and
turnaround = finish. Because everything arrives at 0, a job's wait is simply how long
the CPU was busy with the jobs ahead of it.
The trade-offs — why not always use SJF?
SJF wins on average waiting time, so why does no real OS use pure SJF? Because "best average" hides
two problems, and the other algorithms fix them:
-
SJF risks starvation. If short jobs keep arriving, a long job may never
reach the front — it is always shoved aside by something shorter. Its average was great; that one
job's experience was catastrophic. (Real systems fix this with aging: a job's priority
slowly rises the longer it waits.)
-
SJF needs to know the future. You cannot know a job's burst time before it runs.
Real schedulers estimate it from recent history — an educated guess, not a certainty.
-
FCFS suffers the convoy effect. Dead simple and perfectly fair on arrival order,
but one long job at the front makes everyone behind it crawl — terrible for responsiveness.
-
Round-Robin is fair and responsive. By slicing time into quanta, no job waits
longer than one lap of the queue before getting a turn. Interactive jobs feel instant. The price is
slightly worse average turnaround, because everyone is nibbled at rather than finished off.
There is no universal winner: batch systems that value throughput lean toward SJF; interactive
systems that value responsiveness lean toward Round-Robin; and real operating systems blend several
ideas (multi-level feedback queues) to get the best of each.
A very common slip: thinking that a job's waiting time includes its own burst. It
does not. Waiting time counts only the time spent sitting in the queue not running —
that's why we subtract the burst: T_{\text{wait}} = T_{\text{turnaround}} -
t_{\text{burst}}. The very first job in FCFS (arriving at 0) has a waiting time of
zero, even though it runs for its whole burst. It never waited for anyone; it just
worked.
Round-Robin's fairness comes from its time quantum — so surely a smaller quantum is always
better, giving everyone turns even sooner? No! Every time the CPU switches from one job to another
it performs a context switch: saving one job's registers and loading the next's.
That takes real time and does no useful work.
With a tiny quantum the CPU spends more of its life switching than computing — the
overhead dominates and throughput collapses. With a huge quantum, Round-Robin just degenerates back
into FCFS (each job finishes within its slice). The art is a quantum large enough that a typical
burst finishes in a few slices, but small enough that interactive jobs still feel instant — usually
a few milliseconds.