Virtual Memory and Paging

You just bought a game that needs 32 GB of memory. Your laptop has 16 GB of RAM. You click play — and it runs. How? The game thinks it owns a vast, private, continuous stretch of memory addresses from 0 upward. That stretch is a comfortable fiction called the virtual address space, and keeping the fiction convincing is the job of virtual memory.

The trick that makes it work is paging. Instead of handing each program one giant contiguous block of real memory (which quickly leaves the free space in useless slivers — the external fragmentation problem), the operating system chops both the virtual address space and physical RAM into small, fixed-size blocks. A block of virtual space is a page; a block of physical RAM is a frame. Any page can live in any frame — and the pages you are not using right now don't even need to be in RAM. They wait, patiently, on the disk.

Pages and frames: same size, anywhere they like

Pick a page size — a very common choice is 4 KB (that's 4096 = 2^{12} bytes). Now:

the process's virtual address space is a numbered list of pages: page 0, page 1, page 2, …;
the machine's physical RAM is a numbered list of frames: frame 0, frame 1, …, each exactly one page big;
a page table records, for each page, which frame it currently sits in (or that it's out on disk).

Because every page and every frame is the same fixed size, a page slots into any free frame with no gaps left over. That is the death of external fragmentation: free RAM is just a pile of interchangeable frames, and there is always room if any frame is free at all.

Frames can end up thoroughly scattered: say page 0 landed in frame 3, page 1 in frame 0, and page 2 in frame 5. The program has no idea and no reason to care: it sees a tidy run of virtual pages, and the page table quietly untangles the scramble underneath.
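To make the page table concrete, here is a minimal sketch of one as a plain array indexed by page number. The first three entries use the page-to-frame pairs mentioned above; the fourth entry (a page out on disk), the type name PageTableEntry and the helper frameOf are assumptions made up for illustration, not how any particular operating system lays out its tables.

```typescript
// A page table entry is either resident in some frame or out on disk.
type PageTableEntry =
  | { present: true; frame: number }
  | { present: false };

// Hypothetical page table for a tiny process; the array index is the page number.
const pageTable: PageTableEntry[] = [
  { present: true, frame: 3 }, // page 0 -> frame 3
  { present: true, frame: 0 }, // page 1 -> frame 0
  { present: true, frame: 5 }, // page 2 -> frame 5
  { present: false },          // page 3 is on disk (illustrative assumption)
];

// Return the frame holding a page, or null if the page is not resident
// (or is outside this small table altogether).
function frameOf(page: number): number | null {
  const entry = pageTable[page];
  if (entry === undefined) return null;
  return entry.present ? entry.frame : null;
}

for (let page = 0; page < pageTable.length; page++) {
  const frame = frameOf(page);
  console.log("page " + page + " -> " + (frame === null ? "on disk" : "frame " + frame));
}
```

The program only ever names pages 0, 1, 2, 3 in order; the scattered frame numbers stay hidden behind the lookup.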

Address translation: split, look up, recombine

Every memory access your program makes uses a virtual address, and the hardware must translate it to a physical address before touching RAM. With fixed-size pages this is beautifully mechanical. Split the virtual address into two parts:

\text{virtual address} \;=\; \underbrace{\text{page number}}_{\text{high bits}} \;\Vert\; \underbrace{\text{offset}}_{\text{low bits}}

With a 4 KB page the low 12 bits are the offset (which byte within the page, 0 to 4095), and everything above them is the page number. So for a virtual address a:

\text{page} = \left\lfloor \tfrac{a}{4096} \right\rfloor = a \gg 12, \qquad \text{offset} = a \bmod 4096 = a\ \&\ \text{0xFFF}.

Worked example. Say the program reads virtual address a = 20{,}500.

Page number: \lfloor 20500 / 4096 \rfloor = 5 (since 5 \times 4096 = 20480).
Offset: 20500 - 20480 = 20, so the target is the byte at offset 20 inside the page (the 21st byte, since offsets start at 0).
Look up page 5 in the page table; suppose it maps to frame 9.
Physical address: 9 \times 4096 + 20 = 36{,}884. Done.

The offset never changes — a byte 20 places into page 5 is still 20 places into frame 9. Translation only ever rewrites the page number part into a frame number; the offset rides along untouched.
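The split, look up, recombine recipe can be sketched in a few lines. This is an illustration under stated assumptions: the names PAGE_SIZE, pageToFrame and translate are invented here, the page table holds only the single page 5 to frame 9 mapping from the worked example, and addresses are assumed to stay below 2^32 because JavaScript's bitwise operators work on 32-bit integers.

```typescript
const PAGE_SIZE = 4096;            // 4 KB pages
const OFFSET_BITS = 12;            // log2(4096)
const OFFSET_MASK = PAGE_SIZE - 1; // 0xFFF, selects the low 12 bits

// Hypothetical page table: only the mapping from the worked example.
const pageToFrame = new Map<number, number>([[5, 9]]);

function translate(virtualAddress: number): number {
  const page = virtualAddress >>> OFFSET_BITS;  // high bits: page number
  const offset = virtualAddress & OFFSET_MASK;  // low bits: offset within the page
  const frame = pageToFrame.get(page);
  if (frame === undefined) {
    throw new Error("page " + page + " is not mapped in this sketch");
  }
  // Recombine: the frame number replaces the page number, the offset is unchanged.
  return frame * PAGE_SIZE + offset;
}

console.log(translate(20500)); // 36884
```

Multiplying by PAGE_SIZE rather than shifting left keeps the result correct even when the physical address no longer fits in 32 bits.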

The TLB: a cache so we don't look up every time

There's a catch. If every memory access needs a page-table lookup, and the page table itself lives in RAM, then each access to memory has secretly become two accesses to memory — one to read the page table, one to read the data. That would roughly halve speed.

The fix is a tiny, blisteringly fast hardware cache called the Translation Lookaside Buffer (TLB). It remembers the last handful of page → frame translations. On each access:

TLB hit — the page's frame is already in the TLB. Translation is essentially free; go straight to RAM.
TLB miss — not cached. Now we do the full page-table walk to find the frame, and drop the result into the TLB so next time it's a hit.

Because real programs touch the same few pages over and over (this clustering is called locality), the TLB hit rate is typically well over 95%, and the two-lookup penalty almost never actually happens.
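A toy model shows why locality makes the TLB so effective. Everything here is an assumption for illustration: the capacity of 4 entries, the oldest-entry eviction rule, the stand-in walkPageTable function and its made-up mapping. Real TLBs are hardware structures with their own sizes and replacement schemes.

```typescript
const TLB_CAPACITY = 4; // hypothetical; chosen small so the sketch is easy to follow

// Stand-in for the slow page-table walk: pretend page n lives in frame n + 100.
function walkPageTable(page: number): number {
  return page + 100;
}

const tlb = new Map<number, number>(); // page -> frame, kept in insertion order
let hits = 0;
let misses = 0;

function lookup(page: number): number {
  const cached = tlb.get(page);
  if (cached !== undefined) {
    hits++;               // TLB hit: no page-table walk needed
    return cached;
  }
  misses++;               // TLB miss: do the walk, then cache the result
  const frame = walkPageTable(page);
  if (tlb.size >= TLB_CAPACITY) {
    // Evict the oldest-inserted entry (an assumption of this sketch).
    const oldest = tlb.keys().next().value;
    if (oldest !== undefined) tlb.delete(oldest);
  }
  tlb.set(page, frame);
  return frame;
}

// A loop that keeps touching the same three pages, as real programs tend to.
for (let i = 0; i < 100; i++) {
  lookup(i % 3);
}
console.log("hits=" + hits + " misses=" + misses); // hits=97 misses=3
```

Only the first touch of each page pays for a walk; the other 97 accesses are served from the cache.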

Is a TLB miss the same thing as a page fault? No, and mixing these up is the classic beginner slip. They are two different caches failing at two different levels:

a TLB miss just means the translation wasn't cached. The page may be perfectly resident in RAM — we simply have to walk the page table to find its frame. Cost: a few extra memory reads.
a page fault means the page isn't in RAM at all: it's out on disk. Cost: a disk fetch, which is roughly thousands of times slower than a RAM access on an SSD and far worse on a spinning disk.

A page fault is only discovered once the lookup finds the page marked absent, whereas a TLB miss on its own is cheap and routine. Don't conflate them.

Page faults: when the page isn't there

Here is the payoff for the 32 GB-on-16 GB machine. A page table entry can be marked not present — the page exists in the program's virtual space but currently lives on disk, not in any frame. When the program touches such a page, the hardware raises a page fault, and the operating system steps in:

pause the program (it doesn't even notice);
find a free frame — or, if RAM is full, evict some other page to make room;
read the wanted page in from disk into that frame;
update the page table entry to present, pointing at the frame;
resume the program at the exact instruction that faulted, which now succeeds.

This is demand paging: a page is only ever brought into RAM the moment it's actually needed. A program far larger than physical memory runs fine as long as the pages it's using right now fit — the rest doze on disk. RAM becomes a cache for the disk.
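The fault-handling steps above can be sketched as a small simulation. It is deliberately simplified and rests on assumptions: an 8-page address space, a pool of 4 free frames, no real disk read, and no eviction (the sketch just throws when frames run out, since choosing a victim is the subject of the next section). The names Entry, handlePageFault and access are invented for illustration.

```typescript
interface Entry {
  present: boolean;
  frame: number; // meaningful only when present is true
}

const NUM_PAGES = 8;                       // hypothetical virtual address space
const freeFrames: number[] = [0, 1, 2, 3]; // hypothetical pool of free frames
const table: Entry[] = [];
for (let p = 0; p < NUM_PAGES; p++) {
  table.push({ present: false, frame: -1 }); // everything starts out on disk
}

let pageFaults = 0;

// Steps 2 to 4 of the handler: find a frame, load the page, update the table.
function handlePageFault(page: number): Entry {
  pageFaults++;
  const frame = freeFrames.shift(); // step 2: take a free frame
  if (frame === undefined) {
    throw new Error("RAM is full: a replacement policy would pick a victim here");
  }
  // Step 3 would read the page from disk into the frame; omitted in this sketch.
  const entry: Entry = { present: true, frame }; // step 4: mark present
  table[page] = entry;
  return entry;
}

// One memory access: returns the frame that ends up holding the page.
function access(page: number): number {
  let entry = table[page];
  if (entry === undefined) {
    throw new Error("page " + page + " is outside the address space");
  }
  if (!entry.present) {
    entry = handlePageFault(page); // steps 1 and 5: pause, fix up, then retry
  }
  return entry.frame;
}

for (const page of [2, 5, 2, 6, 5]) {
  console.log("access page " + page + " -> frame " + access(page));
}
console.log("page faults: " + pageFaults); // page faults: 3
```

Pages 2, 5 and 6 each fault once on first touch; the repeat accesses find them already resident and cost nothing extra.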

Which page do we evict? FIFO vs LRU

When RAM is full and a new page must come in, the OS must throw one resident page out. The rule it uses is a page-replacement policy, and the goal is to minimise future page faults. Two classic policies:

FIFO (First-In, First-Out) — evict whichever page has been in RAM the longest, regardless of whether it's still useful. Simple, but can throw out a heavily-used page just because it arrived early.
LRU (Least Recently Used) — evict the page that has gone unused for the longest. This bets on locality (recently used pages will be used again soon) and usually faults less than FIFO, but it costs more to track.

Let's count faults for real. Below is a FIFO simulator over a fixed reference string (the sequence of pages a program touches) with a small number of frames. It prints HIT or FAULT for every access, showing which page gets evicted, and the total fault count at the end. Change the reference string or the frame count and run it again:

// FIFO page replacement over a reference string.
const refString: number[] = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1];
const numFrames = 3; // how many frames RAM has for this process

const frames: number[] = [];   // pages currently resident in RAM
const arrivals: number[] = []; // FIFO order in which resident pages arrived
let faults = 0;

for (const page of refString) {
  if (frames.includes(page)) {
    console.log("access " + page + " -> HIT frames=[" + frames.join(", ") + "]");
    continue; // already resident, nothing to do
  }

  faults++; // not resident: it's a page fault

  if (frames.length < numFrames) {
    // there is a free frame: just load the page
    frames.push(page);
    arrivals.push(page);
    console.log("access " + page + " -> FAULT load (free frame) frames=[" + frames.join(", ") + "]");
  } else {
    // RAM is full: evict the oldest-arrived page (FIFO)
    const victim = arrivals.shift() as number;
    frames[frames.indexOf(victim)] = page;
    arrivals.push(page);
    console.log("access " + page + " -> FAULT evict " + victim + " frames=[" + frames.join(", ") + "]");
  }
}

console.log("");
console.log("Total page faults: " + faults + " out of " + refString.length + " accesses");

Try dropping numFrames to 2, or setting it to 4. More frames should mean fewer faults. Should. Read on before you bet on it.

You'd assume giving a process extra frames can only help — surely more RAM never hurts. Astonishingly, under FIFO it sometimes does. There exist reference strings where increasing the number of frames increases the number of page faults. This counter-intuitive result is called Bélády's anomaly.

A famous witness is the reference string 1,2,3,4,1,2,5,1,2,3,4,5: with FIFO it causes 9 faults on 3 frames but 10 faults on 4 frames. Plug it into the simulator above (with numFrames = 3 then 4) and watch it happen.

The lesson: FIFO doesn't track usefulness, only age, so more frames can reshuffle evictions for the worse. LRU (and other "stack" policies) provably never suffer this — for them, more frames can only help.
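To see the contrast side by side, here is a compact sketch that counts faults under both policies on the anomaly string. The function countFaults and the Policy type are illustrative names, not part of the simulator above, and the single-array bookkeeping is a simplification that assumes numFrames is at least 1. Tracking LRU by reordering an array is fine for a demo but is not how real hardware or kernels approximate it.

```typescript
type Policy = "FIFO" | "LRU";

function countFaults(refs: number[], numFrames: number, policy: Policy): number {
  // Resident pages, ordered so that index 0 is always the next victim:
  // the oldest arrival under FIFO, the least recently used page under LRU.
  const resident: number[] = [];
  let faults = 0;

  for (const page of refs) {
    const index = resident.indexOf(page);
    if (index !== -1) {
      if (policy === "LRU") {
        // A hit refreshes recency: move the page to the back of the queue.
        resident.splice(index, 1);
        resident.push(page);
      }
      continue; // FIFO ignores hits entirely
    }
    faults++;
    if (resident.length >= numFrames) {
      resident.shift(); // evict the page at the front
    }
    resident.push(page);
  }
  return faults;
}

const beladyString = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5];
for (const policy of ["FIFO", "LRU"] as Policy[]) {
  for (const n of [3, 4]) {
    console.log(policy + " with " + n + " frames: " + countFaults(beladyString, n, policy) + " faults");
  }
}
// FIFO with 3 frames: 9 faults
// FIFO with 4 frames: 10 faults
// LRU with 3 frames: 10 faults
// LRU with 4 frames: 8 faults
```

On this particular string LRU actually faults more than FIFO with 3 frames, a reminder that LRU "usually" wins rather than always. What it guarantees is the direction: going from 3 to 4 frames lowers its count, while FIFO's goes up.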

Putting it together

A process's virtual address space is split into fixed-size pages; RAM is split into equal-size frames. A page table maps each page to its frame.
Translating an address splits it into page number + offset; the page number is looked up to a frame number, the offset is left untouched. The TLB caches recent translations to keep this fast.
Pages not needed right now live on disk; touching one triggers a page fault that loads it in — so a process can be larger than physical RAM. When RAM is full, a replacement policy (FIFO, LRU, …) chooses what to evict.

Page size is a trade-off. Tiny pages waste almost no space at the end of the last page (little internal fragmentation) but need an enormous page table with an entry per page — and more page faults to stream a program in. Huge pages shrink the page table and let one fault load a lot, but every allocation rounds up to a whole page, wasting memory inside it. Around 4 KB has been the sweet spot for decades, though modern systems also offer optional 2 MB and 1 GB "huge pages" for memory-hungry programs.
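A little arithmetic makes the trade-off tangible. The sketch below assumes a flat, single-level page table covering a hypothetical 4 GB address space, and a hypothetical allocation one byte larger than 10 MB; real systems use multi-level tables, so treat the entry counts as an upper-bound illustration. The function names are invented for this example.

```typescript
// Entries needed by a flat page table that covers the whole address space.
function pageTableEntries(addressSpaceBytes: number, pageSizeBytes: number): number {
  return Math.ceil(addressSpaceBytes / pageSizeBytes);
}

// Bytes wasted in the last page of an allocation (internal fragmentation).
function internalFragmentation(allocationBytes: number, pageSizeBytes: number): number {
  const remainder = allocationBytes % pageSizeBytes;
  return remainder === 0 ? 0 : pageSizeBytes - remainder;
}

const KB = 1024;
const MB = 1024 * KB;
const GB = 1024 * MB;

const addressSpace = 4 * GB;    // hypothetical address space
const allocation = 10 * MB + 1; // hypothetical allocation, one byte over 10 MB

for (const pageSize of [4 * KB, 2 * MB, 1 * GB]) {
  console.log(
    "page size " + pageSize + " bytes: " +
    pageTableEntries(addressSpace, pageSize) + " entries, " +
    internalFragmentation(allocation, pageSize) + " bytes wasted"
  );
}
// page size 4096 bytes: 1048576 entries, 4095 bytes wasted
// page size 2097152 bytes: 2048 entries, 2097151 bytes wasted
// page size 1073741824 bytes: 4 entries, 1063256063 bytes wasted
```

Going from 4 KB to 2 MB pages cuts the table by a factor of 512, but the unlucky extra byte now costs almost 2 MB instead of almost 4 KB.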