Right now your
This is the central problem of input/output. The CPU is a sprinter; the devices it talks to move like glaciers. If the fast thing has to wait around for the slow thing, you have thrown away billions of instructions' worth of work. The whole art of I/O is: how does a fast CPU deal with slow devices without wasting all its time waiting? That one question is what this page is about, and the answer — interrupts — is one of the most important ideas in how a computer actually works.
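The CPU never wires straight into a keyboard or a disk platter. Between them sits a device controller, a little piece of dedicated hardware that speaks the device's language on one side and the CPU's language on the other. The controller exposes a handful of device registers: tiny storage slots that the CPU can read and write to give commands and check on progress. Typically there are three kinds: a command (or control) register that the CPU writes to tell the device what to do, a status register that the CPU reads to check on progress, and a data register through which the bytes themselves pass.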
How does the CPU reach those registers? On almost all modern machines the trick is memory-mapped I/O: each device register is assigned an ordinary memory address. To send a command, the CPU just does a normal store to that address; to read the status, a normal load. The device controller watches the address bus and answers when its addresses appear. So talking to a printer looks, in the instruction stream, exactly like writing to memory — no special I/O instructions needed.
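To make the address-decoding idea concrete, here is a small Python simulation rather than real hardware access. The address map, the `Bus` and `PrinterController` classes, and the register layout are all assumptions invented for this sketch; the point is only that a store to a device address and a store to RAM go through the same operation.

```python
# Hypothetical address map, chosen only for this sketch.
RAM_SIZE = 0x1000          # addresses 0x0000-0x0FFF are ordinary RAM
PRINTER_BASE = 0x8000      # addresses 0x8000-0x8002 belong to the printer
REG_COMMAND, REG_STATUS, REG_DATA = 0, 1, 2


class PrinterController:
    """Toy device controller with three registers."""

    def __init__(self):
        self.registers = [0, 1, 0]   # status starts at 1, meaning "ready"
        self.printed = []

    def read(self, offset):
        return self.registers[offset]

    def write(self, offset, value):
        self.registers[offset] = value
        if offset == REG_COMMAND and value == 1:   # 1 = "print the data register"
            self.printed.append(chr(self.registers[REG_DATA]))


class Bus:
    """Routes every load and store by address, as an address decoder would."""

    def __init__(self):
        self.ram = [0] * RAM_SIZE
        self.printer = PrinterController()

    def store(self, address, value):
        if PRINTER_BASE <= address < PRINTER_BASE + 3:
            self.printer.write(address - PRINTER_BASE, value)
        else:
            self.ram[address] = value

    def load(self, address):
        if PRINTER_BASE <= address < PRINTER_BASE + 3:
            return self.printer.read(address - PRINTER_BASE)
        return self.ram[address]


bus = Bus()
bus.store(0x0010, 42)                           # ordinary memory write
bus.store(PRINTER_BASE + REG_DATA, ord("A"))    # same operation, but a device answers
bus.store(PRINTER_BASE + REG_COMMAND, 1)        # issue the "print" command
print(bus.load(0x0010), bus.printer.printed)    # 42 ['A']
```

The calling code never uses a special I/O operation: which component answers depends only on the address.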
Suppose the CPU has asked the disk for a block of data. The data will arrive — but not for a while. There are two fundamentally different strategies for finding out when it is ready.
Polling (busy-waiting). The CPU sits in a tight loop, reading the status register over and over, asking "ready yet? ready yet? ready yet?" until the flag flips:
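A minimal sketch of such a loop in Python, run against a simulated device rather than real hardware. `SlowDevice`, `read_status`, and the figure of 100,000 reads are hypothetical, chosen only to make the wasted checks countable.

```python
class SlowDevice:
    """Toy device whose ready flag flips after a fixed number of status reads."""

    def __init__(self, reads_until_ready):
        self.remaining = reads_until_ready
        self.data = 0x5A

    def read_status(self):
        # Each status read stands in for one trip through the polling loop.
        if self.remaining > 0:
            self.remaining -= 1
            return 0          # not ready
        return 1              # ready


def poll_for_data(device):
    wasted_checks = 0
    while device.read_status() == 0:   # "ready yet?"
        wasted_checks += 1             # nothing useful happens in here
    return device.data, wasted_checks


data, wasted = poll_for_data(SlowDevice(reads_until_ready=100_000))
print(hex(data), wasted)   # 0x5a 100000
```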
It works, and it is simple, but look at the cost: while the device is idle the CPU burns every single cycle asking a question whose answer is "no". Millions of instructions spent achieving nothing. It is like standing at the toaster refusing to do anything else until the toast pops.
Interrupt-driven I/O. Instead, the CPU says "start the read, and tap me on the shoulder when you're done" — then goes off and runs other programs. When the data is ready, the device raises an interrupt: an electrical signal (an IRQ, interrupt request) that yanks the CPU's attention away, just long enough to deal with the device, after which the CPU carries on exactly where it left off. The toast now shouts "I'm done!" and you spend the waiting time doing the dishes.
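The same idea as a toy Python simulation, assuming a device that calls back after a fixed number of time steps. `InterruptingDevice`, `tick`, and `on_interrupt` are hypothetical names for this sketch; a Python callback only stands in for a hardware interrupt line.

```python
class InterruptingDevice:
    """Toy device that raises an interrupt after a fixed number of ticks."""

    def __init__(self, ticks_until_ready, on_interrupt):
        self.remaining = ticks_until_ready
        self.on_interrupt = on_interrupt
        self.data = 0x5A

    def tick(self):
        # Called once per simulated time step; the device works on its own.
        if self.remaining > 0:
            self.remaining -= 1
            if self.remaining == 0:
                self.on_interrupt(self.data)   # the "tap on the shoulder"


received = []
device = InterruptingDevice(ticks_until_ready=100_000, on_interrupt=received.append)

useful_work = 0
while not received:
    useful_work += 1      # the CPU runs other programs instead of asking
    device.tick()

print(hex(received[0]), useful_work)   # 0x5a 100000
```

Every loop iteration here counts as useful work, because the CPU never reads the status register while it waits.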
The efficiency argument is decisive. If a device takes
A very natural mistake: "interrupts must be slower than polling — there's all that saving and restoring and jumping around." Per event, an interrupt genuinely does cost more than a single status check. But that misses the point entirely. Polling doesn't do one check — it does millions, back to back, the whole time the device is idle, and every one of them is wasted. The interrupt pays its overhead once and leaves the CPU free the rest of the time.
The honest exception: if a device is almost always ready instantly (very fast, very high throughput), the interrupt overhead can dominate, and a short spin of polling wins. Real systems sometimes do both — spin briefly, then fall back to interrupts. But for the slow devices that dominate everyday computing, interrupt-driven I/O is the clear winner.
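A back-of-the-envelope version of this trade-off, written as a short Python calculation. All four constants are assumed round numbers for illustration, not measurements of any real machine.

```python
CLOCK_HZ = 1_000_000_000            # assumed 1 GHz CPU
DEVICE_LATENCY_US = 10_000          # assumed 10 ms wait for a slow device
INTERRUPT_OVERHEAD_CYCLES = 1_000   # assumed cost of save + dispatch + restore
FAST_DEVICE_WAIT_CYCLES = 200       # assumed wait for a very fast device


def cycles_lost(wait_cycles):
    """Return (polling, interrupt) CPU cycles lost while waiting once."""
    polling = wait_cycles                    # the CPU spins for the whole wait
    interrupt = INTERRUPT_OVERHEAD_CYCLES    # overhead is paid once per event
    return polling, interrupt


def cheaper_strategy(wait_cycles):
    polling, interrupt = cycles_lost(wait_cycles)
    return "polling" if polling < interrupt else "interrupt"


slow_wait = CLOCK_HZ * DEVICE_LATENCY_US // 1_000_000   # 10,000,000 cycles
polling_cost, interrupt_cost = cycles_lost(slow_wait)

print(polling_cost, interrupt_cost)                # 10000000 1000
print(cheaper_strategy(slow_wait))                 # interrupt
print(cheaper_strategy(FAST_DEVICE_WAIT_CYCLES))   # polling
```

Under these assumptions the slow device costs ten thousand times more cycles with polling, while the fast device finishes before the interrupt overhead would have been repaid.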
When an interrupt fires, the CPU performs a precise little dance. The key insight is that the interrupt arrives at an unpredictable moment, in the middle of some other program, so the hardware must be able to duck out, handle the device, and return so seamlessly that the interrupted program never even notices.
The heart of it is save context → run the handler → restore context. The context is the CPU's registers and program counter — the exact "where was I". Because it is saved and restored perfectly, the interrupted program resumes as if nothing happened, blissfully unaware it was ever paused. The code that actually deals with the device is the Interrupt Service Routine (ISR), also called the interrupt handler.
How does the CPU know which handler to run? Each interrupt type has a number, and the CPU looks that number up in the interrupt vector table — an array in memory holding the address of each device's ISR. Keyboard interrupt? Look up entry, say, 1; find the address of the keyboard handler; jump there. It is exactly a lookup table from "which device shouted" to "which code handles it".
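The save, look up, run, restore sequence can be sketched as a toy Python model. `ToyCPU`, the two handlers, and the interrupt numbers are hypothetical; Python function references stand in for the handler addresses a real vector table would hold.

```python
KEYBOARD_IRQ = 1      # hypothetical interrupt numbers for this sketch
DISK_IRQ = 2


def keyboard_isr(cpu):
    cpu.registers["r0"] = 0xFF          # a handler may freely clobber registers
    cpu.log.append("keyboard handled")


def disk_isr(cpu):
    cpu.registers["r1"] = 0xEE
    cpu.log.append("disk handled")


class ToyCPU:
    def __init__(self):
        self.pc = 0
        self.registers = {"r0": 0, "r1": 0}
        self.vector_table = [None] * 8   # entry n holds the ISR for interrupt n
        self.log = []

    def interrupt(self, irq):
        saved_pc, saved_regs = self.pc, dict(self.registers)   # save context
        handler = self.vector_table[irq]                       # vector lookup
        handler(self)                                          # run the ISR
        self.pc, self.registers = saved_pc, saved_regs         # restore context


cpu = ToyCPU()
cpu.vector_table[KEYBOARD_IRQ] = keyboard_isr
cpu.vector_table[DISK_IRQ] = disk_isr

cpu.pc, cpu.registers["r0"] = 0x1234, 7   # state of the interrupted program
cpu.interrupt(KEYBOARD_IRQ)
print(hex(cpu.pc), cpu.registers["r0"], cpu.log)   # 0x1234 7 ['keyboard handled']
```

Even though the handler overwrote `r0`, the interrupted program sees its original value afterwards, which is the "never even notices" property.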
Interrupts solve the waiting problem, but there's a second problem hiding underneath. So far, whenever a byte is ready, the CPU itself copies it from the device register into memory. That's called programmed I/O, and for a single keystroke it's fine. But imagine reading a 4 KB disk block, one word at a time: the device signals, the CPU copies a word, the device signals, the CPU copies a word… thousands of times. Even with interrupts, the CPU is now a glorified bucket brigade, doing nothing but shovelling bytes between two places. That's a colossal waste of a fast processor.
Direct Memory Access (DMA) is the fix. There is a dedicated DMA controller whose entire job is to move blocks of data between a device and memory on its own. The CPU sets up the transfer once: "read 4 KB from the disk into memory starting at address
Count the interrupts. Copying
The name "Direct Memory Access" trips people up. It does not mean the CPU reaches into memory more directly, and it does not mean the CPU does the copy. It's the opposite: the DMA controller — a separate device — accesses memory directly, instead of the CPU. The whole point is to take the copying job off the CPU. During a DMA transfer the CPU can run ordinary program code; it is notified just once, at the end.
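To compare the CPU's workload in the two schemes, here is a small Python count. The 4-byte word size is an assumption (the text only specifies the 4 KB block), and both functions are simplified models rather than driver code.

```python
BLOCK_BYTES = 4 * 1024     # the 4 KB block from the text
WORD_BYTES = 4             # assumed 32-bit words
block = list(range(BLOCK_BYTES // WORD_BYTES))   # stand-in for the disk data


def programmed_io(source):
    """The CPU is interrupted for, and copies, every single word."""
    memory, interrupts, cpu_copies = [], 0, 0
    for word in source:
        interrupts += 1          # device signals that one word is ready
        memory.append(word)      # the CPU performs the copy itself
        cpu_copies += 1
    return memory, interrupts, cpu_copies


def dma_transfer(source):
    """The DMA controller moves the block; the CPU hears about it once."""
    memory = list(source)        # copying done by the controller, not the CPU
    return memory, 1, 0          # one completion interrupt, zero CPU copies


pio_memory, pio_interrupts, pio_copies = programmed_io(block)
dma_memory, dma_interrupts, dma_copies = dma_transfer(block)

print(pio_interrupts, pio_copies)   # 1024 1024
print(dma_interrupts, dma_copies)   # 1 0
```

Both paths leave identical data in memory; only the amount of CPU involvement differs.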
The CPU and the DMA controller share the same memory bus, so they can't both use it at the exact same instant. The DMA controller "steals" bus cycles when it needs them, a technique literally called cycle stealing. Each steal delays the CPU by a whisker, but the CPU keeps running between steals. The net effect is still an enormous win: a tiny, occasional slowdown instead of the CPU being fully occupied copying every single word by hand.
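A rough Python model of cycle stealing, assuming the DMA controller takes the bus on one cycle in every eight and moves one word per stolen cycle. Both figures are invented for illustration; real ratios depend on the bus and the device.

```python
def run_with_cycle_stealing(words_to_move, steal_every):
    """Count bus cycles used by the CPU and by the DMA controller."""
    cpu_cycles = stolen_cycles = cycle = 0
    while stolen_cycles < words_to_move:
        cycle += 1
        if cycle % steal_every == 0:
            stolen_cycles += 1     # DMA controller owns the bus this cycle
        else:
            cpu_cycles += 1        # CPU keeps running its own program
    return cpu_cycles, stolen_cycles


cpu_cycles, stolen_cycles = run_with_cycle_stealing(words_to_move=1024, steal_every=8)
cpu_share = cpu_cycles / (cpu_cycles + stolen_cycles)

print(cpu_cycles, stolen_cycles, cpu_share)   # 7168 1024 0.875
```

In this model the CPU keeps 87.5% of the bus cycles during the transfer, whereas copying every word itself would leave it none for other work.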
Here is the whole story in one breath. A device is slow, so the CPU doesn't wait for it (interrupts instead of polling). A transfer is big, so the CPU doesn't copy it (DMA instead of programmed I/O). In both cases the design principle is identical: don't make the fast, expensive CPU wait on or babysit the slow, cheap device. Set the work in motion, walk away, and get one tap on the shoulder when it's done.
A modern disk read is all three ideas at once: the CPU issues the command through memory-mapped registers, a DMA controller streams the block straight into RAM, and a single interrupt signals completion — all while the CPU happily runs other programs.