Addressing Modes

A machine-code instruction is tiny: usually just an opcode (what to do — load, add, store) and an operand (what to do it to). But here is the subtle part that trips up almost everyone the first time: the same operand can mean completely different things. The number written in the operand field might be the value you want, or it might be a signpost pointing at where the value lives — or a signpost pointing at another signpost.

The rule that tells the CPU how to interpret the operand is called the addressing mode. It is baked into the instruction itself, so when the CPU decodes the instruction it knows exactly which interpretation to use. Choosing the right addressing mode is one of the things that makes assembly programs flexible, compact, and — as we'll see with arrays — fast.

A-level specifications cover these modes to different depths: OCR lists four (immediate, direct, indirect, and indexed), while AQA concentrates on immediate and direct. We'll take a single operand, the number 20, and watch it resolve to four different values, purely because we changed the mode.

The same operand, four meanings

Picture a small slice of main memory: addresses 20 to 25, each holding a value. Three of those cells matter here: address 20 holds 25, address 23 holds 51, and address 25 holds 99. The CPU also has an index register holding 3. Now take one instruction whose operand field contains 20, and step through the four addressing modes. The operand 20 resolves to a different result each time (20, then 25, then 99, then 51) even though the bits in the operand field never change.

The punchline: the operand field is identical in all four cases. What differs is how many times the CPU follows the pointer. Immediate follows it zero times (the operand is the value); direct follows it once; indirect follows it twice; indexed does a small sum first, then follows it once.

What each mode does

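For an instruction whose operand field holds the number n:

Immediate: the operand is the value itself, so the result is n.
Direct: the operand is an address, so the result is the contents of address n.
Indirect: the operand is the address of a pointer, so the result is the contents of the address stored at address n.
Indexed: the operand is a base address, so the result is the contents of address n + X, where X is the value in the index register.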

In assembly, the mode is often shown by punctuation around the operand. Using a made-up but typical syntax (a # for immediate, brackets for indirect, a ,X for indexed), loading the accumulator four different ways looks like this:

    LDA #20    ; IMMEDIATE : accumulator = 20 (the operand itself)
    LDA 20     ; DIRECT    : accumulator = [20] = 25 (contents of address 20)
    LDA (20)   ; INDIRECT  : accumulator = [[20]] = 99 (follow the pointer twice)
    LDA 20,X   ; INDEXED   : accumulator = [20 + X] = 51 (X = 3, so address 23)

Four lines, one operand, four different results. Only the addressing mode — the punctuation — has changed.
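The four interpretations can also be sketched as a tiny simulator. This is a minimal Python sketch, not any real CPU's decoder: the `resolve` function and the mode names are made up for illustration, and the memory holds only the three cells the worked example pins down (address 20 holds 25, address 23 holds 51, address 25 holds 99).

```python
# Hypothetical mini-model of operand resolution; names are illustrative only.
MEMORY = {20: 25, 23: 51, 25: 99}   # only the cells the worked example specifies
X = 3                               # index register


def resolve(mode, operand, memory=MEMORY, x=X):
    """Return the value the instruction ends up working with."""
    if mode == "immediate":
        return operand                   # zero lookups: the operand is the value
    if mode == "direct":
        return memory[operand]           # one lookup
    if mode == "indirect":
        return memory[memory[operand]]   # two lookups: pointer, then value
    if mode == "indexed":
        return memory[operand + x]       # add the index first, then one lookup
    raise ValueError(f"unknown addressing mode: {mode}")


results = {m: resolve(m, 20) for m in ("immediate", "direct", "indirect", "indexed")}
print(results)
# {'immediate': 20, 'direct': 25, 'indirect': 99, 'indexed': 51}
```

The operand passed in is 20 every time; only the `mode` argument changes, which mirrors the way the punctuation changes in the assembly listing.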

Why indexed mode makes loops fast

Suppose you store an array of exam marks starting at address 20: mark 0 at 20, mark 1 at 21, mark 2 at 22, and so on. To add them all up you want a loop that reads the next element each time round. Without indexed mode you'd have to rewrite the address inside the instruction on every pass — clumsy and slow.

Indexed mode solves it elegantly. Keep the base (20) fixed in the instruction, and put the element number in the index register X. Each time round the loop you just add 1 to X, and the effective address 20 + X automatically walks along the array:

    LDX #0         ; index register X = 0 (start at element 0)
    LDA #0         ; running total = 0
loop:
    ADD 20,X       ; total = total + [20 + X]   <-- indexed access
    INX            ; X = X + 1 (move to the next element)
    CPX #5         ; compare X with 5: done all 5 marks?
    BNE loop       ; if not, go round again

One short loop reads every element of the array, and the only thing that changes is the index register. That single idea, a fixed base plus a moving index, is why indexed addressing (or something equivalent to it) is so widely provided by real processors, and it's the machine-level foundation of the array indexing (marks[i]) you write in high-level languages.
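The same loop can be traced in Python as a rough model. The five mark values are invented for illustration (the text gives none), and the variables `acc` and `x` simply stand in for the accumulator and the index register.

```python
# Rough model of the indexed loop; the five marks are made-up sample data.
BASE = 20                      # base address fixed in the instruction
COUNT = 5                      # number of marks in the array
memory = [0] * 32              # small pretend main memory
memory[BASE:BASE + COUNT] = [7, 9, 4, 10, 6]   # hypothetical marks at 20..24

x = 0                          # LDX #0
acc = 0                        # LDA #0
visited = []                   # addresses touched, recorded for inspection
while True:
    address = BASE + x         # effective address = base + index
    visited.append(address)
    acc += memory[address]     # ADD 20,X
    x += 1                     # INX
    if x == COUNT:             # compare X with 5; leave the loop when equal
        break

print(acc, visited)
# 36 [20, 21, 22, 23, 24]
```

The base never changes inside the loop; only `x` moves, and the effective address follows it along the array.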

The classic exam mistake is confusing immediate with direct. They look almost identical on the page: LDA #20 loads the number 20 itself, while LDA 20 loads the contents of address 20 (here, 25).

The little # is doing all the work. Miss it and you've read a memory location instead of using a literal number — a bug that is maddening to find.

And don't stop one level too early on indirect. It adds one more lookup than direct: direct gives you the contents of address 20 (which is 25); indirect treats that 25 as another address and goes there too, giving 99. If your answer for indirect is the same as your answer for direct, you forgot the second hop.

One extra lookup sounds like a waste — why not just put the real address in the operand? Because the operand field is fixed when the program is written, but an indirect pointer can be changed while the program runs. That lets a single instruction access a location decided at run-time — the basis of pointers and references in languages like C. It also lets you reach memory addresses too big to fit in a small operand field: the tiny operand names a full-width location that holds the real, large address. Indirection is one of computing's most powerful ideas — David Wheeler's famous quip is that "all problems in computer science can be solved by another level of indirection."
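A small Python sketch can illustrate the run-time flexibility. It reuses the earlier values (address 20 holds 25, address 25 holds 99); the extra cell at address 24 holding 42 and the helper `load_indirect` are assumptions added purely for illustration.

```python
# Sketch: one indirect "instruction", two different results.
memory = {20: 25, 25: 99, 24: 42}   # 24 -> 42 is a made-up extra cell


def load_indirect(operand):
    """Model of LDA (operand): follow the pointer stored at `operand`."""
    pointer = memory[operand]        # first lookup: fetch the pointer
    return memory[pointer]           # second lookup: fetch the value


first = load_indirect(20)            # pointer at 20 is 25, so this reads address 25
memory[20] = 24                      # the program redirects the pointer at run time
second = load_indirect(20)           # same operand, now reads address 24

print(first, second)
# 99 42
```

The call `load_indirect(20)` is identical both times; what changed is the pointer stored in memory, which is exactly what a fixed operand field cannot do on its own.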