File Systems

Peel back every folder, every document, every photo on your computer and you find a shocking truth: the disk underneath knows nothing about any of it. A disk is a vast, dumb array of numbered blocks — millions of identical little boxes, each a few kilobytes wide, addressed only as block 0, block 1, block 7\,428\,193, and so on. It cannot tell you where your holiday photos are. It cannot tell you which blocks belong together, or which are free, or that a run of bytes is called notes.txt. It just stores whatever number you write into whatever box you name.

The file system is the layer of the operating system that turns that chaos into something humans can use. It is the grand bookkeeper that lets you say "open the file called /home/ada/notes.txt" and quietly translates that into "read blocks 88, 89 and 4,102, in that order." This one idea — how the OS organises raw numbered blocks into named files and directories — is what this page is about.

Just as memory management gives each program a tidy, private view of scattered physical RAM, the file system gives you a tidy, named view of scattered physical disk blocks. Same trick, different hardware: a clean abstraction laid over messy storage.

What is a file, really?

Strip away the icon and the fancy name, and a file is just two things:

Notice what is not in that list: the file's name. That feels wrong at first — surely a file's name belongs to the file? It turns out the name lives somewhere else entirely, in the directory, and pulling those two apart is the secret that makes the whole system flexible. We will get there. First, how do we ever find a file by name at all?

Directories: a tree of names

A directory (what you probably call a folder) is itself just a special file — but instead of holding your document's bytes, it holds a little table. Each row of that table is a directory entry: a name paired with a number that points at where the real information about that file lives. A directory can list ordinary files and other directories, and because directories can nest inside directories, the whole collection forms a tree.

The tree has a single root, written /. Every file on the system hangs somewhere below it, and the chain of names from the root down to a file — separated by slashes — is its path, like /home/ada/notes.txt. Reading a path is a walk down the tree, one name at a time. This is called path resolution:

  1. Start at the root directory /. Look inside it for an entry named home.
  2. Found it — it points to a directory. Open that directory and look for an entry named ada.
  3. Found it — another directory. Open it and look for an entry named notes.txt.
  4. Found it — this one points to an ordinary file. Stop: that is the file we wanted.

Every slash in a path is one hop down the tree. The file system does this walk every single time you open a file — which is why deeply nested paths take (a tiny bit) more work to resolve than shallow ones, and why the OS keeps recently-visited directories cached in memory.

Look inside almost any directory and you find two entries you never created: . (dot) and .. (dot-dot). They are ordinary directory entries the file system adds for free. . points at this same directory, and .. points at its parent. That is literally how cd .. climbs the tree: it just follows the .. entry back up one level. The tree isn't only walkable downward — every node carries a signpost home.

The inode: a file's real identity

So a directory entry gives us a name and a number. That number is the inode number, and it is the key to everything. An inode (short for index node) is a small, fixed-size record — the file system keeps a big array of them — and the inode is the file, as far as the operating system is concerned. It holds all the metadata and, crucially, the map to the data.

An inode stores:

Here is the elegant division of labour. The directory entry holds the name and the inode number. The inode holds the metadata and the block pointers — but not the name. The bytes live in the data blocks the inode points to. Three separate things, each with one job:

directory entry inode (the "file") data blocks ------------------- ---------------------- -------------- "notes.txt" -> 42 #42: size, owner, perms, block 88: "Dear ada," timestamps, link count block 89: " here are" pointers -> 88, 89, 4102 block 4102: " my notes"

Because the name is kept apart from the inode, two different names can point at the very same inode. That is a hard link: two directory entries, one file. The link count in the inode tallies how many names currently point at it, and the file's data is freed only when that count falls to zero.

Following the pointers: indexed allocation

Let's watch an inode actually locate a file's bytes. The inode holds a handful of direct pointers, each naming one data block on the disk. Those blocks can sit anywhere — the file system scatters them wherever it found free space, and the pointers stitch them back into order. Step through the figure to see the inode's pointers reach out to the scattered blocks:

With, say, 12 direct pointers and 4\text{ KB} blocks, an inode can address 12 \times 4\text{ KB} = 48\text{ KB} directly — fine for small files, but hopeless for a video. So the last pointer plays a trick: instead of pointing at data, a single-indirect pointer points at a whole block full of more pointers. If a block holds 1024 pointers, that one indirect block buys another 1024 \times 4\text{ KB} = 4\text{ MB} of file. Need still more? A double-indirect pointer points at a block of indirect blocks, and a triple-indirect pointer adds one more level — enough to address enormous files while keeping tiny files cheap.

Three ways to lay a file on disk

Indexed allocation is one answer to a deeper question: given a file's blocks, how do you record which blocks they are, and in what order? There are three classic strategies, and the trade-offs between them shaped decades of file-system design.

1 · Contiguous allocation

Store the file's blocks back to back, in one unbroken run. The inode then needs only two numbers: the start block and the length. Reading the file is blindingly fast — the disk head sweeps straight through with no jumping around — and random access is trivial (byte N is at start-block plus N / \text{blocksize}).

2 · Linked allocation

Let the blocks live anywhere, and have each block store a pointer to the next one, like a treasure hunt. The inode holds only the address of the first block; follow the chain to the end. Fragmentation vanishes — any free block will do — and files grow freely by tacking another block onto the chain.

3 · Indexed allocation

Collect all of a file's block addresses into an index — the inode, with its direct and indirect pointers, exactly as above. This is the modern answer, and it deliberately blends the best of the other two.

Here is the whole comparison at a glance:

strategy random access? fragmentation grow a file? metadata ----------- -------------- ---------------- ------------ ------------------ contiguous instant external (holes) hard start + length linked no (walk chain) none easy first block only indexed yes (via index) none easy the inode's pointers

Bigger blocks mean fewer pointers per file — tempting. But blocks are the smallest unit the file system will allocate, so a one-byte file still swallows a whole block. With 64\text{ KB} blocks, a million tiny files waste tens of gigabytes to this internal fragmentation. Block size is a tug-of-war: large blocks stream big files fast but bloat small ones; small blocks pack tightly but need more pointers and more seeks. Most systems settle around 4\text{ KB} as a sensible compromise.

Two myths, dispelled

Myth: "the filename is stored inside the file's inode." It is not. The name lives in the directory entry; the inode holds only metadata and block pointers. This is not a pedantic detail — it is why hard links exist. Because the name and the file are separate, several directory entries (several names, possibly in different folders) can point at one and the same inode, and every one of them is an equally "real" name for the file. Rename one, and the others are untouched, because you only edited a directory entry, never the file itself.

Usually no. When you delete a file, the file system typically removes the directory entry and decrements the inode's link count. If the count hits zero, it marks the inode and its data blocks as free — but it does not scrub the old bytes; it just lets future files overwrite them eventually. That is exactly why deleted files can often be recovered, and why truly wiping a disk needs a special tool that overwrites the blocks on purpose. Freeing space and erasing data are two different acts.