File Systems

Peel back every folder, every document, every photo on your computer and you find a shocking truth: the disk underneath knows nothing about any of it. A disk is a vast, dumb array of numbered blocks — millions of identical little boxes, each a few kilobytes wide, addressed only as block 0, block 1, block 7\,428\,193, and so on. It cannot tell you where your holiday photos are. It cannot tell you which blocks belong together, or which are free, or that a run of bytes is called notes.txt. It just stores whatever number you write into whatever box you name.

The file system is the layer of the operating system that turns that chaos into something humans can use. It is the grand bookkeeper that lets you say "open the file called /home/ada/notes.txt" and quietly translates that into "read blocks 88, 89 and 4,102, in that order." This one idea — how the OS organises raw numbered blocks into named files and directories — is what this page is about.

Just as memory management gives each program a tidy, private view of scattered physical RAM, the file system gives you a tidy, named view of scattered physical disk blocks. Same trick, different hardware: a clean abstraction laid over messy storage.

What is a file, really?

Strip away the icon and the fancy name, and a file is just two things:

a named sequence of bytes — the actual contents, from the first byte to the last. The file system does not care whether those bytes are text, a JPEG, or machine code; to it, a file is simply "these bytes, in this order."
a bundle of metadata — the facts about the file that are not part of its contents: its size, who owns it, the permissions (who may read or write it), and the timestamps (created, last modified, last accessed).

Notice what is not in that list: the file's name. That feels wrong at first — surely a file's name belongs to the file? It turns out the name lives somewhere else entirely, in the directory, and pulling those two apart is the secret that makes the whole system flexible. We will get there. First, how do we ever find a file by name at all?

Directories: a tree of names

A directory (what you probably call a folder) is itself just a special file — but instead of holding your document's bytes, it holds a little table. Each row of that table is a directory entry: a name paired with a number that points at where the real information about that file lives. A directory can list ordinary files and other directories, and because directories can nest inside directories, the whole collection forms a tree.

The tree has a single root, written /. Every file on the system hangs somewhere below it, and the chain of names from the root down to a file — separated by slashes — is its path, like /home/ada/notes.txt. Reading a path is a walk down the tree, one name at a time. This is called path resolution:

Start at the root directory /. Look inside it for an entry named home.
Found it — it points to a directory. Open that directory and look for an entry named ada.
Found it — another directory. Open it and look for an entry named notes.txt.
Found it — this one points to an ordinary file. Stop: that is the file we wanted.

Every slash in a path is one hop down the tree. The file system does this walk every single time you open a file — which is why deeply nested paths take (a tiny bit) more work to resolve than shallow ones, and why the OS keeps recently-visited directories cached in memory.

Look inside almost any directory and you find two entries you never created: . (dot) and .. (dot-dot). They are ordinary directory entries the file system adds for free. . points at this same directory, and .. points at its parent. That is literally how cd .. climbs the tree: it just follows the .. entry back up one level. The tree isn't only walkable downward — every node carries a signpost home.

The inode: a file's real identity

So a directory entry gives us a name and a number. That number is the inode number, and it is the key to everything. An inode (short for index node) is a small, fixed-size record — the file system keeps a big array of them — and the inode is the file, as far as the operating system is concerned. It holds all the metadata and, crucially, the map to the data.

An inode stores:

the file's size in bytes;
the owner and the permission bits (read / write / execute, for the owner, the group, and everyone);
the timestamps — created, modified, accessed;
a link count (how many directory entries point at this inode — more on that shortly);
and the part that makes a file usable: a set of pointers to the data blocks that hold the actual bytes.

Here is the elegant division of labour. The directory entry holds the name and the inode number. The inode holds the metadata and the block pointers — but not the name. The bytes live in the data blocks the inode points to. Three separate things, each with one job:

directory entry inode (the "file") data blocks ------------------- ---------------------- -------------- "notes.txt" -> 42 #42: size, owner, perms, block 88: "Dear ada," timestamps, link count block 89: " here are" pointers -> 88, 89, 4102 block 4102: " my notes"

Because the name is kept apart from the inode, two different names can point at the very same inode. That is a hard link: two directory entries, one file. The link count in the inode tallies how many names currently point at it, and the file's data is freed only when that count falls to zero.

Following the pointers: indexed allocation

Let's watch an inode actually locate a file's bytes. The inode holds a handful of direct pointers, each naming one data block on the disk. Those blocks can sit anywhere — the file system scatters them wherever it found free space, and the pointers stitch them back into order. Step through the figure to see the inode's pointers reach out to the scattered blocks:

With, say, 12 direct pointers and 4\text{ KB} blocks, an inode can address 12 \times 4\text{ KB} = 48\text{ KB} directly — fine for small files, but hopeless for a video. So the last pointer plays a trick: instead of pointing at data, a single-indirect pointer points at a whole block full of more pointers. If a block holds 1024 pointers, that one indirect block buys another 1024 \times 4\text{ KB} = 4\text{ MB} of file. Need still more? A double-indirect pointer points at a block of indirect blocks, and a triple-indirect pointer adds one more level — enough to address enormous files while keeping tiny files cheap.

An inode holds a fixed number of direct pointers, each to one data block.
A single-indirect pointer points to a block that holds nothing but more pointers, multiplying the reach.
Double- and triple-indirect pointers add further levels, so small files stay cheap while huge files remain reachable.

Three ways to lay a file on disk

Indexed allocation is one answer to a deeper question: given a file's blocks, how do you record which blocks they are, and in what order? There are three classic strategies, and the trade-offs between them shaped decades of file-system design.

1 · Contiguous allocation

Store the file's blocks back to back, in one unbroken run. The inode then needs only two numbers: the start block and the length. Reading the file is blindingly fast — the disk head sweeps straight through with no jumping around — and random access is trivial (byte N is at start-block plus N / \text{blocksize}).

Pro: fastest possible sequential and random reads; tiny metadata.
Con: you must know the file's final size up front, and free space fragments into holes. A file cannot grow if another file butts up against it, and eventually the disk is a patchwork of gaps too small to be useful — external fragmentation.

2 · Linked allocation

Let the blocks live anywhere, and have each block store a pointer to the next one, like a treasure hunt. The inode holds only the address of the first block; follow the chain to the end. Fragmentation vanishes — any free block will do — and files grow freely by tacking another block onto the chain.

Pro: no external fragmentation; files grow easily; blocks can be scattered anywhere.
Con: no random access — to reach block 500 you must read blocks 1 through 499 first, following each pointer. And one corrupted pointer orphans the entire rest of the file.

3 · Indexed allocation

Collect all of a file's block addresses into an index — the inode, with its direct and indirect pointers, exactly as above. This is the modern answer, and it deliberately blends the best of the other two.

Pro: supports fast random access (jump straight to the right pointer) and scattered blocks with no external fragmentation; files grow by adding pointers.
Con: the index itself takes space, and reaching a far-off block may cost an extra read or two to walk the indirect blocks.

Here is the whole comparison at a glance:

strategy random access? fragmentation grow a file? metadata ----------- -------------- ---------------- ------------ ------------------ contiguous instant external (holes) hard start + length linked no (walk chain) none easy first block only indexed yes (via index) none easy the inode's pointers

Bigger blocks mean fewer pointers per file — tempting. But blocks are the smallest unit the file system will allocate, so a one-byte file still swallows a whole block. With 64\text{ KB} blocks, a million tiny files waste tens of gigabytes to this internal fragmentation. Block size is a tug-of-war: large blocks stream big files fast but bloat small ones; small blocks pack tightly but need more pointers and more seeks. Most systems settle around 4\text{ KB} as a sensible compromise.

Two myths, dispelled

Myth: "the filename is stored inside the file's inode." It is not. The name lives in the directory entry; the inode holds only metadata and block pointers. This is not a pedantic detail — it is why hard links exist. Because the name and the file are separate, several directory entries (several names, possibly in different folders) can point at one and the same inode, and every one of them is an equally "real" name for the file. Rename one, and the others are untouched, because you only edited a directory entry, never the file itself.

Usually no. When you delete a file, the file system typically removes the directory entry and decrements the inode's link count. If the count hits zero, it marks the inode and its data blocks as free — but it does not scrub the old bytes; it just lets future files overwrite them eventually. That is exactly why deleted files can often be recovered, and why truly wiping a disk needs a special tool that overwrites the blocks on purpose. Freeing space and erasing data are two different acts.