Peel back every folder, every document, every photo on your computer and you find a shocking truth:
the disk underneath knows nothing about any of it. A disk is a vast, dumb array of
numbered blocks — millions of identical little boxes, each a few kilobytes wide,
addressed only as block notes.txt. It just stores whatever number you write into whatever box you name.
The file system is the layer of the operating system that turns that chaos into
something humans can use. It is the grand bookkeeper that lets you say "open the file called
/home/ada/notes.txt" and quietly translates that into "read blocks 88, 89 and
4,102, in that order." This one idea — how the OS organises raw numbered blocks into
named files and directories — is what this page is about.
Just as
Strip away the icon and the fancy name, and a file is just two things:
Notice what is not in that list: the file's name. That feels wrong at first — surely a file's name belongs to the file? It turns out the name lives somewhere else entirely, in the directory, and pulling those two apart is the secret that makes the whole system flexible. We will get there. First, how do we ever find a file by name at all?
A directory (what you probably call a folder) is itself just a special file — but instead of holding your document's bytes, it holds a little table. Each row of that table is a directory entry: a name paired with a number that points at where the real information about that file lives. A directory can list ordinary files and other directories, and because directories can nest inside directories, the whole collection forms a tree.
The tree has a single root, written /. Every file on the system hangs somewhere below
it, and the chain of names from the root down to a file — separated by slashes — is its
path, like /home/ada/notes.txt. Reading a path is a walk down the
tree, one name at a time. This is called path resolution:
/. Look inside it for an entry named home.ada.notes.txt.Every slash in a path is one hop down the tree. The file system does this walk every single time you open a file — which is why deeply nested paths take (a tiny bit) more work to resolve than shallow ones, and why the OS keeps recently-visited directories cached in memory.
Look inside almost any directory and you find two entries you never created: . (dot)
and .. (dot-dot). They are ordinary directory entries the file system adds for free.
. points at this same directory, and .. points at its
parent. That is literally how cd .. climbs the tree: it just follows
the .. entry back up one level. The tree isn't only walkable downward — every node
carries a signpost home.
So a directory entry gives us a name and a number. That number is the inode number, and it is the key to everything. An inode (short for index node) is a small, fixed-size record — the file system keeps a big array of them — and the inode is the file, as far as the operating system is concerned. It holds all the metadata and, crucially, the map to the data.
An inode stores:
Here is the elegant division of labour. The directory entry holds the name and the inode number. The inode holds the metadata and the block pointers — but not the name. The bytes live in the data blocks the inode points to. Three separate things, each with one job:
Because the name is kept apart from the inode, two different names can point at the very same inode. That is a hard link: two directory entries, one file. The link count in the inode tallies how many names currently point at it, and the file's data is freed only when that count falls to zero.
Let's watch an inode actually locate a file's bytes. The inode holds a handful of direct pointers, each naming one data block on the disk. Those blocks can sit anywhere — the file system scatters them wherever it found free space, and the pointers stitch them back into order. Step through the figure to see the inode's pointers reach out to the scattered blocks:
With, say,
Indexed allocation is one answer to a deeper question: given a file's blocks, how do you record which blocks they are, and in what order? There are three classic strategies, and the trade-offs between them shaped decades of file-system design.
Store the file's blocks back to back, in one unbroken run. The inode then needs only
two numbers: the start block and the length. Reading the file is blindingly fast —
the disk head sweeps straight through with no jumping around — and random access is trivial (byte
Let the blocks live anywhere, and have each block store a pointer to the next one, like a treasure hunt. The inode holds only the address of the first block; follow the chain to the end. Fragmentation vanishes — any free block will do — and files grow freely by tacking another block onto the chain.
Collect all of a file's block addresses into an index — the inode, with its direct and indirect pointers, exactly as above. This is the modern answer, and it deliberately blends the best of the other two.
Here is the whole comparison at a glance:
Bigger blocks mean fewer pointers per file — tempting. But blocks are the smallest unit the file
system will allocate, so a one-byte file still swallows a whole block. With
Myth: "the filename is stored inside the file's inode." It is not. The name lives in the directory entry; the inode holds only metadata and block pointers. This is not a pedantic detail — it is why hard links exist. Because the name and the file are separate, several directory entries (several names, possibly in different folders) can point at one and the same inode, and every one of them is an equally "real" name for the file. Rename one, and the others are untouched, because you only edited a directory entry, never the file itself.
Usually no. When you delete a file, the file system typically removes the directory entry and decrements the inode's link count. If the count hits zero, it marks the inode and its data blocks as free — but it does not scrub the old bytes; it just lets future files overwrite them eventually. That is exactly why deleted files can often be recovered, and why truly wiping a disk needs a special tool that overwrites the blocks on purpose. Freeing space and erasing data are two different acts.