Flat-file vs Relational Databases

A school needs to keep track of its pupils: their name, which class they are in, and who teaches that class. Where do you put all of that? The obvious first answer is one big table — like a single giant spreadsheet with a row for every pupil and a column for every fact. That is exactly what a flat-file database is: all the data lives in one table.

It sounds tidy, and for a tiny amount of data it is fine. But as soon as the same facts start turning up again and again, a flat file quietly turns into a mess. This page is about why that happens, and how a relational database — data split across several linked tables — fixes it. Understanding this one trade-off is the heart of GCSE databases.

A flat file: everything in one table

Here is our school as a single flat-file table. Read it row by row — each pupil, their class, and their teacher:

Pupils (flat file)
Name	Class	Teacher
Priya	10A	Mr Okafor
Sam	10A	Mr Okafor
Ada	10A	Mr Okafor
Leo	10B	Ms Bell
Maya	10C	Dr Rai
Ben	10A	Mr Okafor

Look at the highlighted Teacher column. “Mr Okafor” is written out four separate times, once for every pupil in 10A. Nothing there is wrong, exactly — but the same fact (“10A is taught by Mr Okafor”) is stored over and over. This repetition of identical data is called data redundancy.

Why redundancy is a problem

Repeating the same fact on many rows causes two real headaches:

Wasted space. Every extra copy of “Mr Okafor” takes up storage. With six pupils it is trivial, but a real school has hundreds of pupils and a company might have millions of rows — all those duplicated names, addresses and phone numbers add up fast.
Inconsistency. This is the dangerous one. If a fact is stored in many places and you change it in only some of them, your data starts to disagree with itself. The database is now telling you two different things, and no one knows which is true.

Suppose Mr Okafor leaves and Ms Bell takes over class 10A. In the flat file you must find every single 10A row and change the teacher — four edits for four pupils. Change three of them and get distracted, and the table now claims that some 10A pupils are taught by Ms Bell and others by Mr Okafor. Same class, two answers. That contradiction is called an update anomaly, and it is the classic reason flat files go wrong: a fact stored many times must be updated in many places, and humans miss some.

The cure is simple to state: store each fact exactly once. Then there is only ever one place to change it, and the data can never disagree with itself. That single idea is what relational databases are built to deliver.

You can watch an update anomaly happen. This flat-file list tries to change 10A's teacher but only fixes the first matching row — press Run:

// A flat file: the teacher's name is COPIED onto every pupil's row. type Row = { pupil: string; classId: string; teacher: string }; const flatFile: Row[] = [ { pupil: "Priya", classId: "10A", teacher: "Mr Okafor" }, { pupil: "Sam", classId: "10A", teacher: "Mr Okafor" }, { pupil: "Ada", classId: "10A", teacher: "Mr Okafor" }, ]; // 10A gets a new teacher... but we only remember to fix the first row. flatFile[0].teacher = "Ms Bell"; for (const row of flatFile) { console.log(row.pupil + " (10A) is taught by " + row.teacher); } // The data now disagrees with itself: that is an update anomaly.

The relational fix: split into linked tables

A relational database refuses to repeat facts. Instead of one table, it uses several tables, each about one kind of thing, and it links them together. Our school splits cleanly into two:

Classes
ClassID	Teacher
10A	Mr Okafor
10B	Ms Bell
10C	Dr Rai

Students
Name	ClassID
Priya	10A
Sam	10A
Ada	10A
Leo	10B
Maya	10C
Ben	10A

Now “Mr Okafor” is written once, in the Classes table. The Students table doesn't repeat the teacher's name — it just stores the short ClassID (like 10A) that points to the matching row in Classes. That shared column is a key: it is how the two tables are joined back together when you need the full picture.

Step through the diagram to see the link:

Keys: how tables connect

Two special columns make the linking work. You need to recognise both for GCSE:

A primary key is a column whose value is different on every row, so it identifies that row uniquely. In the Classes table, ClassID is the primary key — no two classes share it.
A foreign key is a column in one table that holds the primary key of another table, creating the link. In the Students table, ClassID is a foreign key pointing at the Classes table.

Follow a student's ClassID across to the Classes table and you instantly find their teacher — no duplication needed. Change the teacher for 10A? You edit one cell, and every 10A pupil is automatically up to date, because they were never storing the teacher's name in the first place. The update anomaly simply cannot happen.

Yes! If your data is small and simple — a personal list of birthdays, a shopping list, the high scores in one game — a flat file (a single spreadsheet or CSV) is quick to set up and perfectly fine. The problems only bite once the same facts repeat across many rows and the data needs updating. As a rule of thumb: one kind of thing, little repetition ⇒ flat file is fine; several related kinds of things, lots of shared facts ⇒ go relational.

At a glance

Flat-file vs relational
Feature	Flat file	Relational
Tables	One big table	Several linked tables
How facts are stored	Repeated on every relevant row	Each fact stored once
Data redundancy	High	Low
Risk of inconsistency	High (update anomalies)	Low
Links between tables	None	Primary & foreign keys
Best for	Small, simple data	Large, related data