Flat-file vs Relational Databases

A school needs to keep track of its pupils: their name, which class they are in, and who teaches that class. Where do you put all of that? The obvious first answer is one big table — like a single giant spreadsheet with a row for every pupil and a column for every fact. That is exactly what a flat-file database is: all the data lives in one table.

It sounds tidy, and for a tiny amount of data it is fine. But as soon as the same facts start turning up again and again, a flat file quietly turns into a mess. This page is about why that happens, and how a relational database — data split across several linked tables — fixes it. Understanding this one trade-off is the heart of GCSE databases.

A flat file: everything in one table

Here is our school as a single flat-file table. Read it row by row — each pupil, their class, and their teacher:

Pupils (flat file)
NameClassTeacher
Priya10AMr Okafor
Sam10AMr Okafor
Ada10AMr Okafor
Leo10BMs Bell
Maya10CDr Rai
Ben10AMr Okafor

Look at the highlighted Teacher column. “Mr Okafor” is written out four separate times, once for every pupil in 10A. Nothing there is wrong, exactly — but the same fact (“10A is taught by Mr Okafor”) is stored over and over. This repetition of identical data is called data redundancy.

Why redundancy is a problem

Repeating the same fact on many rows causes two real headaches:

Suppose Mr Okafor leaves and Ms Bell takes over class 10A. In the flat file you must find every single 10A row and change the teacher — four edits for four pupils. Change three of them and get distracted, and the table now claims that some 10A pupils are taught by Ms Bell and others by Mr Okafor. Same class, two answers. That contradiction is called an update anomaly, and it is the classic reason flat files go wrong: a fact stored many times must be updated in many places, and humans miss some.

The cure is simple to state: store each fact exactly once. Then there is only ever one place to change it, and the data can never disagree with itself. That single idea is what relational databases are built to deliver.

You can watch an update anomaly happen. This flat-file list tries to change 10A's teacher but only fixes the first matching row — press Run:

// A flat file: the teacher's name is COPIED onto every pupil's row. type Row = { pupil: string; classId: string; teacher: string }; const flatFile: Row[] = [ { pupil: "Priya", classId: "10A", teacher: "Mr Okafor" }, { pupil: "Sam", classId: "10A", teacher: "Mr Okafor" }, { pupil: "Ada", classId: "10A", teacher: "Mr Okafor" }, ]; // 10A gets a new teacher... but we only remember to fix the first row. flatFile[0].teacher = "Ms Bell"; for (const row of flatFile) { console.log(row.pupil + " (10A) is taught by " + row.teacher); } // The data now disagrees with itself: that is an update anomaly.

The relational fix: split into linked tables

A relational database refuses to repeat facts. Instead of one table, it uses several tables, each about one kind of thing, and it links them together. Our school splits cleanly into two:

Classes
ClassIDTeacher
10AMr Okafor
10BMs Bell
10CDr Rai
Students
NameClassID
Priya10A
Sam10A
Ada10A
Leo10B
Maya10C
Ben10A

Now “Mr Okafor” is written once, in the Classes table. The Students table doesn't repeat the teacher's name — it just stores the short ClassID (like 10A) that points to the matching row in Classes. That shared column is a key: it is how the two tables are joined back together when you need the full picture.

Step through the diagram to see the link:

Keys: how tables connect

Two special columns make the linking work. You need to recognise both for GCSE:

Follow a student's ClassID across to the Classes table and you instantly find their teacher — no duplication needed. Change the teacher for 10A? You edit one cell, and every 10A pupil is automatically up to date, because they were never storing the teacher's name in the first place. The update anomaly simply cannot happen.

Yes! If your data is small and simple — a personal list of birthdays, a shopping list, the high scores in one game — a flat file (a single spreadsheet or CSV) is quick to set up and perfectly fine. The problems only bite once the same facts repeat across many rows and the data needs updating. As a rule of thumb: one kind of thing, little repetition ⇒ flat file is fine; several related kinds of things, lots of shared facts ⇒ go relational.

At a glance

Flat-file vs relational
FeatureFlat fileRelational
TablesOne big tableSeveral linked tables
How facts are storedRepeated on every relevant rowEach fact stored once
Data redundancyHighLow
Risk of inconsistencyHigh (update anomalies)Low
Links between tablesNonePrimary & foreign keys
Best forSmall, simple dataLarge, related data