Entity-Relationship Modelling

Suppose your school asks you to build a database for the library: it must track every book, every member who can borrow one, and every loan that ties the two together. Where do you even start? The temptation is to open a spreadsheet and start typing columns — but that is how you end up, three weeks later, with the same author's name spelt four different ways and no clean way to ask "who has this book?".

Professionals do something else first: before building a single table, they model the data. They sketch out the things the system stores data about, the facts held about each thing, and how those things are connected. That sketch is an entity-relationship (ER) model, and it is the blueprint the tables are built from. Get the model right and the tables and keys almost design themselves; get it wrong and you pay for it with a painful redesign later. ER modelling is a staple of A-level Computer Science for exactly this reason.

Three ingredients: entities, attributes, relationships

An ER model is built from just three kinds of thing. Nail these three words and the rest is detail:

Here is a first, tiny ER model. Two entities, each with a few attributes, joined by one relationship. Step through it:

Read it as a sentence: "A Student enrols on a Course." The boxes are entities, the little list inside each is its attributes, and the line between them is the relationship. Notice we have not yet said how many students or courses are involved — and that is the most important question of all.

Cardinality: how many of each?

The single most important property of a relationship is its cardinality — how many instances of one entity can be linked to how many of the other. There are exactly three cases, and choosing the right one drives how the tables and foreign keys are laid out.

On an ER diagram the cardinality is drawn on the line. A-level uses two common notations: a plain 1 or N written near each end, or the crow's foot — a single bar for "one" and a three-pronged fork for "many". Use the switch to flip between the three cases and watch the ends of the line change:

The crow's foot always "opens" towards the many side — a handy memory aid: many feet, many things. Reading the middle picture: the bar sits at the Teacher end (one teacher) and the fork at the Student end (many students), so it reads "one Teacher to many Students".

Think about where the linking value has to live. In "one Teacher has many Students", each student belongs to exactly one teacher — so you can store that one TeacherID as a foreign key inside each student's record. One value, one column, done. You put the foreign key on the "many" side (Student), pointing at the primary key on the "one" side (Teacher). That is why 1:N is so comfortable: a single field captures the whole relationship. A 1:1 works the same way (the foreign key can go on either side). It is many-to-many that breaks this trick — which is the next card.

Resolving a many-to-many relationship

A many-to-many relationship cannot be stored directly. Try it: a Student takes many Courses, so you cannot put a single CourseID foreign key in the Student record — a student has several. And you cannot put a single StudentID in the Course record either — a course has many students. A single foreign key can only ever hold one value, so neither table can hold the relationship on its own.

The fix is a classic piece of design. We introduce a third table — a linking table (also called a junction, bridge or associative table). Each of its records represents one link — one student on one course — and it holds two foreign keys: one pointing at Student, one pointing at Course. A student taking three courses simply produces three rows in the linking table.

The magic is that the linking table turns one M:N relationship into two 1:N relationships — one Student has many Enrolments, and one Course has many Enrolments — and 1:N is exactly the shape a foreign key loves. The tables below show it working: to find everyone on course C2, scan the Enrolment table for C2 and follow each StudentID back to the Students table.

Students
StudentID PKName
S1Ava
S2Ben
S3Cara
Enrolment — the linking table: two foreign keys
StudentID FKCourseID FK
S1C1
S1C2
S2C2
S3C1
Courses
CourseID PKTitle
C1Maths
C2History

Look at Ava: she appears in two rows of Enrolment (courses C1 and C2), and course C2 appears in two rows (Ava and Ben). The many-to-many is captured perfectly — and every individual link is one tidy row.

The classic exam mistake is to try to store a many-to-many relationship directly — sticking a single CourseID foreign key in the Students table (or a single StudentID in the Courses table) and hoping it stretches. It cannot: a foreign key holds exactly one value, so it can only ever record one of the many links. You must resolve every M:N relationship by introducing a linking (junction) table that holds the two foreign keys — one row per connection.

And do the counting up front, while it is still a pencil sketch. Deciding cardinality on the ER diagram — before you create any tables — is cheap. Discovering after you have loaded ten thousand records that a relationship was really M:N all along, and now needs a whole extra table plus rewired keys and rewritten queries, is a painful and error-prone redesign. Getting the cardinality right on the model is the whole point of modelling first.

Modelling in practice: a quick recipe

Given a description of a system, you can build its ER model with four steps:

Try it on this: "A library has members and books. A member can borrow many books over time, and any book can be borrowed by many members over time." The nouns give entities Member and Book; "borrow" is the relationship; and because it is many-to-many, you introduce a Loan linking table holding a MemberID and a BookID (plus useful extras like the borrow date). That is a complete, well-formed model — and it is exactly the shape a real library system uses.