Suppose your school asks you to build a database for the library: it must track every book, every member who can borrow one, and every loan that ties the two together. Where do you even start? The temptation is to open a spreadsheet and start typing columns — but that is how you end up, three weeks later, with the same author's name spelt four different ways and no clean way to ask "who has this book?".
Professionals do something else first: before building a single table, they
model the data. They sketch out the things the system stores
data about, the facts held about each thing, and how those things are
connected. That sketch is an entity-relationship (ER) model, and
it is the blueprint the tables are built from. Get the model right and the
An ER model is built from just three kinds of thing. Nail these three words and the rest is detail:
Student, a Course, a Book, an Order. Each
entity becomes a table. (One actual student — Ava Khan — is an
instance of the Student entity, and becomes a single record in
that table.) Entities are almost always nouns.
Student has a StudentID, a Name, a
DateOfBirth. Each attribute becomes a field (a column) of that
entity's table.
Student enrols on a Course; a Member
borrows a Book. Relationships are usually described by a
verb, and they are what turn a pile of separate tables into a connected
relational database.
Here is a first, tiny ER model. Two entities, each with a few attributes, joined by one relationship. Step through it:
Read it as a sentence: "A Student enrols on a Course." The boxes are entities, the little list inside each is its attributes, and the line between them is the relationship. Notice we have not yet said how many students or courses are involved — and that is the most important question of all.
The single most important property of a relationship is its cardinality — how
many instances of one entity can be linked to how many of the other. There are exactly three
cases, and choosing the right one drives how the tables and
Country has one CapitalCity, and each capital belongs to
one country. (1:1 relationships are fairly rare — often the two entities could just be merged
into one table.)
Teacher teaches
many Students, but each Student has one form
Teacher. This is the most common kind of relationship.
Student takes many Courses, and each Course is
taken by many Students.
On an ER diagram the cardinality is drawn on the line. A-level uses two common notations: a plain 1 or N written near each end, or the crow's foot — a single bar for "one" and a three-pronged fork for "many". Use the switch to flip between the three cases and watch the ends of the line change:
The crow's foot always "opens" towards the many side — a handy memory aid: many feet, many things. Reading the middle picture: the bar sits at the Teacher end (one teacher) and the fork at the Student end (many students), so it reads "one Teacher to many Students".
Think about where the linking value has to live. In "one Teacher has many
Students", each student belongs to exactly one teacher — so you can store
that one TeacherID as a foreign key inside each student's record.
One value, one column, done. You put the foreign key on the "many" side
(Student), pointing at the primary key on the "one" side (Teacher). That is why 1:N is so
comfortable: a single field captures the whole relationship. A 1:1 works the same way (the
foreign key can go on either side). It is many-to-many that breaks this trick — which
is the next card.
A many-to-many relationship cannot be stored directly. Try it: a Student takes many
Courses, so you cannot put a single CourseID foreign key in the Student
record — a student has several. And you cannot put a single StudentID in the Course
record either — a course has many students. A single foreign key can only ever hold
one value, so neither table can hold the relationship on its own.
The fix is a classic piece of design. We introduce a third table — a
linking table (also called a junction, bridge or
associative table). Each of its records represents one link — one student on one
course — and it holds two foreign keys: one pointing at Student, one
pointing at Course. A student taking three courses simply produces three rows in the
linking table.
The magic is that the linking table turns one M:N relationship into two 1:N
relationships — one Student has many Enrolments, and one
Course has many Enrolments — and 1:N is exactly the shape a foreign key
loves. The tables below show it working: to find everyone on course C2, scan the
Enrolment table for C2 and follow each StudentID back to
the Students table.
| StudentID PK | Name |
|---|---|
| S1 | Ava |
| S2 | Ben |
| S3 | Cara |
| StudentID FK | CourseID FK |
|---|---|
| S1 | C1 |
| S1 | C2 |
| S2 | C2 |
| S3 | C1 |
| CourseID PK | Title |
|---|---|
| C1 | Maths |
| C2 | History |
Look at Ava: she appears in two rows of Enrolment (courses C1 and C2), and
course C2 appears in two rows (Ava and Ben). The many-to-many is captured perfectly — and every
individual link is one tidy row.
The classic exam mistake is to try to store a many-to-many relationship
directly — sticking a single CourseID foreign key in the Students table
(or a single StudentID in the Courses table) and hoping it stretches. It cannot: a
foreign key holds exactly one value, so it can only ever record one of the many links.
You must resolve every M:N relationship by introducing a
linking (junction) table that holds the two foreign keys — one row per
connection.
And do the counting up front, while it is still a pencil sketch. Deciding cardinality on the ER diagram — before you create any tables — is cheap. Discovering after you have loaded ten thousand records that a relationship was really M:N all along, and now needs a whole extra table plus rewired keys and rewritten queries, is a painful and error-prone redesign. Getting the cardinality right on the model is the whole point of modelling first.
Given a description of a system, you can build its ER model with four steps:
Try it on this: "A library has members and books. A member can borrow many
books over time, and any book can be borrowed by many members over time." The nouns give entities
Member and Book; "borrow" is the relationship; and because it is
many-to-many, you introduce a Loan linking table holding a
MemberID and a BookID (plus useful extras like the borrow date). That is
a complete, well-formed model — and it is exactly the shape a real library system uses.