Imagine the school office keeps a giant table of every pupil — hundreds of rows, one per pupil, with columns for their forename, surname, year group, house and date of birth. Now the head of Year 10 asks: "Give me the surnames of everyone in Year 10, in alphabetical order." You could scroll through all those rows by hand… or you could just ask the database. The language you ask in is called SQL — Structured Query Language — and it's how almost every database in the world is questioned, from your school's system to the servers behind your favourite app.
An instruction written in SQL is called a query. The single most important
query — the one you'll use far more than any other — is the SELECT
statement. SELECT doesn't change anything in the database; it just retrieves
data and hands you back a fresh table of results.
Everything below queries this one table, called pupil. Each row is
one pupil (a record) and each column is one piece of information (a
field).
Keep this table in mind — as we build up each query, picture which rows and which columns it would hand back.
A SELECT query is read almost like an English sentence. It has up to four parts, and
they always come in this order:
Only the first two lines are compulsory: you must say what you want and
where to get it. The WHERE and ORDER BY parts are
optional extras that let you filter and sort. SQL keywords are usually written in
CAPITALS so they stand out, and each statement ends with a semicolon
(;).
* or a named list
Right after SELECT you say which columns you want. The star * is
shorthand for "every column":
This returns the whole table — all columns, all rows:
Usually you don't want everything. List just the columns you need, separated by commas, and the result only contains those:
Same rows, fewer columns. Naming your columns keeps results tidy and is considered better style
than reaching for * every time.
WHERE
WHERE is a test applied to every row. A row is kept only if the test is
true for it. For example, "only pupils in Year 10":
Notice 10 is a number, so it needs no quotes. But text values do
need quote marks around them:
You can compare with any of these operators:
= equal to <> not equal to> greater than < less than
>= at least <= at mostLIKE — matches a text pattern, where % stands for
"any characters"For example, "everyone born in or before 2009" and "every surname starting with P":
AND / OR
You can join several conditions. AND keeps a row only when
both tests are true; OR keeps it when either is
true.
"Year 10 pupils who are in Kestrel" — a row must satisfy both parts.
"Anyone in Kestrel or Falcon" — a row is kept if either part is true.
ORDER BY
A database has no promised order, so if you want the results sorted you must say so.
ORDER BY names the column to sort on; add ASC for ascending
(A→Z, small→large — the default) or DESC for descending (Z→A, large→small).
That answers the head of Year 10's original question: the Year 10 pupils, surnames sorted
alphabetically. Swap in ORDER BY dob DESC and you'd get the same pupils listed
youngest first instead.
The database does the work of a SELECT for you, but it helps to see what's happening
underneath. A table is really just a list of records, and a query is filter (the
WHERE), then sort (the ORDER BY), then pick
the columns (the SELECT). Here's that exact query written in TypeScript over
an array of pupil objects — press Run and compare it with the SQL above:
Change the year in the filter, or sort on a different field, and you're doing exactly
what a different SELECT query would do.
SQL is a declarative language: you describe the result you want, and the database figures out the fastest way to fetch it. You never write the loops that scan the rows — there's no "for each pupil, check the year…" in SQL. That's the opposite of the TypeScript above, where you spell out every step. Because the database is free to choose how, it can use clever tricks (like indexes) to answer a query over millions of rows in a blink — something your hand-written loop couldn't match.
SQL grew out of research at IBM in the 1970s, based on ideas by Edgar F. Codd, who invented the
relational model — the idea of storing data in linked tables. It was standardised in
1986 and has been the common language of databases ever since. Learn the SELECT
statement once and you can query almost any database you'll ever meet.
Two classic mix-ups to avoid:
WHERE filters ROWS, not columns. It decides which
records come back; it has nothing to do with which fields you see. The columns are chosen
by the list right after SELECT. So
SELECT surname FROM pupil WHERE year = 10 gives you the surname column, but only
the rows where the year is 10.
WHERE surname = 'Patel', not
WHERE surname = Patel. Without the quotes the database thinks Patel
is the name of a column and your query fails. Numbers, like
WHERE year = 10, are the exception — they take no quotes.