Measurement and Uncertainty

Open any textbook and you will read that the acceleration due to gravity at Earth's surface is g = 9.81\ \text{m/s}^2. Clean, confident, memorable. But that tidy number hides a story. Take a metre-long pendulum into the lab, time it swinging back and forth, and work g out for yourself, and you will not get 9.81. You will get 9.7, then 9.9, then 9.83 — a fresh answer every time you repeat the timing. Your reaction on the stopwatch, a draught in the room, the exact height you released from: each nudges the result a little.

This is not sloppiness, and it does not go away with a better stopwatch. It is the central fact of experimental science, worth stating in bold: no measurement is ever exact. Every number you read off an instrument is really a range — a best guess with a cloud of doubt around it. The job of a physicist is not to pretend the cloud isn't there; it is to measure the cloud, quote it honestly, and carry it through the calculation. That honest cloud is called the uncertainty, and learning to handle it is what separates a number from a result.

Two very different ways to be wrong

Suppose you time your pendulum ten times. Two things can spoil the answer, and they are worlds apart.

Random error scatters your readings above and below the truth, unpredictably. Your thumb on the stopwatch is a hair early one swing, a hair late the next; a tiny air current speeds one swing and slows another. Repeat the measurement and the wobble jumps around at random. The beautiful thing about random error is that it averages out: take the mean of many readings and the high ones cancel the low ones, so the average creeps closer and closer to the true value. Random error controls your precision — how tightly your repeated readings agree with one another.

Systematic error is the sneaky one. It pushes every reading the same way, by the same amount or the same fraction. A stopwatch that runs 2\% slow, a ruler whose zero mark has worn off the end, a voltmeter that reads high: each biases the whole set of measurements in one direction. And here is the trap: repeating the measurement does nothing. Averaging a hundred readings from a slow stopwatch just gives you a very precise, very confident wrong answer. Systematic error controls your accuracy, that is, how close your result sits to the true value.

The dartboard picture

The cleanest way to see the difference is to throw darts. Let the bullseye be the true value you are trying to measure, and each dart a single measurement. Then precision is how tightly the darts cluster, and accuracy is how centred that cluster is on the bullseye; the two are completely independent. Four targets show the four possible combinations.

Read the picture from left to right. The top-right board is the dangerous one: the darts agree beautifully with each other, so a naive experimenter would trust the result completely — yet every single dart missed the bullseye the same way. Precise, confident, and wrong. The bottom-left board is messy but honest: no single dart is great, but their average lands on the bullseye. That is exactly why we repeat measurements and take a mean — it beats down random scatter, but it can never fix a systematic miss.
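The four boards can also be put into numbers. The sketch below is only an illustration: the helper names (`throw_darts`, `describe`), the spreads, and the offsets are all made up for the example, and the darts are assumed to scatter as a Gaussian. It reports how far the centre of each cluster sits from the bullseye (accuracy) and how widely the darts scatter about their own centre (precision).

```python
import math
import random

def throw_darts(n, spread, offset, rng):
    """Return n (x, y) darts scattered (Gaussian, width `spread`) around `offset`."""
    ox, oy = offset
    return [(rng.gauss(ox, spread), rng.gauss(oy, spread)) for _ in range(n)]

def describe(darts):
    """Return (miss, scatter) for a board whose bullseye is at (0, 0).

    miss    = distance of the cluster's centroid from the bullseye (accuracy)
    scatter = RMS distance of the darts from their own centroid (precision)
    """
    n = len(darts)
    cx = sum(x for x, _ in darts) / n
    cy = sum(y for _, y in darts) / n
    miss = math.hypot(cx, cy)
    scatter = math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in darts) / n)
    return miss, scatter

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical spreads and offsets, chosen only to make the four cases distinct.
boards = {
    "precise and accurate":     throw_darts(200, 0.5, (0.0, 0.0), rng),
    "precise but inaccurate":   throw_darts(200, 0.5, (4.0, 3.0), rng),
    "imprecise but accurate":   throw_darts(200, 3.0, (0.0, 0.0), rng),
    "imprecise and inaccurate": throw_darts(200, 3.0, (4.0, 3.0), rng),
}

for name, darts in boards.items():
    miss, scatter = describe(darts)
    print(f"{name:26s} centroid miss = {miss:4.2f}   scatter = {scatter:4.2f}")
```

In a run like this the "precise but inaccurate" board should show a small scatter with a large centroid miss, while the "imprecise but accurate" board shows the reverse: taking the centroid is the two-dimensional version of taking a mean.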

Quoting a result: value, uncertainty, and significant figures

Once you have measured the cloud, you quote a result in the form

\text{result} = (\text{best estimate}) \pm (\text{uncertainty}) \quad \text{[units]},

for example g = (9.78 \pm 0.05)\ \text{m/s}^2. Read aloud, that says: "my best estimate is 9.78, and the true value very probably lies somewhere between 9.73 and 9.83." The \pm 0.05 is not decoration — it is the honesty.

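Two conventions keep the quoting sensible. First, round the uncertainty to one significant figure. Second, round the best estimate to the same decimal place as the uncertainty.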

This is what significant figures are really about. The number of significant figures you write is a claim about how well you know a value. Writing 9.78421 for g claims you know it to a part in a million; you don't, so you must not write it. The digits you keep are the ones you can stand behind.

Worked example — quoting to the right figures. A raw calculation spits out a resistance of R = 47.3826\ \Omega with an uncertainty of 1.7\ \Omega. First round the uncertainty to one significant figure: \pm 2\ \Omega. That uncertainty lives in the "ones" place, so round the value to the ones place too:

R = (47 \pm 2)\ \Omega.

All those trailing digits — .3826 — were noise dressed up as knowledge. The honest result has just two significant figures.
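The rounding in this example can be sketched in a few lines of Python. The function names `round_result` and `format_result` are invented for this illustration, and the sketch assumes the one-significant-figure convention used in the worked example; some labs keep two figures when the leading digit is 1. Note also that Python's built-in `round` rounds exact halves to the nearest even digit, which can differ from the "round half up" habit taught in school.

```python
import math

def round_result(value, uncertainty):
    """Round the uncertainty to one significant figure, then the value to the same place.

    Returns (rounded_value, rounded_uncertainty, exponent), where exponent is the
    power of ten of the last digit kept (0 = ones place, -2 = hundredths, ...).
    """
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    exponent = math.floor(math.log10(uncertainty))  # place of the leading digit
    rounded_unc = round(uncertainty, -exponent)
    # Rounding can carry into the next place (e.g. 0.96 -> 1.0), so find the place again.
    exponent = math.floor(math.log10(rounded_unc))
    rounded_unc = round(rounded_unc, -exponent)
    return round(value, -exponent), rounded_unc, exponent

def format_result(value, uncertainty, unit=""):
    """Format as '(value ± uncertainty) unit' with matching decimal places."""
    v, u, exponent = round_result(value, uncertainty)
    decimals = max(0, -exponent)
    return f"({v:.{decimals}f} ± {u:.{decimals}f}) {unit}".strip()

print(format_result(47.3826, 1.7, "Ω"))        # the resistance example: (47 ± 2) Ω
print(format_result(9.78421, 0.0512, "m/s^2"))  # (9.78 ± 0.05) m/s^2
```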

Absolute, fractional, and percentage uncertainty

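The same doubt can be dressed three ways, and each is useful in its place. Suppose you measure a length L = 2.50\ \text{m} with an uncertainty of 0.05\ \text{m}. The absolute uncertainty is the doubt in the units of the measurement itself, \Delta L = 0.05\ \text{m}. The fractional uncertainty is that doubt as a fraction of the value, \Delta L / L = 0.05/2.50 = 0.02. The percentage uncertainty is the same fraction multiplied by 100, here 2\%.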

Fractional and percentage uncertainties are the ones that tell you whether a measurement is any good. A \pm 1\ \text{mm} error is superb on a table but hopeless on a transistor. Only "compared with what?" — the fractional uncertainty — settles it.

Worked example — converting back and forth. A voltmeter reads V = 12.0\ \text{V} with a 3\% uncertainty. What is the absolute uncertainty? Turn the percentage into a fraction and multiply by the value:

\Delta V = \frac{3}{100}\times 12.0\ \text{V} = 0.36\ \text{V} \approx 0.4\ \text{V},

so V = (12.0 \pm 0.4)\ \text{V}. Going the other way, a current I = (2.00 \pm 0.08)\ \text{A} has fractional uncertainty 0.08/2.00 = 0.04, i.e. 4\%.
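These conversions are simple enough to wrap in three small helpers. The function names below are invented for the illustration; the numbers are the length, voltmeter, and current examples from this section.

```python
def fractional(value, absolute):
    """Fractional uncertainty: absolute uncertainty divided by the size of the value."""
    return absolute / abs(value)

def percentage(value, absolute):
    """Percentage uncertainty: the fractional uncertainty times 100."""
    return 100.0 * fractional(value, absolute)

def absolute_from_percent(value, percent):
    """Absolute uncertainty recovered from a percentage uncertainty."""
    return abs(value) * percent / 100.0

# Length L = 2.50 m with an absolute uncertainty of 0.05 m
print(f"fractional: {fractional(2.50, 0.05):.2f}")
print(f"percentage: {percentage(2.50, 0.05):.0f}%")

# Voltmeter V = 12.0 V with a 3% uncertainty (before rounding to one figure)
print(f"absolute:   {absolute_from_percent(12.0, 3):.2f} V")

# Current I = 2.00 A with an absolute uncertainty of 0.08 A
print(f"percentage: {percentage(2.00, 0.08):.0f}%")
```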

A first taste of combining uncertainties

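Measurements rarely stand alone: you multiply a length by a width to get an area, or subtract two times to get an interval. The doubt has to be carried through. The full rules of uncertainty propagation get their own page; here are just the two you will use constantly, stated as rules of thumb. When you add or subtract quantities, combine their absolute uncertainties. When you multiply or divide quantities, combine their fractional (or percentage) uncertainties. For independent errors, combining means adding in quadrature, that is, taking the square root of the sum of the squares; a plain sum is a quick, slightly pessimistic estimate.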

Worked example — area of a rectangle. You measure a plate as L = 20.0\ \text{cm} with 2\% uncertainty and W = 10.0\ \text{cm} with 3\% uncertainty. The area A = L\times W = 200\ \text{cm}^2. Since area is a product, we combine the percentages. A quick estimate just adds them:

\frac{\Delta A}{A} \approx 2\% + 3\% = 5\% \quad\Longrightarrow\quad \Delta A \approx 0.05 \times 200 = 10\ \text{cm}^2,

giving A = (200 \pm 10)\ \text{cm}^2. (The stricter quadrature rule gives \sqrt{2^2+3^2} \approx 3.6\%, i.e. \pm 7\ \text{cm}^2 — the simple sum is a safe over-estimate.) Notice the golden rule at work: you add percentages for a product, not the raw centimetres.
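The two estimates for the plate can be compared side by side. This is a minimal sketch with an invented helper, `product_uncertainty`; the quadrature branch assumes the errors in length and width are independent.

```python
import math

def product_uncertainty(percents, quadrature=True):
    """Combine the percentage uncertainties of the factors in a product or quotient.

    quadrature=True  -> square root of the sum of squares (independent errors)
    quadrature=False -> plain sum (quick, pessimistic estimate)
    """
    if quadrature:
        return math.sqrt(sum(p ** 2 for p in percents))
    return sum(percents)

length_cm, width_cm = 20.0, 10.0
area = length_cm * width_cm

for label, use_quadrature in (("simple sum", False), ("quadrature", True)):
    pct = product_uncertainty([2.0, 3.0], quadrature=use_quadrature)
    print(f"{label}: {pct:.1f}% -> A = ({area:.0f} ± {area * pct / 100:.0f}) cm^2")
```

The plain sum reproduces the 5% and \pm 10\ \text{cm}^2 of the quick estimate, and the quadrature branch reproduces the 3.6% and \pm 7\ \text{cm}^2 quoted in the parenthesis.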

Worked example — a difference in quadrature. A trolley passes two gates at times t_1 = (1.20 \pm 0.05)\ \text{s} and t_2 = (2.60 \pm 0.05)\ \text{s}. The interval is \Delta t = t_2 - t_1 = 1.40\ \text{s}, and because it is a difference we add the absolute uncertainties in quadrature:

\Delta(\Delta t) = \sqrt{0.05^2 + 0.05^2} = 0.05\sqrt{2} \approx 0.07\ \text{s},

so the interval is (1.40 \pm 0.07)\ \text{s}.
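One way to convince yourself that quadrature is the right rule for a difference is to simulate it. The sketch below assumes that each \pm 0.05\ \text{s} is the standard deviation of an independent Gaussian timing error; that assumption is not stated in the example and is made here only for illustration. It draws many pairs of gate times and measures the spread of their difference directly.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable
N = 20000                # number of simulated trolley runs

# Each gate time = nominal value + independent Gaussian error (assumed, sigma = 0.05 s)
intervals = [rng.gauss(2.60, 0.05) - rng.gauss(1.20, 0.05) for _ in range(N)]

mean_dt = statistics.mean(intervals)
spread_dt = statistics.stdev(intervals)

print(f"mean interval   = {mean_dt:.3f} s")
print(f"spread of diffs = {spread_dt:.3f} s")
print(f"quadrature rule = {(0.05**2 + 0.05**2) ** 0.5:.3f} s")
```

The simulated spread should land close to 0.07 s, not the 0.10 s a plain sum would give, because the two timing errors partly cancel as often as they reinforce.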

Watch the cloud change shape

Imagine taking the same reading over and over and tallying how often each value comes up. The results pile into a bell-shaped hump, the scatter of random error, centred on wherever your experiment actually points. The spread of the hump is the random error: a narrow hump is precise, a wide hump is imprecise. A bias, or systematic error, slides the whole hump off the true value. No amount of narrowing the hump, and no amount of averaging, will drag it back onto the true value once a bias is present; only fixing the instrument does that.

Precision is the width of the hump; accuracy is where its peak sits relative to the true value. A good measurement is both narrow and centred.
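The hump can be reproduced with a short simulation. Everything in this sketch is an assumption made for illustration: the helper names, the Gaussian scatter, the spread of 0.05, and the bias of 0.15, with the textbook g = 9.81 standing in as the true value. It prints a rough text histogram with and without a bias.

```python
import random
import statistics

def take_readings(true_value, spread, bias, n, rng):
    """Simulate n readings: Gaussian scatter of width `spread` around true_value + bias."""
    return [rng.gauss(true_value + bias, spread) for _ in range(n)]

def ascii_histogram(readings, lo, hi, bins=10, width=40):
    """Tally readings into equal bins between lo and hi and draw one bar per bin."""
    counts = [0] * bins
    for r in readings:
        if lo <= r < hi:
            index = min(int((r - lo) / (hi - lo) * bins), bins - 1)
            counts[index] += 1
    tallest = max(counts) or 1  # avoid dividing by zero if every bin is empty
    rows = []
    for i, count in enumerate(counts):
        left_edge = lo + i * (hi - lo) / bins
        rows.append(f"{left_edge:6.2f} | " + "#" * round(width * count / tallest))
    return "\n".join(rows)

TRUE_G = 9.81  # m/s^2, the textbook value used here as the "true value"
rng = random.Random(3)
unbiased = take_readings(TRUE_G, spread=0.05, bias=0.0, n=2000, rng=rng)
biased = take_readings(TRUE_G, spread=0.05, bias=0.15, n=2000, rng=rng)

for label, readings in (("no bias", unbiased), ("bias = +0.15", biased)):
    print(f"{label}: peak near {statistics.mean(readings):.2f}, "
          f"width {statistics.stdev(readings):.2f}")
    print(ascii_histogram(readings, lo=9.60, hi=10.10))
```

Both humps should have the same width, but the second is centred near 9.96 instead of 9.81: same precision, worse accuracy.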

Watch out! A tempting belief is that more digits make a number more accurate. This is the most common mistake in every first-year lab, and it is completely backwards. Writing more digits does not make a number more accurate; it just makes a bigger claim about how well you know it, a claim you usually cannot back up. A cheap ruler that reports 3.7\ \text{cm} is being honest; the same ruler reporting 3.71428\ \text{cm} is lying, because it cannot possibly resolve hundred-thousandths of a centimetre.

Accuracy is set by how close you are to the truth (kill the systematic errors), and precision by how tightly your readings agree (kill the random errors). Neither has anything to do with how many digits your calculator happens to display. When you divide 10.0 by 3 and the screen shows 3.33333333, all but two or three of those threes are fantasy. Report the digits you can defend, and no more — that is what significant figures are for.

Does taking more readings get rid of the error? Only half of it does, and knowing which half is the whole game. Averaging N readings shrinks the random uncertainty: the standard error of the mean falls like 1/\sqrt{N}, so four times as many readings halves the random uncertainty in the mean. That is real, and it is why we repeat measurements.

But a systematic error is untouched by repetition. If your stopwatch runs 2\% slow, then every one of your thousand readings is 2\% low, and their average is 2\% low too — now quoted with a gloriously tiny random uncertainty that makes your wrong answer look authoritative. Precision without accuracy is a trap. The only cures for systematic error are detective work and calibration: check your instrument against a known standard, re-zero it, swap it for another, and hunt down the hidden bias. No amount of patience at the stopwatch will do it for you.
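Both halves of this argument show up in a quick simulation. The numbers below are hypothetical: a true period of 2.000 s, a random timing scatter of 0.05 s per reading, and a stopwatch that runs 2\% slow, so that every reading comes out 2\% low. The standard error is estimated as the sample standard deviation divided by \sqrt{N}.

```python
import math
import random
import statistics

TRUE_PERIOD = 2.000  # s, hypothetical true value
SCATTER = 0.05       # s, assumed random timing error on each reading
SLOW = 0.98          # a stopwatch running 2% slow reads 2% low

rng = random.Random(7)  # fixed seed so the run is repeatable
results = {}
for n in (4, 16, 64, 256, 1024):
    # Every reading carries fresh random scatter but the same systematic factor.
    readings = [SLOW * rng.gauss(TRUE_PERIOD, SCATTER) for _ in range(n)]
    mean = statistics.mean(readings)
    sem = statistics.stdev(readings) / math.sqrt(n)  # standard error of the mean
    results[n] = (mean, sem)
    print(f"N = {n:4d}   mean = {mean:.4f} s   standard error = {sem:.4f} s")

print(f"true period = {TRUE_PERIOD:.4f} s")
```

As N grows the standard error shrinks roughly as 1/\sqrt{N}, yet the mean settles near 1.96 s, not 2.00 s. The shrinking error bar ends up excluding the true value, which is exactly the trap described above.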