Reasoning under uncertainty

A logical agent is a beautiful thing when the world is tidy: feed it facts it knows for certain, and it grinds out new certainties. But step outside the textbook and the real world stops cooperating. A medical agent hears a patient say "my tooth hurts". Does the rule Toothache ⇒ Cavity hold? Not always — the ache could be gum disease, an abscess, or a cracked filling. Patch the rule with every exception and it grows monstrous and still leaks. The sensors are noisy, the world is only half-seen, and the rules are never quite complete.

This is the wall that pure logic hits: it is brittle under uncertainty. A statement is true or false, full stop — there is no room for "probably", "usually", or "I'd bet three-to-one on it". Yet an agent driving a car, diagnosing an illness, or filtering your spam must act right now, on incomplete and unreliable evidence. It cannot wait for certainty that will never come.

The fix is to stop asking "is H true?" and start asking "how strongly should I believe H, given what I've seen?". We measure that degree of belief with a number between 0 and 1 — a probability — and this page is about the single rule that tells an agent how to update those beliefs as evidence arrives: Bayes' rule, the engine of rational belief.

The language of belief

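We describe an uncertain world with random variables: quantities whose value we're unsure of. \textit{Disease} might be true or false; \textit{Test} might read positive or negative. To each possibility we attach a probability. Three flavours of probability do all the work: the prior (or marginal) P(A), how strongly we believe A before seeing any evidence; the conditional P(A \mid B), how strongly we believe A once we know B; and the joint P(A, B), how strongly we believe that A and B both hold.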

These aren't independent facts: the joint ties the others together through the product rule, which just says the chance of two things is the chance of the first times the chance of the second given the first:

P(A, B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A).

Read either way round, it's the same joint P(A, B). Setting those two expressions equal is the entire trick — and out of it falls the most important formula an uncertain agent owns.
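The product rule can be checked numerically on a tiny joint table. The sketch below is only an illustration: the four joint probabilities are made-up values (any non-negative numbers summing to 1 would do), and the variable names are not from the text.

```typescript
// Joint distribution over two binary variables A and B.
// These four numbers are invented for illustration; they sum to 1.
interface Joint {
  ab: number;       // P(A, B)
  aNotB: number;    // P(A, not B)
  notAB: number;    // P(not A, B)
  notANotB: number; // P(not A, not B)
}

const joint: Joint = { ab: 0.10, aNotB: 0.20, notAB: 0.30, notANotB: 0.40 };

// Marginals: sum the joint over the other variable.
const pA = joint.ab + joint.aNotB; // P(A)
const pB = joint.ab + joint.notAB; // P(B)

// Conditionals, straight from their definition.
const pAGivenB = joint.ab / pB; // P(A | B)
const pBGivenA = joint.ab / pA; // P(B | A)

// Both factorisations should rebuild the same joint entry P(A, B).
const viaB = pAGivenB * pB;
const viaA = pBGivenA * pA;

console.log("P(A|B)P(B) =", viaB.toFixed(4));
console.log("P(B|A)P(A) =", viaA.toFixed(4));
console.log("P(A, B)    =", joint.ab.toFixed(4));
```

All three printed values should agree up to floating-point rounding, which is the equality Bayes' rule is built on.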

Bayes' rule: prior × likelihood → posterior

Because P(H \mid E)\,P(E) = P(E \mid H)\,P(H) (both equal the joint P(H, E)), divide through by P(E) and you get the rule the Reverend Thomas Bayes is remembered for:

P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}.

That's the whole engine. The prior P(H) is what you believed before; the likelihood P(E \mid H) is how well the hypothesis predicts the evidence; multiply them, normalise so all the posteriors sum to 1, and you have P(H \mid E) — what you should believe now. Every time fresh evidence lands, today's posterior becomes tomorrow's prior, and the agent's beliefs march forward.
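The multiply-then-normalise recipe above can be sketched for any set of mutually exclusive, exhaustive hypotheses. This is a minimal sketch, not a definitive implementation: the `Hypothesis` type, the `bayesUpdate` function, and the toothache priors and likelihoods are all hypothetical, chosen only to show the mechanics.

```typescript
// One candidate explanation, with its prior P(H) and likelihood P(E | H).
interface Hypothesis {
  label: string;
  prior: number;
  likelihood: number;
}

// Prior times likelihood for each hypothesis, then divide by the total so the
// posteriors sum to 1. Assumes the hypotheses are mutually exclusive and
// exhaustive, and that at least one of them gives the evidence non-zero probability.
function bayesUpdate(hypotheses: Hypothesis[]): { label: string; posterior: number }[] {
  const evidence = hypotheses.reduce((sum, h) => sum + h.prior * h.likelihood, 0); // P(E)
  return hypotheses.map(h => ({
    label: h.label,
    posterior: (h.prior * h.likelihood) / evidence, // P(H | E)
  }));
}

// Made-up numbers: three possible causes of a toothache, and how strongly
// each predicts the observed evidence "pain when biting".
const causes: Hypothesis[] = [
  { label: "cavity", prior: 0.5, likelihood: 0.6 },
  { label: "gum disease", prior: 0.3, likelihood: 0.2 },
  { label: "cracked filling", prior: 0.2, likelihood: 0.9 },
];

const posteriors = bayesUpdate(causes);
for (const p of posteriors) {
  console.log(p.label, (100 * p.posterior).toFixed(1) + "%");
}
```

To chain updates, feed each posterior back in as the next prior alongside the likelihoods for the next piece of evidence.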

Worked example: a rare disease and a "good" test

Here is the classic that trips up doctors and patients alike. A disease affects 1% of people — so the prior is P(D) = 0.01. There's a test that is 90% sensitive (it catches 90% of true cases: P(+ \mid D) = 0.90) and 90% specific (it correctly clears 90% of healthy people, so its false-positive rate is P(+ \mid \lnot D) = 0.10). You test positive. What's the chance you actually have the disease? Most people blurt "90%". Let's turn the crank.

First, the probability of a positive result at all — the normaliser:

P(+) = \underbrace{0.90 \times 0.01}_{\text{true positives}} + \underbrace{0.10 \times 0.99}_{\text{false positives}} = 0.009 + 0.099 = 0.108.

Now Bayes' rule:

P(D \mid +) = \frac{P(+ \mid D)\,P(D)}{P(+)} = \frac{0.90 \times 0.01}{0.108} = \frac{0.009}{0.108} \approx 0.083.

Just 8.3% — not 90%. Even after a positive result on a good test, you very probably don't have the disease. Why? Because the disease is rare, the healthy 99% vastly outnumber the sick 1%, and 10% of that huge healthy crowd (the false positives, 0.099) swamps the tiny sliver of genuine positives (0.009). The base rate dominates. This is not a quirk of these numbers; it is the reason a single positive screening test for a rare condition is almost never cause for panic on its own.
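One way to make the 8.3% feel less surprising is to count people instead of multiplying probabilities. The sketch below applies the same 1% base rate and 90/90 test to an imagined crowd; the population size of 10,000 is an arbitrary assumption, picked only because it keeps the head counts whole.

```typescript
// Imagined population; the size is arbitrary and only keeps the counts whole.
const population = 10000;

const sick = Math.round(population * 0.01); // 1% base rate: 100 people
const healthy = population - sick;          // 9,900 people

const truePositives = Math.round(sick * 0.90);     // 90% sensitivity: 90 people
const falsePositives = Math.round(healthy * 0.10); // 10% false-positive rate: 990 people

// Of everyone who tests positive, what share is actually sick?
const allPositives = truePositives + falsePositives; // 1,080 people
const shareSick = truePositives / allPositives;

console.log(`${truePositives} true positives vs ${falsePositives} false positives`);
console.log("P(disease | positive) =", (100 * shareSick).toFixed(1) + "%");
```

Ninety genuine cases among 1,080 positives is the same 0.009 / 0.108 ratio as in the formula, just scaled up to head counts.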

Run the same calculation yourself — change the numbers and watch the posterior swing:

// Bayes-update calculator for a binary medical test.
function diagnose(prior: number, sensitivity: number, specificity: number) {
  const falsePos = 1 - specificity; // P(+ | not D)
  // P(+) = P(+|D)P(D) + P(+|not D)P(not D)
  const pPos = sensitivity * prior + falsePos * (1 - prior);
  const pNeg = 1 - pPos;
  const postGivenPos = (sensitivity * prior) / pPos; // P(D | +)
  const postGivenNeg = ((1 - sensitivity) * prior) / pNeg; // P(D | -)
  return { pPos, postGivenPos, postGivenNeg };
}

const pct = (x: number) => (100 * x).toFixed(1) + "%";

const r = diagnose(0.01, 0.90, 0.90); // prior 1%, sensitivity 90%, specificity 90%
console.log("P(positive) =", pct(r.pPos));
console.log("P(disease | positive) =", pct(r.postGivenPos));
console.log("P(disease | negative) =", pct(r.postGivenNeg));

Notice the last line: a negative result drives your belief down to a fraction of a percent. The test is far better at reassuring the healthy than at condemning the sick — precisely because the prior was so low to begin with.

The base rate rules everything

Fix the test at 90% sensitivity and 90% specificity, and slide the base rate (the prior) from rare to common. The posterior P(D \mid +) starts near zero and only climbs toward certainty once the disease is genuinely widespread. A more sensitive or more specific test lifts the whole curve.
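The shape of that curve can be traced by sweeping the prior through a handful of values. This is a rough sketch: the function name `posteriorGivenPositive` and the particular list of priors are assumptions made for illustration.

```typescript
// P(D | +) for a binary test, from the prior and the test's two accuracy figures.
function posteriorGivenPositive(prior: number, sensitivity: number, specificity: number): number {
  const truePos = sensitivity * prior;              // P(+ | D) P(D)
  const falsePos = (1 - specificity) * (1 - prior); // P(+ | not D) P(not D)
  return truePos / (truePos + falsePos);
}

// Base rates from very rare to very common (an arbitrary illustrative grid).
const baseRates = [0.001, 0.01, 0.1, 0.3, 0.5, 0.9];

// The test stays fixed at 90% sensitivity and 90% specificity throughout.
const curve = baseRates.map(prior => ({
  prior,
  posterior: posteriorGivenPositive(prior, 0.90, 0.90),
}));

for (const point of curve) {
  console.log(
    "prior", (100 * point.prior).toFixed(1) + "%",
    "-> posterior", (100 * point.posterior).toFixed(1) + "%",
  );
}
```

With this test the posterior is under 1% at a 0.1% base rate and only reaches about 50% once the base rate is 10%.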

Two lessons jump out of the curve. First, when the prior is tiny the posterior stays stubbornly low for any ordinarily good test; only an extraordinarily specific test can rescue a positive result on a rare condition. Second, this is exactly why doctors confirm: a positive screen is a reason to run a second, independent test, not a diagnosis.
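To see how specific a test has to be before a positive result on a rare condition becomes convincing, one can hold the prior at 1% and the sensitivity at 90% and sweep the specificity alone. A minimal sketch follows; the function name and the grid of specificities are illustrative assumptions.

```typescript
// P(D | +) as a function of specificity, with prior and sensitivity held fixed.
function posteriorForSpecificity(specificity: number): number {
  const prior = 0.01;       // 1% base rate, as in the worked example
  const sensitivity = 0.90; // 90% sensitive, as in the worked example
  const truePos = sensitivity * prior;
  const falsePos = (1 - specificity) * (1 - prior);
  return truePos / (truePos + falsePos);
}

// Each step cuts the false-positive rate by a factor of ten.
const specificities = [0.90, 0.99, 0.999, 0.9999];

const rescued = specificities.map(s => ({
  specificity: s,
  posterior: posteriorForSpecificity(s),
}));

for (const row of rescued) {
  console.log(
    "specificity", (100 * row.specificity).toFixed(2) + "%",
    "-> P(D | +)", (100 * row.posterior).toFixed(1) + "%",
  );
}
```

Under these assumptions a 99%-specific test still leaves the posterior below 50%, and it takes roughly 99.9% specificity to push it to about 90%.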

Two closely related mistakes lurk here, and together they cause real harm.

Base-rate neglect. The seductive error is to read a "90% accurate" test as "90% chance you're sick". That ignores the prior entirely. As we saw, a 90/90 test on a 1%-prior disease gives a posterior of only ~8%. The base rate is not a footnote — for a rare event it is the dominant term. Always ask "how common is this to begin with?" before trusting a positive.

The prosecutor's fallacy. This is the deeper confusion underneath it: P(H \mid E) \neq P(E \mid H). "The test rarely fires on healthy people" (P(+ \mid \lnot D) small) is not the same claim as "a positive means you're almost certainly sick" (P(D \mid +) large). Swapping the two — telling a jury that a one-in-a-million lab match means one-in-a-million odds of innocence — has put innocent people in prison. Bayes' rule exists precisely to convert one into the other, and the base rate is the exchange rate.
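The gap between the two conditionals is easy to see by computing both from the same numbers. The sketch below reuses the 1% prior and the 90/90 test from the worked example; the variable names are illustrative.

```typescript
// Same figures as the worked example.
const prior = 0.01;       // P(D)
const sensitivity = 0.90; // P(+ | D)
const specificity = 0.90; // P(- | not D)

// What the test's data sheet reports: how often it fires on a healthy person.
const pPosGivenHealthy = 1 - specificity; // P(+ | not D)

// What a person holding a positive result cares about: how likely they are to be healthy.
const pPos = sensitivity * prior + pPosGivenHealthy * (1 - prior); // P(+)
const pHealthyGivenPos = (pPosGivenHealthy * (1 - prior)) / pPos;  // P(not D | +)

console.log("P(+ | healthy) =", (100 * pPosGivenHealthy).toFixed(1) + "%");
console.log("P(healthy | +) =", (100 * pHealthyGivenPos).toFixed(1) + "%");
```

The first figure is about 10% and the second about 92%: the same two events, conditioned in opposite directions, and only the base rate connects them.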

Probability doesn't throw logic away — it contains it. A probability of 1 is exactly "certainly true" and 0 is "certainly false"; the product and Bayes rules then behave like ordinary logical deduction. Feed an agent nothing but 1s and 0s and it reasons like a classical logician. Feed it the honest 0.7s and 0.03s of the real world and the same machinery keeps working, now handling doubt gracefully. Probability theory is logic generalised to a world where you're rarely completely sure.
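As a rough illustration of logic appearing as a limiting case, take a rule "H implies E", encoded as P(E \mid H) = 1, and observe that E is false. Bayes' rule then forces P(H \mid \lnot E) = 0, which is classical modus tollens. The function name and every number other than the 0 and 1 are made-up assumptions.

```typescript
// Posterior belief in H after observing that E is false.
// Assumes P(not E) > 0, i.e. the observation was possible in the first place.
function posteriorGivenNotE(priorH: number, pEGivenH: number, pEGivenNotH: number): number {
  const notEAndH = (1 - pEGivenH) * priorH;             // P(not E | H) P(H)
  const notEAndNotH = (1 - pEGivenNotH) * (1 - priorH); // P(not E | not H) P(not H)
  return notEAndH / (notEAndH + notEAndNotH);
}

// Hard rule: H always produces E. Seeing "not E" rules H out entirely.
const hardRule = posteriorGivenNotE(0.3, 1.0, 0.5);

// Soft rule: H produces E only 90% of the time. Seeing "not E" makes H
// unlikely but no longer impossible.
const softRule = posteriorGivenNotE(0.3, 0.9, 0.5);

console.log("hard rule: P(H | not E) =", hardRule.toFixed(3));
console.log("soft rule: P(H | not E) =", softRule.toFixed(3));
```

The same function handles both cases; only the likelihood moved from exactly 1 to 0.9.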

Suppose that after your 8.3% positive you take a second, independent test that also comes back positive. Bayes chains: your posterior 0.083 becomes the new prior, and you turn the crank again with the same likelihoods. The maths gives roughly \frac{0.9 \times 0.083}{0.9 \times 0.083 + 0.1 \times 0.917} \approx 0.45 — from 8% up to about 45% after just one more positive. Independent evidence multiplies, and belief can move fast once corroboration arrives. This assumes the two tests are conditionally independent given the disease — that they don't fail for the same hidden reason.
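The chaining step can be sketched as the same update applied repeatedly, with each posterior passed in as the next prior. This assumes, as the paragraph above does, that the tests are conditionally independent given the disease; the function name and the third test are illustrative additions.

```typescript
// One Bayes update on a positive result from a 90% sensitive, 90% specific test.
function updateOnPositive(prior: number): number {
  const sensitivity = 0.90;
  const falsePositiveRate = 0.10;
  const truePos = sensitivity * prior;
  const falsePos = falsePositiveRate * (1 - prior);
  return truePos / (truePos + falsePos);
}

// Start from the 1% base rate and feed each posterior back in as the next prior.
const afterOne = updateOnPositive(0.01);       // first positive
const afterTwo = updateOnPositive(afterOne);   // second, independent positive
const afterThree = updateOnPositive(afterTwo); // a hypothetical third positive

console.log("after 1 positive:", (100 * afterOne).toFixed(1) + "%");
console.log("after 2 positives:", (100 * afterTwo).toFixed(1) + "%");
console.log("after 3 positives:", (100 * afterThree).toFixed(1) + "%");
```

Each positive multiplies the odds of disease by the same factor, sensitivity divided by false-positive rate, which is 9 here. That is why belief climbs from about 8% to 45% and then to roughly 88%.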

Making it tractable, and where the numbers come from

A full joint distribution over many variables is astronomically large — with n true/false variables it has 2^n entries. What rescues us is independence: if two variables don't influence each other, P(A, B) = P(A)\,P(B), and the table factors into small pieces. Even better is conditional independence — two symptoms that are independent once you know the disease — which lets an agent store only a handful of local probabilities instead of the whole monster. That factoring is the idea behind Bayesian networks, the workhorse of probabilistic AI.
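A quick count shows how much conditional independence saves. The sketch assumes one true/false disease variable and k true/false symptoms that are conditionally independent given the disease; the function names are hypothetical.

```typescript
// Entries in the full joint table over one disease and k symptoms:
// every combination of k + 1 true/false variables.
function fullJointEntries(symptoms: number): number {
  return Math.pow(2, symptoms + 1);
}

// Numbers needed once the table is factored as P(D) times a product of P(S_i | D):
// one for P(D), plus two per symptom (P(S_i | D) and P(S_i | not D)).
function factoredEntries(symptoms: number): number {
  return 1 + 2 * symptoms;
}

for (const k of [5, 10, 20, 30]) {
  console.log(
    `${k} symptoms: full joint ${fullJointEntries(k)} entries,`,
    `factored ${factoredEntries(k)} numbers`,
  );
}
```

The full table doubles with every added symptom, while the factored form grows by just two numbers.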

One question remains: where do all these priors and likelihoods actually come from? Sometimes from an expert's judgement, but far more often we learn them from data — counting how often the disease and the positive tests co-occur across thousands of records, and letting those frequencies become our probabilities. That is the bridge from reasoning under uncertainty to machine learning: Bayes' rule tells the agent how to use probabilities, and learning tells it how to get them.
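Learning by counting can be sketched in a few lines. The records below are synthetic, generated only so the counts reproduce the 1% prior and the 90/90 test used earlier; the `PatientRecord` type and the helper names are assumptions, not part of any real dataset or library.

```typescript
// One row of (hypothetical) historical data.
interface PatientRecord {
  disease: boolean;
  positive: boolean;
}

// Build n identical records; used only to fabricate a small synthetic dataset.
function repeat(n: number, disease: boolean, positive: boolean): PatientRecord[] {
  return Array.from({ length: n }, () => ({ disease, positive }));
}

// Estimate the prior and the test's accuracy as plain relative frequencies.
// Assumes the data contains at least one sick and one healthy record.
function estimateFromCounts(records: PatientRecord[]) {
  const sick = records.filter(r => r.disease);
  const healthy = records.filter(r => !r.disease);
  return {
    prior: sick.length / records.length,                                   // P(D)
    sensitivity: sick.filter(r => r.positive).length / sick.length,        // P(+ | D)
    specificity: healthy.filter(r => !r.positive).length / healthy.length, // P(- | not D)
  };
}

// Synthetic data: 1,000 records, 10 of them sick.
const records: PatientRecord[] = [
  ...repeat(9, true, true),     // sick, tested positive
  ...repeat(1, true, false),    // sick, tested negative
  ...repeat(99, false, true),   // healthy, tested positive
  ...repeat(891, false, false), // healthy, tested negative
];

const learned = estimateFromCounts(records);
console.log(learned);
```

With so few sick patients the sensitivity estimate rests on only ten records, which is one reason real systems want thousands of records rather than a handful.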