Reasoning under uncertainty
A logical agent is a beautiful thing when the world is tidy: feed it facts it knows for certain, and
it grinds out new certainties. But step outside the textbook and the real world stops cooperating. A
medical agent hears a patient say "my tooth hurts". Does the rule Toothache ⇒ Cavity hold?
Not always — the ache could be gum disease, an abscess, or a cracked filling. Patch the rule with
every exception and it grows monstrous and still leaks. The sensors are noisy, the world is only
half-seen, and the rules are never quite complete.
This is the wall that pure logic hits: it is brittle under uncertainty. A statement
is true or false, full stop — there is no room for "probably", "usually", or "I'd
bet three-to-one on it". Yet an agent driving a car, diagnosing an illness, or filtering your spam
must act right now, on incomplete and unreliable evidence. It cannot wait for certainty
that will never come.
The fix is to stop asking "is H true?" and start asking "how strongly
should I believe H, given what I've seen?". We measure that
degree of belief with a number between 0 and 1 — a probability —
and this page is about the single rule that tells an agent how to update those beliefs as evidence
arrives: Bayes' rule, the engine of rational belief.
The language of belief
We describe an uncertain world with random variables — quantities whose value we're
unsure of. Disease might be true or false;
Test might read positive or negative. To each
possibility we attach a probability. Three flavours of probability do all the work:
- a prior P(H) — your belief in a hypothesis
before looking at any evidence (the disease's base rate in the population);
- a conditional (or likelihood) P(E \mid H) —
how probable the evidence E is if
H holds (how often a sick patient tests positive);
- a joint P(H, E) — the probability that
H and E are both true.
These aren't independent facts: the joint ties the others together through the
product rule, which just says the chance of two things is the chance of the first
times the chance of the second given the first:
P(A, B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A).
Read either way round, it's the same joint P(A, B). Setting those two
expressions equal is the entire trick — and out of it falls the most important formula an uncertain
agent owns.
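As a quick sanity check, the product rule can be verified numerically on a small joint table. The sketch below is illustrative only: the four joint probabilities are invented numbers, and names such as `jointTable` and `marginalA` are hypothetical.

```typescript
// Joint distribution over two binary variables A and B.
// The four numbers are invented for illustration; they only need to sum to 1.
const jointTable = {
  aAndB: 0.20,       // P(A, B)
  aAndNotB: 0.10,    // P(A, not B)
  notAAndB: 0.20,    // P(not A, B)
  notAAndNotB: 0.50, // P(not A, not B)
};

// Marginals: sum the joint over the other variable.
const marginalA = jointTable.aAndB + jointTable.aAndNotB; // P(A) = 0.30
const marginalB = jointTable.aAndB + jointTable.notAAndB; // P(B) = 0.40

// Conditionals, defined as joint / marginal.
const aGivenB = jointTable.aAndB / marginalB; // P(A | B) = 0.50
const bGivenA = jointTable.aAndB / marginalA; // P(B | A), about 0.667

// Product rule: both factorisations recover the same joint P(A, B).
const jointViaB = aGivenB * marginalB;
const jointViaA = bGivenA * marginalA;

// Setting the two equal and dividing by P(B) recovers P(A | B) from P(B | A).
const aGivenBFromBayes = (bGivenA * marginalA) / marginalB;

console.log(jointViaB.toFixed(3), jointViaA.toFixed(3));      // both print 0.200
console.log(aGivenB.toFixed(3), aGivenBFromBayes.toFixed(3)); // both print 0.500
```

Note that P(A | B) and P(B | A) differ (0.50 versus roughly 0.667) even though both factorisations give the same joint.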
Bayes' rule: prior × likelihood → posterior
Because P(H \mid E)\,P(E) = P(E \mid H)\,P(H) (both equal the joint
P(H, E)), divide through by P(E) and you get
the rule the Reverend Thomas Bayes is remembered for:
- The updated belief in a hypothesis after seeing evidence is
P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}.
- In words: posterior =
likelihood \times prior, divided by
the evidence's overall probability.
- The denominator is a normalising constant that expands, when
H is true-or-false, into
P(E) = P(E \mid H)\,P(H) + P(E \mid \lnot H)\,P(\lnot H).
That's the whole engine. The prior P(H) is what you believed before; the
likelihood P(E \mid H) is how well the hypothesis predicts the
evidence; multiply them, normalise so all the posteriors sum to 1, and you have
P(H \mid E) — what you should believe now. Every time fresh
evidence lands, today's posterior becomes tomorrow's prior, and the agent's beliefs march forward.
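The "multiply, then normalise" recipe can be sketched for any set of mutually exclusive, exhaustive hypotheses, not just true-or-false. The function name `posterior` and every number in the toothache example are assumptions made up for illustration; none of them come from real data.

```typescript
// One Bayes update over mutually exclusive, exhaustive hypotheses.
// priors[i]      = P(H_i)
// likelihoods[i] = P(E | H_i)
// returns          P(H_i | E) for each i
function posterior(priors: number[], likelihoods: number[]): number[] {
  if (priors.length !== likelihoods.length) {
    throw new Error("priors and likelihoods must have the same length");
  }
  // Unnormalised scores: likelihood times prior.
  const scores = priors.map((p, i) => likelihoods[i] * p);
  // P(E), the normalising constant, is the sum of those scores.
  const evidence = scores.reduce((sum, s) => sum + s, 0);
  if (evidence === 0) {
    throw new Error("the evidence has probability 0 under every hypothesis");
  }
  return scores.map((s) => s / evidence);
}

// Hypothetical numbers for "my tooth hurts" (invented for this sketch).
const causes = ["cavity", "gum disease", "abscess", "none of these"];
const causePriors = [0.10, 0.05, 0.01, 0.84];     // P(cause)
const acheLikelihoods = [0.60, 0.30, 0.90, 0.02]; // P(toothache | cause)

const beliefs = posterior(causePriors, acheLikelihoods);
beliefs.forEach((b, i) => console.log(causes[i], (100 * b).toFixed(1) + "%"));
```

With two hypotheses (H and not H) this reduces to the binary formula above: the sum in the denominator is exactly P(E | H) P(H) + P(E | not H) P(not H).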
Worked example: a rare disease and a "good" test
Here is the classic that trips up doctors and patients alike. A disease affects
1% of people — so the prior is P(D) = 0.01. There's a
test that is 90% sensitive (it catches 90% of true cases:
P(+ \mid D) = 0.90) and 90% specific (it correctly
clears 90% of healthy people, so its false-positive rate is
P(+ \mid \lnot D) = 0.10). You test positive. What's the chance you
actually have the disease? Most people blurt "90%". Let's turn the crank.
First, the probability of a positive result at all — the normaliser:
P(+) = \underbrace{0.90 \times 0.01}_{\text{true positives}} + \underbrace{0.10 \times 0.99}_{\text{false positives}} = 0.009 + 0.099 = 0.108.
Now Bayes' rule:
P(D \mid +) = \frac{P(+ \mid D)\,P(D)}{P(+)} = \frac{0.90 \times 0.01}{0.108} = \frac{0.009}{0.108} \approx 0.083.
Just 8.3% — not 90%. Even after a positive result on a good test, you very probably
don't have the disease. Why? Because the disease is rare, the healthy 99% vastly outnumber
the sick 1%, and 10% of that huge healthy crowd (the false positives, 0.099) swamps the tiny sliver
of genuine positives (0.009). The base rate dominates. This is not a quirk of these
numbers; it is the reason a single positive screening test for a rare condition is almost never cause
for panic on its own.
Run the same calculation yourself — change the numbers and watch the posterior swing:
// Bayes-update calculator for a binary medical test.
function diagnose(prior: number, sensitivity: number, specificity: number) {
  const falsePos = 1 - specificity; // P(+ | not D)
  // P(+) = P(+|D)P(D) + P(+|not D)P(not D)
  const pPos = sensitivity * prior + falsePos * (1 - prior);
  const pNeg = 1 - pPos;
  const postGivenPos = (sensitivity * prior) / pPos; // P(D | +)
  const postGivenNeg = ((1 - sensitivity) * prior) / pNeg; // P(D | -)
  return { pPos, postGivenPos, postGivenNeg };
}

const pct = (x: number) => (100 * x).toFixed(1) + "%";
const r = diagnose(0.01, 0.90, 0.90); // prior 1%, sensitivity 90%, specificity 90%
console.log("P(positive) =", pct(r.pPos));
console.log("P(disease | positive) =", pct(r.postGivenPos));
console.log("P(disease | negative) =", pct(r.postGivenNeg));
Notice the last line: a negative result drives your belief down to a fraction of a percent.
The test is far better at reassuring the healthy than at condemning the sick — precisely because the
prior was so low to begin with.
The base rate rules everything
Fix the test at 90% sensitivity and 90% specificity, and slide the base rate (the
prior) from rare to common. The posterior P(D \mid +) starts near zero
and only climbs toward certainty once the disease is genuinely widespread. A more sensitive or
more specific test lifts the whole curve.
Two lessons jump out of the curve. First, when the prior is tiny the posterior stays stubbornly low
even for a test that sounds good: only an extraordinarily specific test can rescue a
positive result on a rare condition. Second, this is exactly why doctors confirm: a
positive screen is a reason to run a second, independent test, not a diagnosis.
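The curve can be tabulated with a few lines of code. This is a minimal sketch: the function name `posteriorGivenPositive`, the grid of base rates, and the second, 99%-specific test are all choices made here for illustration.

```typescript
// P(D | +) for a binary test, as a function of the base rate (the prior).
function posteriorGivenPositive(
  prior: number,
  sensitivity: number,
  specificity: number
): number {
  const truePos = sensitivity * prior;              // P(+ | D) P(D)
  const falsePos = (1 - specificity) * (1 - prior); // P(+ | not D) P(not D)
  return truePos / (truePos + falsePos);
}

// Slide the base rate from rare to common.
const baseRates = [0.001, 0.01, 0.05, 0.1, 0.25, 0.5, 0.9];
baseRates.forEach((rate) => {
  const standard = posteriorGivenPositive(rate, 0.90, 0.90);
  // Hypothetical sharper test: same sensitivity, 99% specificity.
  const sharper = posteriorGivenPositive(rate, 0.90, 0.99);
  console.log(
    "base rate " + (100 * rate).toFixed(1) + "%:",
    "90/90 test -> " + (100 * standard).toFixed(1) + "%,",
    "90/99 test -> " + (100 * sharper).toFixed(1) + "%"
  );
});
```

At a 1% base rate the 90/90 test gives about 8.3%, while raising specificity to 99% lifts the posterior to roughly 48%, which is why specificity matters so much for rare conditions.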
Two closely related mistakes lurk here, and together they cause real harm.
Base-rate neglect. The seductive error is to read a "90% accurate" test as "90%
chance you're sick". That ignores the prior entirely. As we saw, a 90/90 test on a 1%-prior disease
gives a posterior of only ~8%. The base rate is not a footnote — for a rare event it is the
dominant term. Always ask "how common is this to begin with?" before trusting a positive.
The prosecutor's fallacy. This is the deeper confusion underneath it:
P(H \mid E) \neq P(E \mid H). "The test rarely fires on healthy people"
(P(+ \mid \lnot D) small) is not the same claim as "a positive
means you're almost certainly sick" (P(D \mid +) large). Swapping the
two — telling a jury that a one-in-a-million lab match means one-in-a-million odds of innocence —
has put innocent people in prison. Bayes' rule exists precisely to convert one into the other, and
the base rate is the exchange rate.
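To see how far apart the two conditionals can be, here is a sketch of the courtroom case. Every number is an assumption chosen for illustration: a pool of ten million people who could have left the sample, exactly one true source, a true source who always matches, and no other evidence.

```typescript
// Hypothetical scenario: every number below is an assumption for illustration.
const populationSize = 1e7;    // people who could have left the sample
const pMatchIfInnocent = 1e-6; // P(match | innocent): the "one in a million"
const pMatchIfGuilty = 1;      // P(match | guilty): assume the true source always matches

// Prior with no other evidence: one true source somewhere in the population.
const priorGuilty = 1 / populationSize;

// Bayes' rule, with P(match) expanded over guilty and innocent.
const pMatch =
  pMatchIfGuilty * priorGuilty + pMatchIfInnocent * (1 - priorGuilty);
const pGuiltyIfMatch = (pMatchIfGuilty * priorGuilty) / pMatch;
const pInnocentIfMatch = 1 - pGuiltyIfMatch;

console.log("P(match | innocent) =", pMatchIfInnocent);
console.log("P(innocent | match) =", pInnocentIfMatch.toFixed(3));
```

Under these assumptions P(match | innocent) is one in a million, yet P(innocent | match) is about 0.909: roughly ten innocent people in the pool are expected to match by chance, alongside the one true source.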
Probability doesn't throw logic away — it contains it. A probability of
1 is exactly "certainly true" and 0 is
"certainly false"; the product and Bayes rules then behave like ordinary logical deduction. Feed an
agent nothing but 1s and 0s and it reasons like a classical logician. Feed it the honest 0.7s and
0.03s of the real world and the same machinery keeps working, now handling doubt
gracefully. Probability theory is logic generalised to a world where you're rarely completely sure.
Suppose that after your positive result (posterior 8.3%) you take a second, independent test that also comes
back positive. Bayes chains: your posterior 0.083 becomes the new prior, and you turn the crank
again with the same likelihoods. The maths gives roughly
\frac{0.9 \times 0.083}{0.9 \times 0.083 + 0.1 \times 0.917} \approx 0.45
That takes the belief from 8% up to about 45% after just one more positive. Each independent
positive multiplies the odds of disease by the likelihood ratio 0.9 / 0.1 = 9,
so belief can move fast once corroboration arrives. This assumes the two tests are
conditionally independent given the disease: they don't fail for the same
hidden reason.
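Chaining can be sketched as a loop in which each posterior is fed back in as the next prior. The names `updateOnResult` and `chainUpdates` are hypothetical, and the code assumes, as stated above, that the test results are conditionally independent given the disease.

```typescript
type TestResult = "+" | "-";

// One Bayes update of P(D) on a single test result.
function updateOnResult(
  prior: number,
  result: TestResult,
  sensitivity: number,
  specificity: number
): number {
  // Likelihood of this result if the patient is sick / healthy.
  const likeIfSick = result === "+" ? sensitivity : 1 - sensitivity;
  const likeIfHealthy = result === "+" ? 1 - specificity : specificity;
  const numerator = likeIfSick * prior;
  return numerator / (numerator + likeIfHealthy * (1 - prior));
}

// Feed each posterior back in as the next prior.
// Assumes results are conditionally independent given the disease.
function chainUpdates(
  prior: number,
  results: TestResult[],
  sensitivity: number,
  specificity: number
): number[] {
  const trail = [prior];
  let belief = prior;
  results.forEach((result) => {
    belief = updateOnResult(belief, result, sensitivity, specificity);
    trail.push(belief);
  });
  return trail;
}

const beliefTrail = chainUpdates(0.01, ["+", "+"], 0.90, 0.90);
console.log(beliefTrail.map((b) => (100 * b).toFixed(1) + "%").join(" -> "));
// prints 1.0% -> 8.3% -> 45.0%
```

Carrying the unrounded posterior 1/12 forward, rather than 0.083, gives exactly 0.45 after the second positive.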
Making it tractable, and where the numbers come from
A full joint distribution over many variables is astronomically large — with
n true/false variables it has 2^n entries. What
rescues us is independence: if two variables don't influence each other,
P(A, B) = P(A)\,P(B), and the table factors into small pieces. Even better
is conditional independence — two symptoms that are independent once you know
the disease — which lets an agent store only a handful of local probabilities instead of the
whole monster. That factoring is the idea behind Bayesian networks, the workhorse
of probabilistic AI.
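The saving can be made concrete by counting. The sketch below assumes one disease variable with n binary symptoms that are conditionally independent given the disease; the `FactoredModel` shape, the function names, and the two-symptom numbers are all hypothetical.

```typescript
// Entries in a full joint table over one disease variable plus n binary symptoms.
function fullJointEntries(symptomCount: number): number {
  return Math.pow(2, symptomCount + 1);
}

// Numbers stored by the factored model: P(D), plus P(S_i | D) and
// P(S_i | not D) for each symptom.
function factoredNumbers(symptomCount: number): number {
  return 1 + 2 * symptomCount;
}

[2, 10, 30].forEach((n) => {
  console.log(n + " symptoms:", fullJointEntries(n), "vs", factoredNumbers(n));
});

interface FactoredModel {
  prior: number;       // P(D)
  ifSick: number[];    // P(S_i = true | D)
  ifHealthy: number[]; // P(S_i = true | not D)
}

// Any joint entry is rebuilt on demand:
// P(D, S_1..S_n) = P(D) * product over i of P(S_i | D).
function jointEntry(model: FactoredModel, disease: boolean, symptoms: boolean[]): number {
  let p = disease ? model.prior : 1 - model.prior;
  symptoms.forEach((present, i) => {
    const pPresent = disease ? model.ifSick[i] : model.ifHealthy[i];
    p *= present ? pPresent : 1 - pPresent;
  });
  return p;
}

// Invented two-symptom model, for illustration only.
const toyModel: FactoredModel = { prior: 0.01, ifSick: [0.9, 0.7], ifHealthy: [0.1, 0.2] };
console.log(jointEntry(toyModel, true, [true, true])); // P(D, S_1, S_2)
```

For 30 symptoms the full table has 2^31 entries while the factored model stores 61 numbers, yet every joint entry can still be recovered by multiplication.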
One question remains: where do all these priors and likelihoods actually come from? Sometimes from an
expert's judgement, but far more often we learn them from data — counting how often
the disease and the positive tests co-occur across thousands of records, and letting those frequencies
become our probabilities. That is the bridge from reasoning under uncertainty to
machine learning:
Bayes' rule tells the agent how to use probabilities, and learning tells it how to
get them.
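A minimal sketch of "letting frequencies become probabilities" follows. The records are synthetic, built here to mirror the worked example (1,000 patients, 10 of them sick); the `PatientRecord` shape and the helper names are hypothetical, and real data would need care with small counts.

```typescript
interface PatientRecord {
  hasDisease: boolean;
  testedPositive: boolean;
}

// Build `count` identical synthetic records.
function makeRecords(count: number, hasDisease: boolean, testedPositive: boolean): PatientRecord[] {
  const out: PatientRecord[] = [];
  for (let i = 0; i < count; i++) {
    out.push({ hasDisease, testedPositive });
  }
  return out;
}

// Synthetic data mirroring the worked example: 9 true positives,
// 1 false negative, 99 false positives, 891 true negatives.
const patientRecords = makeRecords(9, true, true)
  .concat(makeRecords(1, true, false))
  .concat(makeRecords(99, false, true))
  .concat(makeRecords(891, false, false));

// Estimate the prior and likelihoods as plain relative frequencies.
function estimateFromCounts(records: PatientRecord[]) {
  const sick = records.filter((rec) => rec.hasDisease);
  const healthy = records.filter((rec) => !rec.hasDisease);
  if (sick.length === 0 || healthy.length === 0) {
    throw new Error("need at least one sick and one healthy record");
  }
  const prior = sick.length / records.length;
  const sensitivity = sick.filter((rec) => rec.testedPositive).length / sick.length;
  const specificity = healthy.filter((rec) => !rec.testedPositive).length / healthy.length;
  return { prior, sensitivity, specificity };
}

const learned = estimateFromCounts(patientRecords);
console.log(learned);

// The posterior can also be read straight off the counts:
// among everyone who tested positive, what fraction is sick?
const positives = patientRecords.filter((rec) => rec.testedPositive);
const sickAmongPositives = positives.filter((rec) => rec.hasDisease).length / positives.length;
console.log("P(disease | positive) =", sickAmongPositives.toFixed(3)); // prints 0.083
```

Counting gives 9 sick among 108 positives, the same 8.3% that Bayes' rule produced from the prior and likelihoods.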