Bayes' Theorem
Bayes' theorem is the rule for updating a belief when new evidence arrives. It
rewrites a conditional probability by swapping what is given for what is unknown:
P(H \mid D) = \frac{P(D \mid H)\,P(H)}{P(D)}.
Read it as a flow of belief about a hypothesis H after seeing data D:
- P(H) — the prior: what you believed before.
- P(D \mid H) — the likelihood: how well H predicts the data.
- P(H \mid D) — the posterior: your updated belief.
- P(D) — the evidence: a normalizing constant.
In words: posterior ∝ likelihood × prior. This one proportionality is the
backbone of all the Bayesian reasoning ahead.
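The proportionality can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `posterior` and the coin-bias scenario (three candidate biases, one observed head, a uniform prior) are assumptions made up for the example.

```python
def posterior(priors, likelihoods):
    """Return P(H | D) for each hypothesis, given P(H) and P(D | H)."""
    # posterior is proportional to likelihood x prior
    unnormalized = [p * l for p, l in zip(priors, likelihoods)]
    # The evidence P(D) is the sum of the unnormalized terms.
    evidence = sum(unnormalized)
    if evidence == 0:
        raise ValueError("the data has zero probability under every hypothesis")
    return [u / evidence for u in unnormalized]


# Hypothetical setup: a coin has bias 0.25, 0.5, or 0.75, all equally likely a priori.
biases = [0.25, 0.5, 0.75]
priors = [1 / 3, 1 / 3, 1 / 3]

# We observe one head, so the likelihood P(heads | bias) is the bias itself.
likelihoods = biases

post = posterior(priors, likelihoods)
print([round(p, 3) for p in post])  # roughly [0.167, 0.333, 0.5]
```

Dividing by the evidence is the only step that turns the products into probabilities; the ranking of the hypotheses is already fixed by likelihood × prior.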
The base-rate surprise
For a yes/no hypothesis, expanding the evidence with the law of total probability gives
P(H \mid D) = \frac{P(D\mid H)\,P(H)}{P(D\mid H)\,P(H) + P(D\mid \lnot H)\,P(\lnot H)}.
The famous lesson: a very accurate test for a very rare condition can still produce mostly
false alarms among its positive results, because the tiny prior drags the posterior down. The
prior is not optional: it is half the answer.
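A worked example makes the effect concrete. The numbers below are illustrative assumptions, not taken from any real test: a prior of 0.1%, a sensitivity of 99%, and a false-positive rate of 5%. The function name `posterior_binary` is likewise hypothetical.

```python
def posterior_binary(prior, sensitivity, false_positive_rate):
    """P(H | D) for a yes/no hypothesis H and a positive result D."""
    true_positive = sensitivity * prior                   # P(D | H) P(H)
    false_positive = false_positive_rate * (1.0 - prior)  # P(D | not H) P(not H)
    evidence = true_positive + false_positive             # P(D), by total probability
    if evidence == 0:
        raise ValueError("a positive result has zero probability")
    return true_positive / evidence


# Assumed values, chosen only to illustrate the base-rate effect.
p = posterior_binary(prior=0.001, sensitivity=0.99, false_positive_rate=0.05)
print(f"{p:.4f}")  # 0.0194
```

Under these assumptions, fewer than 2 in 100 positive results are true positives, even though the test is right 99% of the time when the condition is present.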
How the prior steers the posterior
The curve plots the posterior P(H\mid D) as a function of the prior
P(H), for a test with sensitivity P(D\mid H)
and false-positive rate P(D\mid\lnot H). The faint diagonal marks "no
update", where the posterior equals the prior. For any chosen test accuracy, the surprise
shows at the left edge: when the prior is small, even a strong positive leaves the posterior low.
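The same curve can be tabulated without a plot. The sketch below assumes a sensitivity of 0.9 and a false-positive rate of 0.1 purely for illustration, and the helper name `posterior_given_positive` is hypothetical.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """Posterior P(H | D) after a positive result, as a function of the prior P(H)."""
    numerator = sensitivity * prior
    denominator = numerator + false_positive_rate * (1.0 - prior)
    return numerator / denominator


# Assumed test characteristics (illustrative only).
sensitivity = 0.9
false_positive_rate = 0.1

priors = [0.001, 0.01, 0.1, 0.5, 0.9]
curve = [posterior_given_positive(p, sensitivity, false_positive_rate) for p in priors]

# Each row is one point on the curve: prior on the left, posterior on the right.
for prior, post in zip(priors, curve):
    print(f"prior = {prior:<6} posterior = {post:.4f}")
```

When the sensitivity equals the false-positive rate, the function returns the prior unchanged, which is the "no update" diagonal.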
- P(H\mid D) = \dfrac{P(D\mid H)\,P(H)}{P(D)} — posterior ∝ likelihood × prior.
- The evidence P(D) = \sum_H P(D\mid H)P(H) just normalizes the posterior to sum to 1.
- A rare hypothesis (small prior) stays improbable unless the likelihood is overwhelming.