Bayes' Theorem

Bayes' theorem is the rule for updating a belief when new evidence arrives. It inverts a conditional probability, expressing the probability of a hypothesis given the data in terms of the probability of the data given the hypothesis:

P(H \mid D) = \frac{P(D \mid H)\,P(H)}{P(D)}.

Read it as a flow of belief about a hypothesis H after seeing data D:

P(H) — the prior: what you believed before.
P(D \mid H) — the likelihood: how well H predicts the data.
P(H \mid D) — the posterior: your updated belief.
P(D) — the evidence: a normalizing constant.

In words: posterior ∝ likelihood × prior. The proportionality holds because P(D) does not depend on H, so it only rescales the result. This one proportionality is the backbone of all the Bayesian reasoning ahead.
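As a minimal sketch of the formula, the function below applies Bayes' theorem directly when the evidence P(D) is already known. The function name `posterior` and the input numbers are illustrative assumptions, not values from the text.

```python
def posterior(prior: float, likelihood: float, evidence: float) -> float:
    """Bayes' theorem: P(H|D) = P(D|H) * P(H) / P(D)."""
    if evidence <= 0.0:
        # The data must have nonzero probability for the update to be defined.
        raise ValueError("P(D) must be positive")
    return likelihood * prior / evidence


# Hypothetical inputs: P(H) = 0.3, P(D|H) = 0.8, P(D) = 0.5.
p = posterior(prior=0.3, likelihood=0.8, evidence=0.5)
print(f"P(H|D) = {p:.2f}")
```

Here the belief in H rises from 0.30 to 0.48, because H predicts the data better (0.8) than the data's overall probability (0.5).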

The base-rate surprise

For a yes/no hypothesis, expanding the evidence P(D) with the law of total probability gives

P(H \mid D) = \frac{P(D\mid H)\,P(H)}{P(D\mid H)\,P(H) + P(D\mid \lnot H)\,P(\lnot H)}.

The famous lesson: a very accurate test for a very rare condition can still produce mostly false alarms, because the tiny prior P(H) lets the false-positive term P(D\mid\lnot H)\,P(\lnot H) dominate the denominator and drag the posterior down. The prior is not optional: it is half the answer.
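To make the base-rate effect concrete, here is a small sketch that evaluates the expanded formula for a yes/no hypothesis. The prevalence, sensitivity, and false-positive rate are hypothetical numbers chosen for illustration, and `posterior_binary` is an assumed name.

```python
def posterior_binary(prior: float, sensitivity: float,
                     false_positive_rate: float) -> float:
    """P(H|D) with P(D) expanded by the law of total probability."""
    numerator = sensitivity * prior                       # P(D|H) P(H)
    evidence = numerator + false_positive_rate * (1.0 - prior)  # P(D)
    if evidence == 0.0:
        raise ValueError("P(D) is zero: a positive result is impossible")
    return numerator / evidence


# Hypothetical test: 99% sensitivity, 1% false-positive rate,
# for a condition affecting 1 person in 1000.
p = posterior_binary(prior=0.001, sensitivity=0.99, false_positive_rate=0.01)
print(f"P(condition | positive) = {p:.3f}")  # about 0.090
```

With these assumed numbers, roughly 9 in 10 positive results are false alarms, even though the test is wrong only 1% of the time in either direction.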

How the prior steers the posterior

Consider the curve of the posterior P(H\mid D) as a function of the prior P(H), for a test with sensitivity P(D\mid H) and false-positive rate P(D\mid\lnot H). The diagonal P(H\mid D) = P(H) marks "no update". Whenever the sensitivity exceeds the false-positive rate, a positive result lifts the curve above the diagonal, but when the prior is small (left edge), even a strong positive leaves the posterior low.
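The curve can be tabulated with a few lines of code. This is a sketch under assumed values: a hypothetical test with 95% sensitivity and a 5% false-positive rate, evaluated on a handful of priors. The function name and the chosen priors are illustrative.

```python
def posterior_given_positive(prior: float, sensitivity: float,
                             false_positive_rate: float) -> float:
    """P(H|D) for a positive result D, as a function of the prior P(H)."""
    true_pos = sensitivity * prior                     # P(D|H) P(H)
    false_pos = false_positive_rate * (1.0 - prior)    # P(D|not H) P(not H)
    return true_pos / (true_pos + false_pos)


# Hypothetical test characteristics (assumed for illustration).
SENSITIVITY = 0.95
FALSE_POSITIVE_RATE = 0.05

priors = [0.001, 0.01, 0.1, 0.5, 0.9]
curve = [posterior_given_positive(p, SENSITIVITY, FALSE_POSITIVE_RATE)
         for p in priors]

# Print the curve as a two-column table: prior vs. posterior.
print(f"{'prior':>8} {'posterior':>10}")
for p, post in zip(priors, curve):
    print(f"{p:>8.3f} {post:>10.3f}")
```

Every posterior in the table sits above its prior, since the test is informative, yet the rows with the smallest priors stay well below one half.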

P(H\mid D) = \dfrac{P(D\mid H)\,P(H)}{P(D)} — posterior ∝ likelihood × prior.
The evidence P(D) = \sum_H P(D\mid H)\,P(H), summed over all competing hypotheses, just normalizes the posterior so that it sums to 1.
A rare hypothesis (small prior) stays improbable unless the likelihood is overwhelming.
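The normalization by the evidence extends naturally to more than two hypotheses. The sketch below multiplies likelihood by prior for each hypothesis and divides by their sum. The hypothesis labels "A", "B", "C" and all numbers are hypothetical, and `normalize_posterior` is an assumed name.

```python
def normalize_posterior(priors: dict, likelihoods: dict) -> dict:
    """Posterior over several hypotheses: likelihood * prior, then normalize."""
    unnormalized = {h: likelihoods[h] * priors[h] for h in priors}
    evidence = sum(unnormalized.values())  # P(D) = sum_H P(D|H) P(H)
    if evidence == 0.0:
        raise ValueError("P(D) is zero: no hypothesis allows the data")
    return {h: weight / evidence for h, weight in unnormalized.items()}


# Hypothetical priors (summing to 1) and likelihoods of the observed data.
priors = {"A": 0.5, "B": 0.3, "C": 0.2}
likelihoods = {"A": 0.1, "B": 0.4, "C": 0.7}

post = normalize_posterior(priors, likelihoods)
for h, p in post.items():
    print(f"P({h}|D) = {p:.3f}")
```

In this example the least likely hypothesis a priori, C, ends up the most probable, but only because its likelihood is seven times that of A.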