The Binomial Distribution
Flip a coin ten times and count the heads. Inspect twenty light bulbs and count the faulty
ones. Ask fifty people a yes/no question and count the "yes"es. In every case you repeat the
same little experiment a fixed number of times and count how often one particular result
(call it a success) happens. The binomial distribution is the distribution of that count.
We write
X \sim B(n, p)
to say that X, the number of successes, follows a binomial distribution with n trials, each
having success probability p. The symbol \sim is read "is distributed as".
A single coin flip is the cleanest example of a binomial trial. It has exactly two
outcomes (heads or tails), the probability never changes from flip to flip, and one flip tells
you nothing about the next. Call "heads" a success with p = 0.5, flip
n times, and the number of heads is
B(n, 0.5). Almost every binomial story is really a coin in disguise —
a coin that may be bent so that p \ne 0.5.
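The coin picture can be tried out with a short simulation. The sketch below is illustrative only: it assumes Python's standard random module, and the helper name count_successes is made up for this example rather than taken from the text.

```python
import random

def count_successes(n, p, rng):
    """Run n independent trials, each succeeding with probability p, and count the successes."""
    return sum(1 for _ in range(n) if rng.random() < p)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Repeat the "flip 10 fair coins and count heads" experiment many times.
samples = [count_successes(10, 0.5, rng) for _ in range(10_000)]

# Each sample is one draw from B(10, 0.5); the average should land near n * p = 5.
print(sum(samples) / len(samples))
```

Changing 0.5 to another value models the bent coin: the same code then draws from B(10, p).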
The four conditions
A count is binomial only when all four of these hold. They are worth learning by heart, because
spotting a broken one is how you decide whether the model applies:
- Fixed number of trials. You decide n in advance: ten flips, twenty bulbs. You do not keep
going until something happens.
- Two outcomes. Each trial is a success or a failure, nothing in between.
- Constant probability. The success probability p is the same on every trial.
- Independent trials. The outcome of one trial does not affect any other.
A quick memory hook: fixed n, two outcomes, same p, independent.
A packing plant knows that 3\% of its apples are bruised. An inspector
pulls a box of 20 apples at random and counts the bruised ones. Each
apple is a trial (bruised = success, sadly), the count is fixed at n = 20,
the rate is a constant p = 0.03, and one apple's state does not change
the next. So the number of bruised apples is B(20, 0.03), and the plant can work out how
often a box will contain 0, 1, or more bruised apples.
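As a rough sketch of that calculation, the snippet below evaluates the three cases for B(20, 0.03). It uses the probability formula derived in the next section; the function name pmf and the variable names are illustrative choices, not part of the original example.

```python
from math import comb

n, p = 20, 0.03  # box size and bruise rate from the apple example

def pmf(r):
    """P(X = r) for X ~ B(20, 0.03), using the binomial probability formula."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

p_none = pmf(0)                       # no bruised apples in the box
p_one = pmf(1)                        # exactly one bruised apple
p_two_or_more = 1 - p_none - p_one    # everything else

print(f"{p_none:.3f} {p_one:.3f} {p_two_or_more:.3f}")  # 0.544 0.336 0.120
```

Under these assumptions, a little over half of all boxes contain no bruised apple and roughly one box in eight contains two or more.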
Building the formula
Where does the probability of exactly r successes come from? Build it in
two pieces.
One particular sequence. Suppose n = 5 and we want
exactly r = 2 successes in the specific order
S S F F F. Because the trials are independent we multiply their probabilities:
p \cdot p \cdot (1-p)\cdot(1-p)\cdot(1-p) = p^{2}(1-p)^{3}.
Any other order with two successes — say F S F S F — has exactly the
same probability p^{2}(1-p)^{3}, because multiplication does not care
about order. So every arrangement of two S's and three F's is equally likely.
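A small check of this claim, assuming an arbitrary value p = 0.3 chosen only for illustration (the helper sequence_probability is likewise a made-up name):

```python
from math import isclose, prod

def sequence_probability(sequence, p):
    """Probability of one specific ordered sequence of 'S' and 'F' outcomes."""
    return prod(p if outcome == "S" else 1 - p for outcome in sequence)

p = 0.3  # illustrative value; any p between 0 and 1 behaves the same way

a = sequence_probability("SSFFF", p)
b = sequence_probability("FSFSF", p)

# Both orders multiply the same five factors, so they agree with p^2 (1-p)^3
# up to floating-point rounding.
print(isclose(a, b), isclose(a, p**2 * (1 - p)**3))
```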
How many arrangements? The number of ways to place r successes among n trials is the
binomial coefficient \binom{n}{r} ("n choose r"). For n = 5 and r = 2 that is
\binom{5}{2} = 10. The arrangements are mutually exclusive and each has probability
p^{r}(1-p)^{n-r}, so adding them up gives the binomial probability formula:
P(X = r) = \binom{n}{r}\, p^{\,r}\,(1-p)^{\,n-r}, \qquad r = 0, 1, 2, \ldots, n.
Read it left to right: \binom{n}{r} counts the arrangements,
p^{r} is the chance the r successes all happen,
and (1-p)^{n-r} is the chance the remaining
n-r trials all fail.
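The formula translates almost word for word into code. The following is a minimal sketch, assuming Python 3.8 or later for math.comb; binomial_pmf and brute_force_pmf are hypothetical names. The second function rebuilds the same number the slow way, by listing every sequence, as a check on the counting argument.

```python
from itertools import product
from math import comb, isclose

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p), straight from the formula."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def brute_force_pmf(r, n, p):
    """Add up the probability of every S/F sequence of length n with exactly r successes."""
    total = 0.0
    for sequence in product("SF", repeat=n):
        if sequence.count("S") == r:
            total += p**r * (1 - p)**(n - r)  # same value for every such sequence
    return total

formula = binomial_pmf(2, 5, 0.3)
enumerated = brute_force_pmf(2, 5, 0.3)
print(isclose(formula, enumerated))
```

The brute-force version visits all 2^n sequences, so it is only practical for small n; the formula does the same job with a single coefficient.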
Worked examples
Three heads in ten flips. A fair coin, n = 10,
p = 0.5. The chance of exactly 3 heads is
P(X = 3) = \binom{10}{3}(0.5)^{3}(0.5)^{7} = 120 \cdot (0.5)^{10} \approx 0.117.
Two bruised apples in a box. With B(20, 0.03),
P(X = 2) = \binom{20}{2}(0.03)^{2}(0.97)^{18} \approx 190 \cdot 0.0009 \cdot 0.578 \approx 0.099.
A small hand check. With n = 5 and
p = 0.3, the chance of exactly 2 successes is
P(X = 2) = \binom{5}{2}(0.3)^{2}(0.7)^{3} = 10 \cdot 0.09 \cdot 0.343 = 0.3087.
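The three results above can be double-checked with exact fractions, which avoids any rounding in the intermediate steps. This is a sketch using Python's standard fractions module; the function name binomial_pmf_exact is an illustrative choice.

```python
from fractions import Fraction
from math import comb

def binomial_pmf_exact(r, n, p):
    """P(X = r) for X ~ B(n, p), computed with exact rational arithmetic."""
    p = Fraction(p)
    return comb(n, r) * p**r * (1 - p)**(n - r)

heads = binomial_pmf_exact(3, 10, Fraction(1, 2))     # three heads in ten flips
apples = binomial_pmf_exact(2, 20, Fraction(3, 100))  # two bruised apples in a box
hand = binomial_pmf_exact(2, 5, Fraction(3, 10))      # the small hand check

print(heads, float(heads))      # 15/128 0.1171875
print(round(float(apples), 3))  # 0.099
print(hand)                     # 3087/10000
```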
See it: the shape of the distribution
The whole distribution is a bar chart: one bar for each possible count r = 0, 1, \ldots, n,
whose height is P(X = r). Because X always takes exactly one of these values, the bar
heights add up to 1.
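A plain-text version of such a chart can be sketched in a few lines. The example below assumes n = 10 and p = 0.5, values picked only for illustration, and the function name binomial_distribution is hypothetical.

```python
from math import comb, isclose

def binomial_distribution(n, p):
    """Return the list of bar heights [P(X = 0), P(X = 1), ..., P(X = n)]."""
    return [comb(n, r) * p**r * (1 - p)**(n - r) for r in range(n + 1)]

bars = binomial_distribution(10, 0.5)

# One row per possible count: the count, its probability, and a bar of '#' marks.
for r, height in enumerate(bars):
    print(f"{r:2d} {height:.4f} {'#' * round(height * 100)}")

# The heights add up to 1, apart from floating-point rounding.
print(isclose(sum(bars), 1.0))
```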
Watch how the shape changes with p: when p = 0.5 the bars are symmetric; when p is small
the peak slides left (few successes are likely), and when p is large it slides right. The
tallest bar sits near np, the average number of successes.
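These shape claims can be checked numerically. The sketch below assumes n = 20 and three illustrative values of p; tallest_bar is a made-up helper that returns the count with the largest probability.

```python
from math import comb, isclose

def binomial_distribution(n, p):
    """Return the list of bar heights [P(X = 0), ..., P(X = n)]."""
    return [comb(n, r) * p**r * (1 - p)**(n - r) for r in range(n + 1)]

def tallest_bar(n, p):
    """Return the count r whose bar P(X = r) is the tallest."""
    bars = binomial_distribution(n, p)
    return max(range(n + 1), key=lambda r: bars[r])

n = 20
for p in (0.1, 0.5, 0.9):
    # Print p, the position of the tallest bar, and n * p for comparison.
    print(p, tallest_bar(n, p), n * p)

# With p = 0.5 the bar for r matches the bar for n - r.
fair = binomial_distribution(n, 0.5)
symmetric = all(isclose(fair[r], fair[n - r]) for r in range(n + 1))
print(symmetric)
```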
- Check all four conditions before reaching for B(n, p). The two that trip people up: the
trials must be independent, and p must be the same every trial. Drawing cards without
replacing them fails here, because every card removed changes the chances for the next
draw, so that count is not binomial.
- Do not forget the \binom{n}{r} in front. The term
p^{r}(1-p)^{n-r} is only one sequence; you must multiply by
the number of arrangements.
- n is fixed in advance. "Flip until the first head" gives a
different (geometric) distribution, not a binomial one.
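To see how much the card pitfall can matter, the sketch below compares two answers to an illustrative question that is not from the text: the chance of exactly 2 aces among 5 cards from a standard 52-card deck. The first value counts hands directly (no replacement); the second is what B(5, 4/52) would give, which is only valid if each card were returned and the deck reshuffled before the next draw.

```python
from math import comb

# Exactly 2 aces among 5 cards drawn WITHOUT replacement, by direct counting:
# choose 2 of the 4 aces and 3 of the 48 other cards, out of all 5-card hands.
without_replacement = comb(4, 2) * comb(48, 3) / comb(52, 5)

# What the binomial model B(5, 4/52) would claim for the same event.
p = 4 / 52
binomial_value = comb(5, 2) * p**2 * (1 - p)**3

print(f"{without_replacement:.4f} {binomial_value:.4f}")  # 0.0399 0.0465
```

The two numbers differ noticeably, which is the practical cost of applying the binomial formula when the independence condition fails.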