Microstates, Macrostates and Entropy
Drop a single drop of ink into a glass of still water and walk away. Come back an hour later and the
ink has spread evenly through the whole glass — a uniform pale grey. You have never seen the
reverse: a grey glass spontaneously gathering its ink back into one sharp drop. The molecules would be
perfectly happy to do it (the laws governing each collision run equally well forwards and backwards),
and no law of force forbids it. Yet it never happens. Why?
The answer is not a new force. It is counting. There are almost unimaginably more
ways for the ink to be spread out than to be gathered up, so a randomly wandering system spends
essentially all of its time spread out. That single idea — that the macroscopic behaviour of matter is
decided by how many microscopic arrangements look the same to us — is the engine of the whole
of statistical
mechanics. This page builds it from scratch: the distinction between a
macrostate and its many microstates, the count
\Omega of those microstates, and the one short equation that turns a
headcount into a thermodynamic quantity — Boltzmann's entropy,
S = k_B \ln \Omega.
Four coins: the whole idea in miniature
Forget 10^{23} molecules for a moment and flip four coins. Suppose the only
thing you bother to record is how many came up heads — call that number the
macrostate. You do not write down which coins; you only care about the total.
The full record of every individual coin (H, T, T, H, …) is a microstate.
There are 2^4 = 16 equally likely microstates in all. Group them by
macrostate (number of heads) and count:
\begin{array}{c|c|c}
\text{macrostate (heads)} & \text{microstates} & \Omega \\ \hline
0 & \text{TTTT} & 1 \\
1 & \text{HTTT, THTT, TTHT, TTTH} & 4 \\
2 & \text{HHTT, HTHT, HTTH, THHT, THTH, TTHH} & 6 \\
3 & \text{HHHT, HHTH, HTHH, THHH} & 4 \\
4 & \text{HHHH} & 1
\end{array}
The counts 1,4,6,4,1 are exactly the binomial coefficients
\binom{4}{k}, and they add to 16. Now read the
table as physics. Each microstate is equally likely (that is the only assumption we make), so the
probability of a macrostate is just its share of the microstates. The "two heads"
macrostate has 6/16 of them; "all heads" has a lonely
1/16. The balanced, boring, middle macrostate wins simply because there are
more ways to build it. Blow the four coins up to 10^{23} gas molecules
choosing the left or right half of a box, and "lonely" becomes "never in the lifetime of the universe".
Macrostate, microstate, multiplicity
Let us pin the three words down, because getting them straight is nine-tenths of the subject.
-
Microstate. A complete specification of the system at the microscopic
level — every particle's position and velocity (classically), or the full quantum state. It is the
most detailed description possible.
-
Macrostate. A description in terms of a handful of macroscopic variables
— total energy E, volume V, particle number
N, temperature, pressure, or "how many molecules are in the left half".
Enormously many microstates share the same macrostate.
-
Multiplicity \Omega. The number of
microstates compatible with a given macrostate. It is a pure count — a (usually gigantic) positive
integer.
And one assumption ties them together — the load-bearing postulate of the whole theory:
-
For an isolated system in equilibrium, every accessible microstate is equally
probable.
-
Consequently the probability of observing a macrostate is proportional to its multiplicity:
P \propto \Omega. The macrostate we call "equilibrium" is simply the
one with the largest \Omega.
Notice what this does not say. It does not say the system prefers equilibrium, or is pulled
toward it, or minimises anything. It says all microstates are on an equal footing, and equilibrium
wins by sheer weight of numbers. Everything else — the arrow of time, the second law, why heat flows
hot-to-cold — is downstream of that one democratic assumption.
Counting a gas: particles in two halves of a box
Replace the coins with a gas. Imagine a box split, in our imagination only, into a left half and a
right half. Each of the N molecules is, at any instant, in one half or the
other — that is our stand-in for "heads or tails". The macrostate is how many molecules,
n, are in the left half. The number of microstates
realising it is the number of ways to choose which n of the
N molecules sit on the left — the binomial coefficient:
\Omega(n) = \binom{N}{n} = \frac{N!}{n!\,(N-n)!}.
Worked count, N = 4. The all-left macrostate
n=4 has \Omega = \binom{4}{4} = 1 — there is
exactly one way for every molecule to be on the left. The even split
n=2 has \Omega = \binom{4}{2} = \tfrac{4!}{2!\,2!} = 6.
So the balanced macrostate is already 6 times more likely than the
completely-lopsided one, with only four molecules.
Now let N get large. The even split
n = N/2 always carries the biggest \Omega, and
its lead grows explosively. For N = 100 the ratio of the even-split
multiplicity to the all-on-one-side multiplicity is
\binom{100}{50} \approx 10^{29} to 1. For a
real, mole-sized N \approx 6\times10^{23} the odds against finding even a
one-part-in-a-billion excess on one side are so steep that you would wait many times the age of the
universe to see it once. This is why a gas fills its container uniformly: not because it is forced to,
but because "uniform" is where almost all of the microstates live.
The distribution of \Omega(n) across macrostates is a
binomial peak. Slide N up in the graph below (it plots the
multiplicity relative to its maximum, \Omega(n)/\Omega_{\max}, against the
fraction n/N) and watch the bump around the even split grow into a
needle-sharp spike.
The width of that peak scales like \sqrt{N}, but the position
scale is N itself, so the relative fluctuation of the
split away from 50–50 shrinks as
\frac{\Delta n}{N} \sim \frac{1}{\sqrt{N}}.
For N = 10^{23} that relative wobble is about
10^{-11.5} — utterly unmeasurable. The macroscopic world looks sharp and
deterministic precisely because N is so absurdly large.
From a headcount to entropy: Boltzmann's formula
Multiplicities are monstrous numbers — a mole of gas has an \Omega with
something like 10^{24} digits. Worse, they behave badly when you
combine systems. Put system A (with
\Omega_A microstates) beside an independent system
B (with \Omega_B). Each microstate of
A can pair with each of B, so the combined
multiplicity multiplies:
\Omega_{AB} = \Omega_A \,\Omega_B.
But every macroscopic property we know — energy, volume, mass — adds when you put two
systems together; such quantities are called extensive. We would love our new state-variable
to add too. There is exactly one function that turns multiplication into addition — the
logarithm. That is the whole reason for the \ln in
Boltzmann's definition of entropy:
-
The entropy of a macrostate of multiplicity \Omega is
S = k_B \ln \Omega,
where k_B \approx 1.38 \times 10^{-23}\ \text{J/K} is
Boltzmann's constant.
-
It is additive. Because
\ln(\Omega_A \Omega_B) = \ln \Omega_A + \ln \Omega_B, combining systems
gives S_{AB} = S_A + S_B — entropy is extensive, just like energy.
-
Units of J/K. The constant k_B carries the units and
sets the scale that makes this microscopic S agree, to the last digit,
with the entropy Clausius defined from heat and temperature,
dS = \delta Q_{\text{rev}} / T.
The equation S = k_B \ln \Omega is carved on Boltzmann's gravestone in
Vienna (in the form S = k \log W). It is arguably the single most important
sentence in all of thermal physics: it connects the microscopic world of countable arrangements to
the macroscopic world of thermometers and heat engines.
A concrete number. Take a tiny system of
N = 100 two-state particles, all microstates accessible, so
\Omega = 2^{100}. Then
S = k_B \ln\!\left(2^{100}\right) = 100\,k_B \ln 2 \approx 100 \times (1.38\times10^{-23}) \times 0.693
\approx 9.6\times10^{-22}\ \text{J/K}.
Tiny — because 100 particles is tiny. Scale N up
to Avogadro's number and S climbs to the everyday joules-per-kelvin you
meet in a thermodynamics table. Note the clean pattern: for
N two-state particles, S = N k_B \ln 2 — the
entropy is simply \ln 2 per particle (one "bit" each), times
k_B.
The second law, as a statement about counting
Now the payoff. Release a gas that starts crammed into the left half of a box (a low-multiplicity, and
therefore low-entropy, macrostate). Molecules bounce around at random. At every instant the system is
equally likely to be in any accessible microstate — and there are vastly more microstates
with the gas spread through the whole box than clustered on one side. So overwhelmingly the system is
found spread out, and it stays spread out. The macrostate drifts toward larger
\Omega, hence larger S
— not because anything pushes it, but because that is where almost all the possibilities are.
-
An isolated system, left to itself, evolves toward the macrostate of maximum
multiplicity, so its entropy S = k_B \ln \Omega tends to
increase: \Delta S \ge 0.
-
This is a statistical law, not an absolute mechanical one. A decrease is not
forbidden — it is merely so improbable, for macroscopic N, that it never
happens in practice.
How improbable? For a gas of N molecules to fluctuate all the way back into
one half of the box, the odds are 1 in 2^{N}.
For a mere N = 100 that is already 1 in
10^{30}; for a real gas it is 1 in
2^{(10^{23})}, a number so large that writing its digits would fill more
paper than there are atoms in the galaxy. The system could reverse — the
Poincaré recurrence theorem even guarantees it eventually returns arbitrarily close
to its start — but the expected waiting time dwarfs the age of the universe by factors that make the
word "astronomical" look shy. The second law is not a certainty carved into the microphysics; it is a
near-certainty carved into the arithmetic of large numbers.
Two connected traps, both worth naming out loud.
"Entropy is disorder" is a slogan, not a definition. It is a decent first-day
intuition, but it leaks. "Disorder" is a fuzzy, subjective word — a "messy" room can have low entropy
and a "tidy" one high entropy, depending on how many microscopic arrangements you are actually
blurring together. Salt crystals precipitating out of a cooling solution look more ordered
yet the total entropy of crystal-plus-surroundings still rises. The honest definition is not about
tidiness at all: entropy counts microstates, S = k_B \ln \Omega.
Whenever "disorder" and "microstate-counting" disagree, trust the counting.
The second law is statistical, and it is about the whole isolated
system. Microscopically, nothing forbids entropy from dropping — the equations of motion are
time-reversible, and small systems fluctuate down all the time (this is real and measurable). What the
second law asserts is that for a macroscopic isolated system the total entropy overwhelmingly
increases. And "total" matters: your fridge lowers the entropy of the food inside, but only by dumping
even more entropy into the kitchen. Look at just the food and you will wrongly think the law is
broken; look at the whole isolated system and it holds every time.
In 1867 James Clerk Maxwell imagined a microscopic being — later christened Maxwell's
demon — sitting at a trapdoor between two halves of a gas. It lets fast molecules through one
way and slow ones the other, sorting hot from cold with (apparently) no work, which would lower the
entropy of the gas and violate the second law. For decades it looked like a genuine loophole.
The resolution took until the twentieth century and is delicious: information is
physical. To sort the molecules the demon must measure which are fast and slow and
remember its decisions — and its memory is finite. Landauer and Bennett showed that
erasing one bit of that memory, to reset for the next molecule, unavoidably dumps at
least k_B T \ln 2 of heat — and hence
k_B \ln 2 of entropy — into the surroundings. That book-keeping cost of
information is exactly enough to pay back everything the demon saved. The second law survives, and in
rescuing it we discover that a bit of information and k_B \ln 2 of entropy
are two faces of the same coin — the same \ln 2 per two-state particle we
met in Boltzmann's formula.