Measures and Measure Spaces

A σ-algebra answers only half the question. It tells us which subsets of a space X we are allowed to measure — a family \mathcal{F} closed under complements and countable unions, rich enough for analysis, small enough to dodge the non-measurable monsters. But a list of "measurable sets" says nothing about how much each one contains. Naming the sets you may weigh is not the same as owning a scale.

This page installs the scale. A measure \mu is the map that assigns to every set in \mathcal{F} a size in [0, \infty] — a length, an area, a probability, a count — subject to one decisive rule: the size of a disjoint countable pile is the sum of the sizes of its pieces. A σ-algebra and a measure on it, riding on a set X, form a measure space (X, \mathcal{F}, \mu) — the single object on which the whole of integration, and (when \mu(X) = 1) all of probability, is built.

The definition

Fix a measurable space (X, \mathcal{F}). A measure on it is a function

\mu : \mathcal{F} \longrightarrow [0, \infty]

(the value +\infty is genuinely allowed — the whole line has infinite length) satisfying two axioms:

The triple (X, \mathcal{F}, \mu) is a measure space. Everything else on this page — every property you would want a "size" to have — is squeezed out of these two lines.

The word that carries all the weight is countable. Axiom (M2) is not merely finite additivity (split a set in two, the sizes add). It promises the sum survives an infinite sequence of disjoint pieces — and that is exactly the promise that lets sizes commute with limits, the property Riemann's theory so badly lacked. A single disjoint union A \sqcup B is the easy special case; the theory lives in the tail.

A measure on (X, \mathcal{F}) is a set function \mu : \mathcal{F} \to [0, \infty] such that:

The triple (X, \mathcal{F}, \mu) is called a measure space.

Everything follows from two axioms

The axioms look thin. Watch how much they carry — each property below is forced by (M1) and (M2) alone, with a one-line proof.

Finite additivity

Take disjoint A_1, \dots, A_k \in \mathcal{F} and pad the sequence with empty sets: A_{k+1} = A_{k+2} = \dots = \varnothing. These are still pairwise disjoint, so (M2) applies, and by (M1) every padded term contributes 0:

\mu(A_1 \cup \dots \cup A_k) = \sum_{n=1}^{\infty} \mu(A_n) = \sum_{n=1}^{k} \mu(A_n).

So finite additivity is a corollary — which is why we bothered to demand the countable version: the strong axiom hands us the weak one for free, but not conversely.

Monotonicity

Suppose A \subseteq B, both in \mathcal{F}. Split B into the disjoint pieces A and B \setminus A (both measurable — a σ-algebra is closed under differences). By finite additivity,

\mu(B) = \mu(A) + \mu(B \setminus A) \;\ge\; \mu(A),

because \mu(B \setminus A) \ge 0. Bigger set, bigger (or equal) measure — no set can shrink by gaining points. As a bonus, when \mu(A) < \infty we may subtract to get \mu(B \setminus A) = \mu(B) - \mu(A).

Countable subadditivity

Now drop the disjointness. For any A_1, A_2, \dots \in \mathcal{F} (overlapping freely),

\mu\!\left(\bigcup_{n} A_n\right) \;\le\; \sum_{n} \mu(A_n).

The trick is disjointification: set B_1 = A_1 and B_n = A_n \setminus (A_1 \cup \dots \cup A_{n-1}). The B_n are disjoint, each B_n \subseteq A_n, and \bigcup B_n = \bigcup A_n. So by (M2) then monotonicity,

\mu\!\left(\bigcup_n A_n\right) = \sum_n \mu(B_n) \le \sum_n \mu(A_n).

Equality when the sets are disjoint (that is (M2)); a strict deficit whenever they overlap — you paid for the overlap twice on the right.

Continuity from below

A measure respects increasing limits. If A_1 \subseteq A_2 \subseteq A_3 \subseteq \dots with union A = \bigcup_n A_n (written A_n \uparrow A), then

\mu(A_n) \;\uparrow\; \mu(A) \qquad \text{i.e.} \qquad \lim_{n\to\infty}\mu(A_n) = \mu\!\left(\bigcup_n A_n\right).

Peel the tower into disjoint rings C_1 = A_1, C_n = A_n \setminus A_{n-1}. Then A_k = \bigcup_{n \le k} C_n and A = \bigcup_n C_n, so by (M2) \mu(A) = \sum_n \mu(C_n) = \lim_k \sum_{n\le k}\mu(C_n) = \lim_k \mu(A_k). Countable additivity is precisely this continuity — the two are equivalent given finite additivity.

Continuity from above (with a catch)

The decreasing version holds too — but only if you start from finite measure. If A_1 \supseteq A_2 \supseteq \dots with A_n \downarrow A = \bigcap_n A_n, and \mu(A_1) < \infty, then

\mu(A_n) \;\downarrow\; \mu(A).

Apply continuity from below to the growing complements A_1 \setminus A_n \uparrow A_1 \setminus A, and use \mu(A_1 \setminus A_n) = \mu(A_1) - \mu(A_n) (legal because \mu(A_1) < \infty). The finiteness is not a technicality — the next box shows the theorem is false without it.

The measures you already know (and some you don't)

Abstract axioms deserve concrete inhabitants. Every one of these is a bona fide measure on some (X, \mathcal{F}).

A measure with \mu(X) = 1 is a probability measure: the total mass is exactly one unit, so \mu(A) reads as "the probability of A". Every axiom of probability — that the whole sample space has probability 1, that disjoint events add — is just a measure with total mass 1. Probability theory is measure theory with a normalization.

Null sets, "almost everywhere", and completeness

Sets of measure zero are the dust the theory learns to ignore. A set N \in \mathcal{F} is a null set (or μ-null) if \mu(N) = 0. Under Lebesgue measure the rationals, every finite or countable set, and even the uncountable Cantor set are null — negligible for the purposes of integration.

A property is said to hold almost everywhere (abbreviated a.e., or almost surely in probability) if the set of points where it fails is null. Dirichlet's function is 0 a.e. because it is non-zero only on \mathbb{Q} — a null set — which is precisely why its Lebesgue integral is 0. "Almost everywhere" is the phrase that lets us change a function on a null set without changing a single integral.

One subtlety earns a name. A measure is complete if every subset of a null set is itself measurable (and hence null): whenever \mu(N) = 0 and M \subseteq N, we insist M \in \mathcal{F}. This seems obvious — a subset of something negligible should be negligible — but a raw σ-algebra need not contain all such M. The Borel sets under Lebesgue measure are incomplete; one completes them by adjoining every subset of every Borel null set, giving the (complete) Lebesgue σ-algebra. Completeness removes annoying "is this scrap even measurable?" caveats from later theorems.

See it: the parts add up to the whole

Countable additivity is a statement about an infinite disjoint pile, so let us watch one converge. Split [0, 1) into disjoint intervals

I_1 = \left[0, \tfrac12\right),\quad I_2 = \left[\tfrac12, \tfrac34\right),\quad I_3 = \left[\tfrac34, \tfrac78\right),\ \dots,\quad I_n = \left[1 - \tfrac{1}{2^{n-1}},\, 1 - \tfrac{1}{2^{n}}\right),

so I_n has Lebesgue measure \lambda(I_n) = 2^{-n}. The pieces are pairwise disjoint and their union is all of [0,1), so (M2) demands \sum_n 2^{-n} = 1. Drag the slider to lay down more pieces and watch the running total \sum_{k=1}^{n} \lambda(I_k) = 1 - 2^{-n} creep up to the measure of the whole interval — never overshooting, always converging.

This is also continuity from below in miniature: the partial unions A_n = I_1 \cup \dots \cup I_n = [0, 1 - 2^{-n}) increase up to [0,1), and their measures 1 - 2^{-n} increase up to 1. Additivity and continuity are the same fact wearing two hats.

The prerequisite page's four-wish list already showed we cannot have everything. Here the pressure lands squarely on the word countable. If we asked only for finite additivity, strange creatures called finitely-additive measures exist — one can, using the axiom of choice, build a finitely-additive, translation-invariant "measure" defined on every subset of [0,1) with total mass 1. It dodges Vitali's contradiction precisely because it refuses to sum over the countably many rational translates. The same loophole, in three dimensions, is the engine of the Banach–Tarski paradox: a solid ball cut into finitely many (non-measurable) pieces and reassembled into two balls of the same size — finite additivity alone cannot forbid it.

Countable additivity is the extra strength that outlaws these pathologies and, far more importantly, is exactly the hypothesis that makes measure continuous along increasing and decreasing sequences. Without continuity along limits there is no Monotone Convergence Theorem, no Dominated Convergence, no interchange of limit and integral — no Lebesgue theory at all. We pay by restricting to a σ-algebra rather than all subsets; we are repaid with limits that behave.

Four traps waiting the moment you start computing with measures: