The Cantor Set
Here is a shape that ought not to exist. Take the segment [0,1], and
keep throwing away its middle — the middle third, then the middle third of each survivor, forever.
Whatever is left when the dust settles is the Cantor set, and it is a monster of
the friendly kind. It contains so many points that it is
uncountable — just as crowded as
the whole interval you started with — and yet it is so thin that its total length is
zero. Uncountably many points, occupying no length at all. That sentence looks like
a contradiction the first ten times you read it; by the end of this page it will look like a theorem.
Georg Cantor cooked this up in 1883 not as a curiosity but as a stress test — a single object that
breaks nearly every cosy intuition about size, dimension, and "how much" of the line a set can be.
It is the simplest fractal, the cleanest example of
measure zero,
and a set you can draw in ten seconds. Let us build it.
The middle-thirds construction
Start with C_0 = [0,1]. To get the next stage, delete the
open middle third — the interval (\tfrac13, \tfrac23) —
leaving two closed pieces:
C_1 = \left[0, \tfrac13\right] \cup \left[\tfrac23, 1\right].
Now do the same to each of those two pieces, then to each of the four that results, and so
on without end. The Cantor set is what remains after infinitely many deletions — the intersection of
every stage:
\mathcal{C} = \bigcap_{n=0}^{\infty} C_n.
Picture the stages stacked in rows, one row per C_n:
every bar splits into two bars a third the width, with a gap punched out of the centre.
Notice the picture never empties. The endpoints of every surviving bar (0,
1, \tfrac13, \tfrac23,
\tfrac19, \tfrac29, and so on) are
never thrown away: a deleted middle third is always open, so it leaves its own endpoints
behind, and no later deletion touches them. Those endpoints alone already give infinitely many
survivors. As we are about to see, they are the barest tip of the iceberg.
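The construction above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original text: the function name cantor_stage is hypothetical, and exact fractions are used so that no endpoint is blurred by floating-point rounding.

```python
from fractions import Fraction

def cantor_stage(n):
    """Return the closed intervals making up C_n as (left, right) pairs."""
    intervals = [(Fraction(0), Fraction(1))]          # C_0 = [0, 1]
    for _ in range(n):
        next_intervals = []
        for left, right in intervals:
            third = (right - left) / 3
            # Keep the two closed outer thirds; the open middle third is dropped.
            next_intervals.append((left, left + third))
            next_intervals.append((right - third, right))
        intervals = next_intervals
    return intervals

print([(str(a), str(b)) for a, b in cantor_stage(2)])
# [('0', '1/9'), ('2/9', '1/3'), ('2/3', '7/9'), ('8/9', '1')]
```

Every endpoint produced at one stage reappears as an endpoint at all later stages, which is the "never thrown away" observation in concrete form.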
Length zero: adding up what we removed
How much did we delete? Track it stage by stage. At the first step we remove one interval of length
\tfrac13. At the second, two intervals each of length
\tfrac19. At the n-th step there are
2^{\,n-1} gaps, each of length 1/3^{\,n}. Sum
the whole lot:
\sum_{n=1}^{\infty} 2^{\,n-1}\cdot\frac{1}{3^{\,n}} = \frac{1}{3}\sum_{n=0}^{\infty}\left(\frac{2}{3}\right)^{n} = \frac{1}{3}\cdot\frac{1}{1-\tfrac23} = \frac{1}{3}\cdot 3 = 1.
We removed a total length of exactly 1 — all of it. Whatever survives can only have
length 1 - 1 = 0. In the language of measure theory, the Cantor set has
Lebesgue measure zero: it can be covered by open intervals whose total length is as
small as you please.
You can see the same fact from the survivors' side. After n steps the set
C_n is 2^{\,n} bars, each of length
1/3^{\,n}, so its total length is (2/3)^{n} —
and (2/3)^{n} \to 0. This "leftover length" collapses to
nothing even as the number of pieces explodes.
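Both bookkeeping views can be checked numerically. The sketch below uses hypothetical helper names (removed_length, remaining_length) and exact fractions; it simply evaluates the partial sums from the formulas above and confirms that deleted and surviving length always add up to 1.

```python
from fractions import Fraction

def removed_length(n):
    """Total length deleted in steps 1..n: sum of 2^(k-1) / 3^k."""
    return sum(Fraction(2 ** (k - 1), 3 ** k) for k in range(1, n + 1))

def remaining_length(n):
    """Total length of C_n: 2^n bars, each of length 1/3^n."""
    return Fraction(2, 3) ** n

for n in (1, 2, 5, 20):
    # Deleted plus surviving length is exactly 1 at every stage.
    assert removed_length(n) + remaining_length(n) == 1
    print(n, float(removed_length(n)), float(remaining_length(n)))
```

As n grows, the first column of lengths climbs toward 1 and the second falls toward 0, matching the limit computed in the series.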
- The total length of the deleted middle thirds is
\sum_{n\ge 1} 2^{\,n-1}/3^{\,n} = 1.
- Hence \mathcal{C} has Lebesgue measure
\lambda(\mathcal{C}) = 0.
- Yet \mathcal{C} is uncountable — it has exactly as
many points as [0,1] itself.
Uncountable: the base-3 fingerprint
Measure zero makes the Cantor set sound tiny. Cardinality tells the opposite story. The trick is to
write numbers in base 3 (ternary), where every x \in [0,1]
is a string of digits 0, 1,
2:
x = \frac{d_1}{3} + \frac{d_2}{3^2} + \frac{d_3}{3^3} + \cdots, \qquad d_k \in \{0,1,2\}.
Deleting the middle third (\tfrac13,\tfrac23) deletes exactly the numbers
whose first ternary digit must be 1. Deleting the next middles
removes those forced to have a 1 in the second place, and so on. What
survives is precisely the numbers that can be written using only the digits
0 and 2 — no 1s needed anywhere.
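Now map each such number to a binary string by turning every 2 into a
1: 0.20220\ldots_3 \mapsto 0.10110\ldots_2. That
rule hits every real in [0,1], because every real has a binary expansion. So the
Cantor set maps onto all of [0,1] and has at least as many points as the interval;
being a subset of [0,1], it has no more. (The map is not quite one-to-one: the two
endpoints of a deleted gap, such as \tfrac13 = 0.0222\ldots_3 and
\tfrac23 = 0.2_3, share the image \tfrac12.) The Cantor set is therefore
uncountable, with the same
cardinality as the continuum. The endpoints we noticed earlier are a countable sprinkling; the vast
majority of Cantor points are irrational numbers you would never have guessed were there.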
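The digit-swapping rule is easy to try on truncated expansions. The following is a sketch under the assumption that a number is handed over as a finite list of ternary digits; the function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def ternary_value(digits):
    """Value of 0.d1 d2 d3 ... read in base 3."""
    return sum(Fraction(d, 3 ** k) for k, d in enumerate(digits, start=1))

def to_binary_digits(digits):
    """Turn every ternary 2 into a binary 1; only digits 0 and 2 are allowed."""
    if any(d not in (0, 2) for d in digits):
        raise ValueError("expected a ternary expansion using only 0 and 2")
    return [d // 2 for d in digits]

def binary_value(bits):
    """Value of 0.b1 b2 b3 ... read in base 2."""
    return sum(Fraction(b, 2 ** k) for k, b in enumerate(bits, start=1))

digits = [2, 0, 2, 2, 0]                  # 0.20220 in base 3
bits = to_binary_digits(digits)           # 0.10110 in base 2
print(ternary_value(digits), bits, binary_value(bits))
# 62/81 [1, 0, 1, 1, 0] 11/16
```

Running the rule in reverse (every binary 1 becomes a ternary 2) shows why each real in [0,1] is reached: any binary expansion is the image of some 0-and-2 ternary string.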
Take \tfrac14, for instance. It belongs to the Cantor set, and it proves the set is far richer than its endpoints. In base 3,
\tfrac14 = 0.020202\ldots_3, an endless run of 0
and 2 with not a single 1. So
\tfrac14 survives every deletion, yet it is never the endpoint of any
removed gap — it sits deep in the "dust", approached from both sides but isolated from none. This is
the flavour of a perfect set: closed, and with no isolated points — every member
has other members arbitrarily close, so you can never point to a Cantor number and say "this one
stands alone".
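Membership of a rational number such as \tfrac14 can be decided exactly by following the construction: zoom into whichever outer third holds the point and repeat. The sketch below is one way to do that, with a hypothetical function name; it relies on the fact that the rescaled values of a rational p/q stay among the multiples of 1/q, so they must eventually repeat.

```python
from fractions import Fraction

def in_cantor_set(x):
    """Decide membership of a rational x by exact arithmetic."""
    x = Fraction(x)
    if not 0 <= x <= 1:
        return False
    seen = set()
    while x not in seen:
        seen.add(x)
        if Fraction(1, 3) < x < Fraction(2, 3):
            return False              # x lies in an open middle third
        # Zoom into the outer third holding x, rescaling it back to [0, 1].
        x = 3 * x if x <= Fraction(1, 3) else 3 * x - 2
    return True                       # the values cycle without entering a gap

print(in_cantor_set(Fraction(1, 4)))  # True
print(in_cantor_set(Fraction(1, 2)))  # False
```

For \tfrac14 the rescaled values alternate between \tfrac14 and \tfrac34 and never reach 0 or 1, which is the arithmetic face of "never the endpoint of any removed gap".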
Self-similar, nowhere dense, and of dimension \log_3 2
Zoom into the left third of the picture and you see a perfect miniature of the whole set, scaled by
\tfrac13. That is self-similarity:
\mathcal{C} is exactly two copies of itself, each shrunk by a factor of
3. Feed N = 2 copies at scale
r = \tfrac13 into the similarity-dimension formula and you get a
fractional dimension:
D = \frac{\ln N}{\ln(1/r)} = \frac{\ln 2}{\ln 3} \approx 0.6309.
A dimension strictly between 0 (a scatter of isolated points) and
1 (a line): the Cantor set is more than countable dust but
less than any curve. This same number classifies it among the
fractals and their
dimensions, where the Cantor set is the humblest and most instructive example.
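The similarity-dimension formula is a one-liner to evaluate. In this small sketch the function name and its parameters (copies for N, scale for r) are illustrative choices, not notation from the text.

```python
import math

def similarity_dimension(copies, scale):
    """D = ln N / ln(1/r) for N copies, each scaled by the factor r."""
    return math.log(copies) / math.log(1 / scale)

# Cantor set: N = 2 copies at scale r = 1/3.
print(round(similarity_dimension(2, 1 / 3), 4))  # 0.6309
```

For comparison, a plain segment is 3 copies of itself at scale 1/3, and the same formula returns a dimension of 1 (up to floating-point rounding).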
One more property earns its name: nowhere dense. The Cantor set contains no
interval at all. Pick any tiny sub-interval (a,b) of
[0,1]: once the bars of C_n are shorter than b - a, that is, once
1/3^{\,n} < b - a, the sub-interval cannot fit inside a single bar, so a gap punches a
hole in it. So \mathcal{C} has empty interior: it is all
boundary, all edge, with no solid "inside" anywhere. Closed, perfect, nowhere dense, measure zero,
uncountable, self-similar: a single set holding six ideas at once.
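The claim that no interval survives can be turned into a small search: given any sub-interval (a,b) of [0,1], follow the bar that contains it until a deleted gap overlaps it. This is a sketch with a hypothetical function name, assuming 0 <= a < b <= 1 and exact Fraction inputs.

```python
from fractions import Fraction

def find_hole(a, b):
    """Return a point of (a, b) that lies in a deleted middle third."""
    left, right = Fraction(0), Fraction(1)    # current bar containing (a, b)
    while True:
        third = (right - left) / 3
        gap_lo, gap_hi = left + third, right - third
        lo, hi = max(a, gap_lo), min(b, gap_hi)
        if lo < hi:
            return (lo + hi) / 2              # strictly inside (a, b) and the gap
        # (a, b) misses this gap, so it sits wholly inside one outer third.
        if b <= gap_lo:
            right = gap_lo
        else:
            left = gap_hi

print(find_hole(Fraction(0), Fraction(1)))    # 1/2
```

The loop must stop, because the bar shrinks by a factor of 3 each round and a bar shorter than b - a cannot contain the whole sub-interval.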
The headline paradox — uncountable yet measure zero — is not a contradiction, and
the reason is worth pinning down. Cardinality counts how many points there are;
measure asks how much length they occupy. A single point has measure zero, and so does any
countable set (cover the k-th point by an interval of length
\varepsilon/2^{k} and the total is a mere \varepsilon).
The Cantor set shows that the converse fails: measure zero does not force a set to
be countable. An uncountable set can still be squeezed under covers of vanishing
total length. Two different notions of "size", pulling in opposite directions.
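For the Cantor set itself, the covers of vanishing total length are the stages C_n: a cover by 2^n bars of total length (2/3)^n. The sketch below (hypothetical function name) finds how many stages are needed to get under a given tolerance. The bars are closed, so this assumes each one may be fattened slightly to obtain open intervals of nearly the same total length.

```python
from fractions import Fraction

def stages_needed(eps):
    """Smallest n for which the 2^n bars of C_n have total length below eps."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    n, length = 0, Fraction(1)
    while length >= eps:
        n += 1
        length *= Fraction(2, 3)      # each stage keeps two thirds of the length
    return n

n = stages_needed(Fraction(1, 100))
print(n, 2 ** n)                      # 12 4096
```

So 4096 bars already cover the whole uncountable set with less than one hundredth of a unit of length.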
A second trap: the survivors are not just the gap-endpoints. Those endpoints form
only a countable set; the Cantor set is uncountable, so all but countably many of its points (like
\tfrac14) are interior-of-the-dust numbers that were never any interval's
edge.