The Cantor Set

Here is a shape that ought not to exist. Take the segment [0,1], and keep throwing away its middle — the middle third, then the middle third of each survivor, forever. Whatever is left when the dust settles is the Cantor set, and it is a monster of the friendly kind. It contains so many points that it is uncountable — just as crowded as the whole interval you started with — and yet it is so thin that its total length is zero. Uncountably many points, occupying no length at all. That sentence looks like a contradiction the first ten times you read it; by the end of this page it will look like a theorem.

Georg Cantor cooked this up in 1883 not as a curiosity but as a stress test — a single object that breaks nearly every cosy intuition about size, dimension, and "how much" of the line a set can be. It is the simplest fractal, the cleanest example of measure zero, and a set you can draw in ten seconds. Let us build it.

The middle-thirds construction

Start with C_0 = [0,1]. To get the next stage, delete the open middle third — the interval (\tfrac13, \tfrac23) — leaving two closed pieces:

C_1 = \left[0, \tfrac13\right] \cup \left[\tfrac23, 1\right].

Now do the same to each of those two pieces, then to each of the four that results, and so on without end. The Cantor set is what remains after infinitely many deletions — the intersection of every stage:

\mathcal{C} = \bigcap_{n=0}^{\infty} C_n.

Picture the stages stacked in rows, one row per C_n: every bar splits into two bars a third the width, with a gap punched out of the centre.

Notice the picture never empties. The endpoints of every deleted gap — 0, 1, \tfrac13, \tfrac23, \tfrac19, \tfrac29, and so on — are never thrown away: a middle third is always open, so it leaves its own endpoints behind. Those endpoints alone already give infinitely many survivors. As we are about to see, they are the barest tip of the iceberg.
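The construction is easy to replay by hand or by machine. Here is a minimal Python sketch, assuming each stage is stored as a list of closed bars (lo, hi) with exact fractions; the names next_stage and cantor_stage are illustrative only.

```python
from fractions import Fraction

def next_stage(bars):
    """Delete the open middle third of every closed bar (lo, hi)."""
    out = []
    for lo, hi in bars:
        third = (hi - lo) / 3
        out.append((lo, lo + third))   # left closed third
        out.append((hi - third, hi))   # right closed third
    return out

def cantor_stage(n):
    """Return the bars of C_n, starting from C_0 = [0, 1]."""
    bars = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        bars = next_stage(bars)
    return bars

for lo, hi in cantor_stage(2):
    print(f"[{lo}, {hi}]")
# [0, 1/9]
# [2/9, 1/3]
# [2/3, 7/9]
# [8/9, 1]
```

Every bar hands its two endpoints on to its children, which is why an endpoint, once created, shows up in every later stage.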

Length zero: adding up what we removed

How much did we delete? Track it stage by stage. At the first step we remove one interval of length \tfrac13. At the second, two intervals each of length \tfrac19. At the n-th step there are 2^{\,n-1} gaps, each of length 1/3^{\,n}. Sum the whole lot:

\sum_{n=1}^{\infty} 2^{\,n-1}\cdot\frac{1}{3^{\,n}} = \frac{1}{3}\sum_{n=0}^{\infty}\left(\frac{2}{3}\right)^{n} = \frac{1}{3}\cdot\frac{1}{1-\tfrac23} = \frac{1}{3}\cdot 3 = 1.

We removed a total length of exactly 1 — all of it. Whatever survives can only have length 1 - 1 = 0. In the language of measure theory, the Cantor set has Lebesgue measure zero: it can be covered by open intervals whose total length is as small as you please.

You can see the same fact from the survivors' side. After n steps the set C_n is 2^{\,n} bars, each of length 1/3^{\,n}, so its total length is (2/3)^{n}, and (2/3)^{n} \to 0. This "leftover length" collapses to nothing even as the number of pieces explodes.
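Both bookkeeping methods can be checked against each other with exact arithmetic. The sketch below is only an illustration (the helper names removed_up_to and remaining_after are made up here): it adds up the gaps deleted in the first n steps and confirms that, together with the surviving length (2/3)^n, they always account for the whole unit interval.

```python
from fractions import Fraction

def removed_up_to(n):
    """Total length deleted in steps 1..n: 2**(k-1) gaps of length 1/3**k at step k."""
    return sum(Fraction(2 ** (k - 1), 3 ** k) for k in range(1, n + 1))

def remaining_after(n):
    """Total length of C_n: 2**n bars of length 1/3**n."""
    return Fraction(2, 3) ** n

for n in (1, 2, 5, 10, 20):
    # Removed plus remaining must always be the full length 1.
    assert removed_up_to(n) + remaining_after(n) == 1
    print(n, 2 ** n, float(remaining_after(n)))
```

The middle column doubles at every step while the last column shrinks by a factor of 2/3: more and more pieces, less and less length.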

Uncountable: the base-3 fingerprint

Measure zero makes the Cantor set sound tiny. Cardinality tells the opposite story. The trick is to write numbers in base 3 (ternary), where every x \in [0,1] is a string of digits 0, 1, 2:

x = \frac{d_1}{3} + \frac{d_2}{3^2} + \frac{d_3}{3^3} + \cdots, \qquad d_k \in \{0,1,2\}.

Deleting the middle third (\tfrac13,\tfrac23) deletes exactly the numbers whose first ternary digit must be 1. Deleting the next middles removes those forced to have a 1 in the second place, and so on. What survives is precisely the numbers that can be written using only the digits 0 and 2 — no 1s needed anywhere.

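Now map each such number to a binary string by turning every 2 into a 1: 0.20220\ldots_3 \mapsto 0.10110\ldots_2. Every real in [0,1] has a binary expansion, so this rule hits every real in [0,1], and the Cantor set has at least as many points as the whole interval. (The map is onto but not quite one-to-one: \tfrac13 = 0.0222\ldots_3 and \tfrac23 = 0.2_3 both land on \tfrac12.) Since the Cantor set also sits inside [0,1], it cannot have more points than the interval either, so it has exactly the cardinality of the continuum and is, in particular, uncountable. The endpoints we noticed earlier are a countable sprinkling; the vast majority of Cantor points are irrational numbers you would never have guessed were there.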
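To make the digit swap concrete, here is a small Python sketch that applies it to a finite string of ternary digits. The function names ternary_value and swap_to_binary are illustrative, and a finite string only stands in for the infinite expansions in the text.

```python
from fractions import Fraction

def ternary_value(digits):
    """Value of 0.d1 d2 d3 ... read in base 3."""
    return sum(Fraction(d, 3 ** k) for k, d in enumerate(digits, start=1))

def swap_to_binary(digits):
    """Turn every 2 into a 1 and read the result in base 2."""
    if any(d not in (0, 2) for d in digits):
        raise ValueError("only the digits 0 and 2 are allowed")
    return sum(Fraction(d // 2, 2 ** k) for k, d in enumerate(digits, start=1))

digits = [2, 0, 2, 2, 0]          # 0.20220 in base 3
print(ternary_value(digits))      # 62/81
print(swap_to_binary(digits))     # 11/16, i.e. 0.10110 in base 2
```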

Is \tfrac14 in the Cantor set? Yes, and it proves the set is far richer than its endpoints. In base 3, \tfrac14 = 0.020202\ldots_3, an endless run of 0 and 2 with not a single 1. So \tfrac14 survives every deletion, yet it is never the endpoint of any removed gap: it sits deep in the "dust", with other Cantor points crowding in on it from both sides. This is the flavour of a perfect set: closed, and with no isolated points. Every member has other members arbitrarily close, so you can never point to a Cantor number and say "this one stands alone".
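For rational numbers the ternary test can be run exactly. The sketch below is one possible way to do it, not a standard routine: it repeatedly asks which third x lies in, records a 0 for the left third and a 2 for the right third, rescales that third back to [0,1], and stops as soon as a value repeats (which must happen for a fraction, because the denominator never grows). The name cantor_digits is an assumption made for this example.

```python
from fractions import Fraction

def cantor_digits(x):
    """Return (prefix, cycle) of ternary digits from {0, 2} for a rational x,
    or None if x falls into a deleted middle third at some stage."""
    x = Fraction(x)
    if not 0 <= x <= 1:
        return None
    seen = {}      # value -> position at which it first appeared
    digits = []
    while x not in seen:
        seen[x] = len(digits)
        if x <= Fraction(1, 3):      # left third: digit 0, then zoom in
            digits.append(0)
            x = 3 * x
        elif x >= Fraction(2, 3):    # right third: digit 2, then zoom in
            digits.append(2)
            x = 3 * x - 2
        else:                        # strictly inside a deleted gap
            return None
    start = seen[x]
    return digits[:start], digits[start:]

for x in (Fraction(1, 4), Fraction(1, 3), Fraction(1, 2)):
    print(x, cantor_digits(x))
```

For \tfrac14 this reports an empty prefix and the repeating block 0, 2, matching 0.020202\ldots_3. For \tfrac13 it reports prefix 0 and repeating block 2, that is 0.0222\ldots_3: a tail of all 0s or all 2s is the signature of a bar endpoint. For \tfrac12 it returns None.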

Self-similar, nowhere dense, and of dimension \log_3 2

Zoom into the left third of the picture and you see a perfect miniature of the whole set, scaled by \tfrac13. That is self-similarity: \mathcal{C} is exactly two copies of itself, each shrunk by a factor of 3. Feed N = 2 copies at scale r = \tfrac13 into the similarity-dimension formula and you get a fractional dimension:

D = \frac{\ln N}{\ln(1/r)} = \frac{\ln 2}{\ln 3} \approx 0.6309.
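The same number turns up if you simply count boxes. As a rough sketch, assume we approximate the set by the bars of stage n = 8, label each bar by the integer m with bar [m/3^n, (m+1)/3^n], and count how many grid boxes of side 3^{-k} contain at least one bar. The helper names stage_cells and box_count are illustrative.

```python
import math

def stage_cells(n):
    """Integer labels m of the stage-n bars [m/3**n, (m+1)/3**n]."""
    cells = [0]
    for _ in range(n):
        # Each bar keeps its left third (3m) and its right third (3m + 2).
        cells = [3 * m for m in cells] + [3 * m + 2 for m in cells]
    return cells

def box_count(n, k):
    """Number of grid boxes of side 3**-k holding a stage-n bar (k <= n)."""
    return len({m // 3 ** (n - k) for m in stage_cells(n)})

n = 8
for k in range(1, n + 1):
    boxes = box_count(n, k)
    estimate = math.log(boxes) / math.log(3 ** k)
    print(k, boxes, round(estimate, 4))

print(round(math.log(2) / math.log(3), 4))  # 0.6309
```

At scale 3^{-k} the count is 2^k, so the ratio \ln(\text{boxes}) / \ln(3^k) is \ln 2 / \ln 3 at every scale, up to floating-point rounding.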

A dimension strictly between 0 (a scatter of isolated points) and 1 (a line): the Cantor set is more than a scatter of points but less than any curve. Among fractals, it is the humblest and most instructive example.

One more property earns its name: nowhere dense. The Cantor set contains no interval at all. Pick any tiny sub-interval (a,b) of [0,1]: once n is large enough that 1/3^{\,n} < b - a, no single bar of C_n is long enough to hold (a,b), so some deleted gap has already punched a hole in it. So \mathcal{C} has empty interior: it is all boundary, all edge, with no solid "inside" anywhere. Closed, perfect, nowhere dense, measure zero, uncountable, self-similar: a single set holding six ideas at once.
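The hunt for a hole can be carried out mechanically. Here is a sketch, assuming rational endpoints with 0 \le a < b \le 1, that walks down the construction and returns the first deleted gap overlapping (a,b); the function name find_gap is made up for this example.

```python
from fractions import Fraction

def find_gap(a, b):
    """Return (stage, gap_lo, gap_hi): the first deleted open gap that
    overlaps the open interval (a, b), where 0 <= a < b <= 1."""
    a, b = Fraction(a), Fraction(b)
    if not 0 <= a < b <= 1:
        raise ValueError("need 0 <= a < b <= 1")
    lo, width = Fraction(0), Fraction(1)   # current bar containing (a, b)
    stage = 0
    while True:
        stage += 1
        third = width / 3
        gap_lo, gap_hi = lo + third, lo + 2 * third
        if gap_lo < b and a < gap_hi:      # the gap overlaps (a, b)
            return stage, gap_lo, gap_hi
        if a >= gap_hi:                    # (a, b) lies in the right third
            lo = gap_hi
        # otherwise b <= gap_lo and (a, b) lies in the left third
        width = third

print(find_gap(Fraction(24, 100), Fraction(26, 100)))
```

Even for a narrow window around \tfrac14 the search succeeds at stage 3, with the gap (\tfrac{7}{27}, \tfrac{8}{27}). The loop always terminates, because the bar containing (a,b) shrinks by a factor of 3 per step while b - a stays fixed.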

The headline paradox — uncountable yet measure zero — is not a contradiction, and the reason is worth pinning down. Cardinality counts how many points there are; measure asks how much length they occupy. A single point has measure zero, and so does any countable set (cover the k-th point by an interval of length \varepsilon/2^{k} and the total is a mere \varepsilon). The Cantor set shows the converse surprise: measure zero does not force a set to be small in count — an uncountable set can still be squeezed under covers of vanishing total length. Two different notions of "size", pulling in opposite directions.
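Both covering arguments come down to short sums. The sketch below (helper names are illustrative) adds up the first K intervals of the countable cover, and finds the first stage n at which the 2^n bars of C_n have total length below a given \varepsilon. The bars are closed, so for a cover by open intervals one would fatten each bar very slightly.

```python
from fractions import Fraction

def countable_cover_length(eps, K):
    """Total length of intervals of length eps/2**k for k = 1..K."""
    return sum(Fraction(eps) / 2 ** k for k in range(1, K + 1))

def cantor_cover_stage(eps):
    """Smallest n for which the bars of C_n have total length (2/3)**n < eps."""
    eps = Fraction(eps)
    if eps <= 0:
        raise ValueError("eps must be positive")
    n = 0
    while Fraction(2, 3) ** n >= eps:
        n += 1
    return n

eps = Fraction(1, 100)
print(countable_cover_length(eps, 10) < eps)   # True: partial sums stay below eps
n = cantor_cover_stage(eps)
print(n, 2 ** n)                               # 12 4096
```

With \varepsilon = 1/100, stage 12 already does the job: 4096 bars cover the whole uncountable set with total length under one hundredth.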

A second trap: the survivors are not just the gap-endpoints. Those endpoints form only a countable set; the Cantor set is uncountable, so all but countably many of its points (like \tfrac14) were never the edge of any removed gap.