Cantor's Theorem

Cantor's diagonal trick did more than show the reals are uncountable — it hides a completely general law. For any set A at all, finite or infinite, its power set is strictly bigger:

|A| \;<\; |\mathcal{P}(A)|.

For a finite set this is just n < 2^n. But the theorem holds for infinite sets too — and that single fact detonates into an endless tower of infinities, each bigger than the last, with no top.

The proof: a diagonal set

There is always an injection A \to \mathcal{P}(A) — send each element a to the singleton \{a\} — so |A| \le |\mathcal{P}(A)|. The content is that there is no surjection (hence no bijection), so the inequality is strict.

Suppose, for contradiction, that some function f : A \to \mathcal{P}(A) were onto. Build the diagonal set of all elements that are not in their own image:

D = \{\, a \in A : a \notin f(a) \,\}.

D is a subset of A, so if f is onto, D = f(a_0) for some a_0. Now ask the fatal question: is a_0 \in D?

If a_0 \in D, then by D's definition a_0 \notin f(a_0) = D — contradiction.
If a_0 \notin D, then a_0 satisfies the membership rule, so a_0 \in D — contradiction.

Either way, disaster. So no such onto f exists, and |A| < |\mathcal{P}(A)|. It is the diagonal argument again, dressed in pure set language — a_0 \in D \iff a_0 \notin D is the same self-reference that changed the n-th digit of the n-th number.

For every set A, |A| < |\mathcal{P}(A)| — the power set is strictly larger.
Consequently there is no largest set and no largest infinity: \aleph_0 < 2^{\aleph_0} < 2^{2^{\aleph_0}} < \cdots.
In particular |\mathbb{N}| < |\mathcal{P}(\mathbb{N})|, and |\mathcal{P}(\mathbb{N})| = |\mathbb{R}| — the continuum.

Infinities without end

Apply Cantor's theorem over and over. Start with \mathbb{N} of size \aleph_0; its power set is bigger; that set's power set is bigger still; and so on forever. There is an unending hierarchy of larger and larger infinities, and no "set of everything" can sit at the top — because its power set would have to be bigger than it.

Cantor's theorem forbids a "universal set" U containing everything. If U held every set, it would in particular contain all of its own subsets, so \mathcal{P}(U) \subseteq U and hence |\mathcal{P}(U)| \le |U| — flatly contradicting |U| < |\mathcal{P}(U)|. This is the same self-referential nerve that Russell's paradox strikes ("the set of all sets that don't contain themselves"), and it is exactly why modern set theory is built from careful axioms rather than the naive "any collection is a set."

The inequality is strict and needs the "no surjection" half. The easy injection a \mapsto \{a\} only gives \le; the diagonal set is what rules out equality.
The theorem holds for every set, infinite ones included — that's the whole point. It is not merely the finite fact n < 2^n.