Universal Gate Sets

Here is a small miracle of classical computing: every logic circuit ever built — a calculator, a web browser, a whole processor — can be assembled from copies of a single gate, the NAND. You never need a special part for multiplication or for sorting; wire up enough NANDs and you can build any Boolean function at all. A set of gates with this "builds everything" power is called universal.

Quantum computing has the same gift, and this page is about it. We have already met a handful of gates — the single-qubit gates acting on one wire of a circuit, and the phase gates S and T. The question is: which small kit of gates lets you build any quantum computation? The answer is a beautiful, slightly surprising one — and it is the reason one particular gate, the T, turns out to be the most precious resource in the whole machine.

A universal gate set is a finite collection of gates whose circuits can approximate any unitary on any number of qubits, to whatever accuracy you ask for. The standard choice is just three gates:

\{\,H,\; T,\; \mathrm{CNOT}\,\}.

A wire, a Hadamard, a T, and a two-qubit CNOT — from these you can compile Shor's algorithm, a chemistry simulation, anything. The picture below shows the idea: a target gate U being built up as a sequence of primitives from the kit.

What each gate contributes

Universality has two jobs, and the set \{H, T, \mathrm{CNOT}\} splits them cleanly.

First, you must be able to make any single-qubit gate. A single-qubit unitary is a rotation of the Bloch sphere, and there are a continuous infinity of them. Remarkably, the two gates H and T alone are enough: by stringing together long words like H\,T\,H\,T\,T\,H\,T\cdots you can land arbitrarily close to any rotation you want. (We sketch why below.)

Second, you need entangling power — something that makes two qubits interact, which no one-qubit gate can do. That is the CNOT. A famous fact makes the division of labour exact: any multi-qubit unitary can be decomposed into single-qubit gates plus CNOTs. So once H and T give you every single-qubit gate, and CNOT stitches qubits together, you have everything.

You will often see the equivalent phrasing Clifford + T, the set \{H, S, \mathrm{CNOT}, T\}. It adds the phase gate S for convenience, but this buys no new power: as we are about to see, S = T^2, so S is already hiding inside the Ts.

Worked example: building Z out of phase gates

Let us see gates literally compile into other gates. Recall the diagonal phase gates

T = \begin{bmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{bmatrix}, \qquad S = \begin{bmatrix} 1 & 0 \\ 0 & i \end{bmatrix}, \qquad Z = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}.

Applying a gate twice multiplies the diagonal entries. Since the top-left entry is always 1, we just square the bottom-right phase. Two Ts give

T^2 = \begin{bmatrix} 1 & 0 \\ 0 & (e^{i\pi/4})^2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & e^{i\pi/2} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & i \end{bmatrix} = S.

So S = T^2 — the "quarter-turn" phase is exactly two "eighth-turn" phases. Square once more:

Z = S^2 = \begin{bmatrix} 1 & 0 \\ 0 & i^2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \qquad\text{so}\qquad Z = S^2 = T^4.

Reading it as a circuit: the gate Z is just four Ts in a row. This is compilation in miniature — a gate you want, rewritten entirely in terms of gates from your kit.

Worked example: why H and T fill the sphere

How can two fixed gates possibly approximate a continuous infinity of rotations? The trick is an irrational angle. The gate T is a rotation of the Bloch sphere by \pi/4 about the z-axis. The Hadamard H swaps which axis you are turning about (it maps the z-axis to the x-axis and back). So the combination H T H is a \pi/4 rotation about a different axis.

Now compose the two: turn a bit about z, then a bit about x, over and over. The key fact is that the resulting single rotation THTH is by an angle that is an irrational multiple of \pi. Repeating an irrational-angle rotation \theta,\,2\theta,\,3\theta,\dots never exactly repeats and eventually passes arbitrarily close to every angle around that axis — the powers are dense on the circle. Combine dense rotations about two different axes and you can steer the state arrow to anywhere on the sphere, as close as you like.

That is the whole engine of single-qubit universality: not that H and T hit every rotation, but that their products come arbitrarily close to every rotation. Which is exactly the distinction we look at next.

Exact vs approximate universality

There is a catch you must respect. A finite set of gates can only ever produce a countable list of circuits — you can enumerate every finite word in H and T. But the single-qubit unitaries form a continuous (uncountable) family. A countable list can never equal an uncountable set, so no finite gate set hits every unitary exactly. Some target rotation, by some irrational angle, will always be missed on the nose.

Universality is therefore an approximation statement. Given a target U and any tolerance \varepsilon > 0, there is a gate sequence V from the set with

\lVert U - V \rVert < \varepsilon.

You cannot get \varepsilon = 0, but you can make \varepsilon as tiny as any experiment could ever detect. The worry, then, is cost: does chasing a smaller \varepsilon demand an explosion of gates? The Solovay–Kitaev theorem gives the reassuring answer — the overhead is mild. To reach accuracy \varepsilon you need only

O\!\left(\log^{c}\tfrac{1}{\varepsilon}\right) \quad\text{gates}, \qquad c \approx 2\text{–}4.

Because the cost grows like a power of the logarithm of 1/\varepsilon, halving your error again and again adds only a handful of gates each time. Approximate universality is thus not a weakness — it is universality with a gentle, entirely affordable price.

The Clifford trap: why T is not optional

It is tempting to think that once you have the "nice" gates — H, S, and CNOT — you are basically done, and T is a minor extra. That is exactly backwards. The set \{H, S, \mathrm{CNOT}\} is called the Clifford group, and on its own it is not universal. Worse (for anyone hoping for a speed-up), the Gottesman–Knill theorem says that any circuit built only from Clifford gates can be efficiently simulated on an ordinary classical computer. A pure-Clifford "quantum" computer is no faster than your laptop.

So the entire quantum advantage has to come from somewhere outside the Clifford group. Adding the single non-Clifford gate T is precisely what tips the balance: it promotes the set to full universality and makes the circuits classically hard to simulate. The modest-looking \pi/4 phase is not a footnote to the Clifford gates — it is the one ingredient that makes a quantum computer worth building.

On a real fault-tolerant quantum computer, not all gates cost the same. The Clifford gates (H, S, CNOT) are relatively cheap: an error-corrected machine can apply them almost for free. The T gate is the diva. Implementing a single fault-tolerant T typically requires a whole side-factory that distills a special "magic state" from many noisy copies — a process that can dominate the qubit count and the runtime of the entire computation.

This is why engineers obsess over the T-count and T-depth of an algorithm: the number of T gates is a headline measure of how hard a quantum program is to run. Whole research fields exist just to rewrite circuits using fewer Ts. The lesson of universality — that T is the one gate carrying the quantum magic — has a hardware echo: it is also the one gate you most want to spend sparingly.

Two traps sit close together here. First, do not assume a circuit full of quantum gates is doing anything a classical computer cannot. If every gate is a Clifford gate (H, S, \mathrm{CNOT}, and their friends), Gottesman–Knill guarantees you can simulate the whole thing efficiently on a laptop — no speed-up at all. A genuine quantum advantage needs a non-Clifford resource such as the T gate. "Uses qubits" is not the same as "is faster."

Second, do not overclaim what a finite gate set delivers. It gives approximate universality, not exact: you can get within any \varepsilon of a target unitary, but you cannot reproduce an arbitrary continuous rotation exactly with a finite number of H and T gates. The equality Z = T^4 is exact only because Z happens to be a power of T; a generic rotation is merely approached, never nailed.