Four-Vectors and Invariants

By now relativity can feel like a bag of separate tricks: clocks slow down, rulers shrink, velocities add in a funny way, energy and momentum pick up factors of \gamma. Every result seems to demand its own special formula, and every quantity you measure depends on who's looking. There has to be a better way to organise all this — and there is. It is the single most powerful idea in the whole subject: package quantities into four-vectors, and hunt for the invariants.

An invariant is a number that every observer agrees on, no matter how fast they move — a fixed point in a world where lengths and times slosh around. Once you learn to build quantities out of invariants, relativity stops being a minefield of frame-dependent corrections and becomes almost easy: you compute the invariant in whichever frame is simplest, and its value is then good in all frames. This page shows how to spot and use them, using the compact bookkeeping of index notation and the Einstein summation convention.

The position four-vector

In ordinary space a point needs three numbers, (x, y, z), bundled into a vector \vec r. In spacetime an event needs four — a time and a place — so we bundle them into a four-vector. To keep the units consistent we again use ct for the time slot:

x^\mu = (x^0, x^1, x^2, x^3) = (ct,\ x,\ y,\ z).

The Greek index \mu (mu) runs over 0, 1, 2, 3: the 0 component is time, the other three are space. (Latin indices i, j, k are reserved for just the space parts 1, 2, 3.) The superscript is an index, not a power: x^2 here means "the second component", the y-coordinate, not "x squared". Written this way, the Lorentz transformation becomes a single tidy matrix acting on x^\mu, the exact four-dimensional analogue of rotating a vector.
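As a minimal sketch of that single matrix, here is the standard boost along x written out in Python. The names boost_x and apply_matrix are illustrative rather than anything defined in the text, the event values are made up, and ct and x are assumed to be in the same units (light-seconds).

```python
import math

def boost_x(beta):
    """4x4 boost along x to a frame moving at v = beta * c (requires |beta| < 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def apply_matrix(matrix, vec):
    """x'^mu = Lambda^mu_nu x^nu, with the sum over nu written out explicitly."""
    return [sum(matrix[mu][nu] * vec[nu] for nu in range(4)) for mu in range(4)]

# Illustrative event: ct = 5, x = 4 light-seconds, viewed from a frame moving at 0.8c.
event = [5.0, 4.0, 0.0, 0.0]
print(apply_matrix(boost_x(0.8), event))  # approximately [3, 0, 0, 0], up to rounding
```

The y and z components pass through untouched, just as a rotation about one axis leaves that axis alone.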

The Minkowski metric and the invariant interval

In ordinary space the length of a vector comes from the Pythagorean dot product, \vec r \cdot \vec r = x^2 + y^2 + z^2, and that length doesn't change when you rotate your axes — it's a rotational invariant. Spacetime has its own "length", but with a crucial twist: the time part enters with the opposite sign. The recipe for combining components is stored in the Minkowski metric \eta_{\mu\nu} (eta), a 4\times 4 array:

\eta_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}.

The metric tells you how to take the "dot product" of two four-vectors. Using the Einstein summation convention — a repeated index, once up and once down, is automatically summed over 0,1,2,3 — the spacetime length-squared of x^\mu is

s^2 = \eta_{\mu\nu}\,x^\mu x^\nu = (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2 = (ct)^2 - x^2 - y^2 - z^2.

This number s^2 is the invariant interval, and its magic is that every observer computes the same value, even though they disagree about t and about x separately. Time dilation and length contraction are precisely the trade-off that keeps this combination fixed: when one observer's t stretches, their x shifts by just enough to leave (ct)^2 - x^2 untouched. It is the spacetime version of the fact that rotating a rod changes its shadow on the wall but not its true length.
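A quick numerical check can make this concrete. The sketch below writes the Einstein sum as an explicit double loop over the metric, then boosts one made-up event to several frames; the helper names and the sample event are assumptions for illustration, with ct, x, y, z all in the same length units.

```python
import math

# Minkowski metric, signature (+, -, -, -), as in the text.
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def interval(x):
    """s^2 = eta_{mu nu} x^mu x^nu, summed over both repeated indices."""
    return sum(ETA[mu][nu] * x[mu] * x[nu] for mu in range(4) for nu in range(4))

def boost_x(x, beta):
    """Components of the same event in a frame moving at beta * c along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    ct, xx, y, z = x
    return [gamma * (ct - beta * xx), gamma * (xx - beta * ct), y, z]

event = [5.0, 4.0, 1.0, 2.0]  # illustrative values; s^2 = 25 - 16 - 1 - 4 = 4
for beta in (0.0, 0.3, 0.6, 0.9):
    moved = boost_x(event, beta)
    # ct and x change from frame to frame, but the interval stays at 4.
    print(beta, [round(c, 3) for c in moved], round(interval(moved), 9))
```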

s^2 = \eta_{\mu\nu}\,x^\mu x^\nu = (ct)^2 - x^2 - y^2 - z^2 is the same in every inertial frame.
s^2 > 0: timelike separation, a possible cause-and-effect link; \sqrt{s^2}/c is the proper time a clock would tick between the events.
s^2 < 0: spacelike separation, no causal link; \sqrt{-s^2} is the proper distance.
s^2 = 0: lightlike separation, connected by a light ray.
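The three cases can be sketched as a small classifier. The function name classify and its return format are hypothetical, and the inputs are assumed to be ct, x, y, z in the same length units.

```python
import math

def classify(ct, x, y=0.0, z=0.0):
    """Return the type of separation and its invariant size.

    For a timelike separation the size is c times the proper time;
    for a spacelike separation it is the proper distance.
    """
    s2 = ct**2 - x**2 - y**2 - z**2
    if s2 > 0:
        return "timelike", math.sqrt(s2)
    if s2 < 0:
        return "spacelike", math.sqrt(-s2)
    # Exact comparison with zero is fine for clean inputs like these;
    # measured data would need a tolerance instead.
    return "lightlike", 0.0

print(classify(5.0, 4.0))  # ('timelike', 3.0)
print(classify(4.0, 5.0))  # ('spacelike', 3.0)
print(classify(3.0, 3.0))  # ('lightlike', 0.0)
```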

Invariance made visible: hyperbolas, not circles

In ordinary geometry, "all points a fixed distance from the origin" trace a circle, and rotating your axes slides points around that circle without changing their distance. Minkowski geometry does the same thing, but because of the minus sign the curve of constant interval is a hyperbola. A Lorentz "boost" (changing to a moving frame) slides events along these hyperbolas, leaving s^2 fixed.

The light cone is the boundary case, the hyperbola's own asymptote, where s^2 = 0. Timelike events sit inside it (on up/down branches), spacelike events outside it (left/right branches). This one picture is the geometry of special relativity: replace "circle" with "hyperbola" and "rotation" with "boost", and all your Euclidean intuition transfers over.
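The "rotation becomes boost" dictionary can be written out explicitly. In the sketch below, \varphi is the rapidity, a symbol not used elsewhere on this page and introduced here only for illustration; a boost along x then looks like a rotation with hyperbolic functions in place of sine and cosine:

```latex
\begin{aligned}
ct' &= ct\cosh\varphi - x\sinh\varphi, \\
x'  &= x\cosh\varphi - ct\sinh\varphi, \qquad \tanh\varphi = v/c, \\
(ct')^2 - x'^2 &= \left[(ct)^2 - x^2\right]\left(\cosh^2\varphi - \sinh^2\varphi\right) = (ct)^2 - x^2 .
\end{aligned}
```

The identity \cosh^2\varphi - \sinh^2\varphi = 1 plays the role that \cos^2\theta + \sin^2\theta = 1 plays for circles, which is why events slide along hyperbolas.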

The energy–momentum four-vector

Here's where four-vectors pay off spectacularly. Just as position bundles time and space, we can bundle energy and momentum into one four-vector — the four-momentum:

p^\mu = \left(\frac{E}{c},\ p_x,\ p_y,\ p_z\right).

Energy sits in the time slot (over c, to fix the units) and ordinary momentum fills the three space slots. Now feed it through the very same metric to build its invariant:

\eta_{\mu\nu}\,p^\mu p^\nu = \left(\frac{E}{c}\right)^2 - p_x^2 - p_y^2 - p_z^2 = \frac{E^2}{c^2} - |\vec p|^2 = (mc)^2.

Rearranged, that is exactly the energy–momentum relation E^2 = (pc)^2 + (mc^2)^2 from the last page, but now we see it for what it really is: the length-squared of the four-momentum, and that length is the mass (times c). Mass is simply the invariant "spacetime magnitude" of a particle's energy–momentum, the one number about it that no observer can change. Every frame sees a different E and a different \vec p, but they all reconstruct the same (mc)^2.
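Here is a small sketch of that last claim. It takes a particle at rest with an assumed mc^2 = 3 GeV, views it from a few moving frames, and reconstructs the mass each time. E, pc and mc^2 are all expressed in GeV so that factors of c drop out; the function names and numbers are illustrative.

```python
import math

def invariant_mass(E, px, py, pz):
    """mc^2 from E and the components of pc, all in the same energy units."""
    return math.sqrt(E**2 - px**2 - py**2 - pz**2)

def boost_x(p, beta):
    """Four-momentum (E, pc_x, pc_y, pc_z) seen from a frame moving at beta * c along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    E, px, py, pz = p
    return (gamma * (E - beta * px), gamma * (px - beta * E), py, pz)

rest = (3.0, 0.0, 0.0, 0.0)  # rest frame: E = mc^2 = 3 GeV, no momentum
for beta in (0.0, -0.8, 0.5):
    E, px, py, pz = boost_x(rest, beta)
    mass = invariant_mass(E, px, py, pz)
    # E and pc_x differ in every frame; the reconstructed mc^2 does not.
    print(f"beta={beta:+.1f}  E={E:.3f}  pc_x={px:.3f}  mc^2={mass:.3f}")
```

For beta = -0.8 the frame sees E close to 5 GeV and pc_x close to 4 GeV, yet the mass still comes out at 3 GeV.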

The four-momentum p^\mu = (E/c,\ \vec p) has invariant squared-length \eta_{\mu\nu}p^\mu p^\nu = (mc)^2.
Equivalently E^2 - (pc)^2 = (mc^2)^2: the mass is frame-independent even though E and \vec p are not.
Four-momentum is conserved in interactions — all four components at once — which is why particle physicists track it obsessively.

Why invariants make relativity easy

The strategy is always the same, and it turns hard problems into one-liners:

Build the invariant, then evaluate it in the easiest frame. Want a particle's mass? Compute E^2 - (pc)^2 in the lab frame; it equals (mc^2)^2, the value in the rest frame where the maths is trivial. No Lorentz transforming required.
Conservation laws become four equations for the price of one. "Total four-momentum in = total four-momentum out" packs energy conservation and all three momentum conservations into a single vector statement \sum p^\mu_{\text{in}} = \sum p^\mu_{\text{out}}.
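If a quantity is a four-vector, its invariant is automatically frame-independent. You get an agreed-upon number for free, without checking frame by frame. (Invariant is not the same as conserved: invariant means every observer gets the same value, conserved means the value is the same before and after an interaction.)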
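The first two points combine neatly when reconstructing a decayed particle. In the sketch below, two photons with made-up lab-frame four-momenta are added component by component, and the invariant of the total gives the mass of whatever produced them. All values are in GeV with pc in place of p, and the helper names are hypothetical.

```python
import math

def add(p, q):
    """Component-wise sum of two four-momenta."""
    return tuple(a + b for a, b in zip(p, q))

def invariant_mass(p):
    """mc^2 of a four-momentum (E, pc_x, pc_y, pc_z)."""
    E, px, py, pz = p
    return math.sqrt(E**2 - px**2 - py**2 - pz**2)

# Illustrative photons travelling in opposite directions along x.
photon_1 = (4.5, 4.5, 0.0, 0.0)
photon_2 = (0.5, -0.5, 0.0, 0.0)

total = add(photon_1, photon_2)  # conservation: this equals the parent's four-momentum
print(total)                     # (5.0, 4.0, 0.0, 0.0)
print(invariant_mass(total))     # 3.0
```

Each photon is massless on its own, but the pair has an invariant mass of 3 GeV, which conservation of four-momentum identifies with the parent's mass.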

This is why professionals think in four-vectors. The "special formulas" for time dilation and length contraction are still true, but you rarely need them: the invariant does the heavy lifting.

Worked examples

Example 1 — proper time from the interval. Two ticks of a spaceship's clock happen at the same place on the ship. In the ground frame the ticks are separated by \Delta t = 5\ \text{s}, and the ship travels \Delta x = 4 light-seconds between them. The invariant interval, computed in the ground frame, is

s^2 = (c\Delta t)^2 - \Delta x^2 = (c \cdot 5\ \text{s})^2 - (c \cdot 4\ \text{s})^2 = c^2(25 - 16)\ \text{s}^2 = 9c^2\ \text{s}^2.

The proper time is \Delta\tau = \sqrt{s^2}/c = 3\ \text{s} — exactly what the ship's own clock reads, and every observer agrees, because s^2 is invariant.
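As a cross-check, the same answer follows from the time-dilation formula, taking \Delta t = 5 s and \Delta x = 4 light-seconds as the ground-frame separations used in the interval:

```latex
\begin{aligned}
\frac{v}{c} &= \frac{\Delta x}{c\,\Delta t} = \frac{4}{5}, \qquad
\gamma = \frac{1}{\sqrt{1 - (4/5)^2}} = \frac{5}{3}, \\
\Delta\tau &= \frac{\Delta t}{\gamma} = 5\ \text{s} \times \frac{3}{5} = 3\ \text{s}.
\end{aligned}
```

The interval gets there in one line, without computing v or \gamma at all.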

Example 2 — mass of an unknown particle. A detector measures a particle with total energy E = 5\ \text{GeV} and momentum pc = 4\ \text{GeV}. Its invariant mass follows immediately from the four-momentum's length:

mc^2 = \sqrt{E^2 - (pc)^2} = \sqrt{5^2 - 4^2} = \sqrt{9} = 3\ \text{GeV}.

We never needed the particle's speed. Any lab, moving any way, measuring its own E and p, reconstructs the same 3\ \text{GeV} mass — that's how new particles are identified from the debris of collisions.
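The speed was not needed, but it is available as a by-product. Assuming the standard relations E = \gamma mc^2 and \vec p = \gamma m \vec v, the numbers above give:

```latex
\frac{v}{c} = \frac{pc}{E} = \frac{4}{5}, \qquad
\gamma = \frac{E}{mc^2} = \frac{5}{3}.
```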

Example 3 — a photon has zero length. For light, E = pc, so its four-momentum invariant is E^2 - (pc)^2 = 0, giving m = 0. The photon's four-momentum is a lightlike (null) four-vector — a nonzero vector with zero length, something impossible in ordinary Euclidean space but perfectly natural once the metric carries a minus sign.
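A two-line check shows the contrast with Euclidean length. The photon below is a made-up example with E = 5 and pc = (3, 4, 0) in arbitrary energy units, so that E = |pc|; the function names are illustrative.

```python
def minkowski_sq(p):
    """Invariant length-squared with the (+, -, -, -) metric."""
    E, px, py, pz = p
    return E**2 - px**2 - py**2 - pz**2

def euclidean_sq(p):
    """What the length-squared would be if all four signs were plus."""
    return sum(component**2 for component in p)

photon = (5.0, 3.0, 4.0, 0.0)  # E = |pc| = 5, as required for light

print(minkowski_sq(photon))  # 0.0: a nonzero vector with zero Minkowski length
print(euclidean_sq(photon))  # 50.0: in Euclidean space only the zero vector has zero length
```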

You can put the minus sign on the time component instead, and some books do: they use the "mostly plus" metric \eta_{\mu\nu} = \text{diag}(-1,+1,+1,+1), which just flips the overall sign of every interval (timelike becomes negative). What you can never do is make all four signs the same. The single relative minus sign is the entire physical content: it's what makes spacetime Lorentzian rather than Euclidean, what separates "time" from "space", what bends circles into hyperbolas, and what enforces the cosmic speed limit. Erase the minus and you erase relativity: with four plus signs, "boosts" would be ordinary rotations, there'd be no light cone, no causal order, and no maximum speed. That one minus sign is doing the work of the whole theory. (Just pick a convention and stick with it; mixing them mid-calculation is a classic way to lose a sign.)
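The effect of each choice can be seen numerically. This sketch evaluates the interval of two made-up events under three sign patterns; the constant names and the interval helper are illustrative, and the all-plus pattern is included only to show what goes wrong.

```python
MOSTLY_MINUS = (1, -1, -1, -1)   # convention used on this page
MOSTLY_PLUS = (-1, 1, 1, 1)      # the other common convention
ALL_PLUS = (1, 1, 1, 1)          # Euclidean: not a valid spacetime metric

def interval(x, signature):
    """Sum of signed squares of the components (ct, x, y, z)."""
    return sum(sign * component**2 for sign, component in zip(signature, x))

timelike_event = (5.0, 4.0, 0.0, 0.0)
lightlike_event = (3.0, 3.0, 0.0, 0.0)

for signature in (MOSTLY_MINUS, MOSTLY_PLUS, ALL_PLUS):
    print(signature,
          interval(timelike_event, signature),
          interval(lightlike_event, signature))
```

The two Lorentzian conventions give 9 and -9 for the timelike event and both give 0 for the lightlike one, so they agree on the light cone. The all-plus pattern gives 41 and 18: nothing nonzero ever has zero length, so there is no light cone to speak of.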

The superscript on x^\mu is a label, not an exponent. This trips up everyone at first. In x^\mu = (ct, x, y, z), the symbol x^2 means "component number 2" (the y-coordinate), not "x squared". Two habits keep you safe:

When you genuinely square a component, wrap it: write (x^1)^2 for "the x-component, squared". The interval is (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2, parentheses and all.
The Einstein summation only fires when an index appears once up and once down (x^\mu x_\mu), which is why the metric \eta_{\mu\nu} (two down-indices) is there to "lower" one of them. Two up indices, as in x^\mu x^\mu, do not form a valid contraction; you need the metric in between.
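Written out with the metric used on this page, lowering an index flips the sign of the three space components, and the up-down contraction reproduces the interval:

```latex
x_\mu = \eta_{\mu\nu}\,x^\nu = (ct,\ -x,\ -y,\ -z), \qquad
x^\mu x_\mu = (ct)^2 - x^2 - y^2 - z^2 = s^2 .
```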