Homogeneous Coordinates in Practice

Open almost any vertex shader and you will find it doing arithmetic in four dimensions, even for a 3-D world. That fourth number is the homogeneous coordinate w, and it is the single trick that lets one matrix shape, a 4\times 4, rotate, scale and translate. Once you see why, chaining transforms collapses into "multiply by a matrix", over and over.

Points and directions, told apart by one number

Step 1 — append w to every 3-vector. A 3-D quantity (x, y, z) becomes the 4-vector (x, y, z, w). The value of w records what kind of thing it is:

\text{point} = (x, y, z, 1), \qquad \text{direction} = (x, y, z, 0).

A point is a location, so it carries w = 1. A direction ("two units east", a velocity, a surface normal) is the same arrow wherever you stand — it has no location — so it carries w = 0.
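
The tagging rule can be sketched in a few lines of plain Python. The helper names `point`, `direction` and `subtract` are hypothetical, chosen for this illustration only, and the choice of +x as "east" is an assumption. The last lines show a consequence of the rule: subtracting one point from another gives w = 1 - 1 = 0, which is a direction.

```python
def point(x, y, z):
    """Tag a location: w = 1."""
    return (x, y, z, 1.0)

def direction(x, y, z):
    """Tag an arrow with no location: w = 0."""
    return (x, y, z, 0.0)

def subtract(a, b):
    """Component-wise a - b, including the w component."""
    return tuple(ai - bi for ai, bi in zip(a, b))

corner = point(2.0, 3.0, 5.0)
origin = point(0.0, 0.0, 0.0)
east = direction(1.0, 0.0, 0.0)   # assumes +x means "east"

# The difference of two points has w = 1 - 1 = 0: it is a direction.
offset = subtract(corner, origin)
print(offset)   # (2.0, 3.0, 5.0, 0.0)
```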

Step 2 — build the 4\times 4 with translation in the last column. The linear part (rotation and/or scale) M goes in the top-left 3\times 3 block, the translation \vec{t} in the last column, and the bottom row is (0, 0, 0, 1):

T = \left[\begin{array}{ccc|c} & & & t_x \\ & M & & t_y \\ & & & t_z \\ \hline 0 & 0 & 0 & 1 \end{array}\right].
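
As a minimal sketch of that layout, the hypothetical helper `make_transform` below assembles the 4\times 4 as a list of rows from a 3\times 3 block M and a translation t. It assumes the column-vector convention used in this section (the matrix multiplies a column vector from the left); libraries that use row vectors, or that store matrices transposed, will place the translation differently.

```python
def make_transform(M, t):
    """Assemble a 4x4 (list of rows) from a 3x3 linear part M and a translation t."""
    return [
        [M[0][0], M[0][1], M[0][2], t[0]],
        [M[1][0], M[1][1], M[1][2], t[1]],
        [M[2][0], M[2][1], M[2][2], t[2]],
        [0.0, 0.0, 0.0, 1.0],          # bottom row is always (0, 0, 0, 1)
    ]

IDENTITY3 = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

# Pure translation: M = I, t = (10, 20, 30).
T = make_transform(IDENTITY3, (10.0, 20.0, 30.0))
for row in T:
    print(row)
```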

Why the same matrix moves points but not directions

Step 3 — apply it to a point (w = 1). Take pure translation (M = I) to isolate the effect, and multiply row by row:

\begin{bmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = \begin{bmatrix} x + t_x \\ y + t_y \\ z + t_z \\ 1 \end{bmatrix}.

The translation column is multiplied by w = 1, so it adds in full: the point slides by \vec{t}, and w stays 1 — still a point.
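
The same multiplication with concrete numbers, as a small sketch. The helper `mat_vec` and the sample values are illustrative assumptions, not part of any particular graphics API.

```python
def mat_vec(A, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-vector, row by row."""
    return [sum(A[r][c] * v[c] for c in range(4)) for r in range(4)]

# Pure translation by t = (4, -2, 7).
T = [
    [1, 0, 0, 4],
    [0, 1, 0, -2],
    [0, 0, 1, 7],
    [0, 0, 0, 1],
]

p = [1, 2, 3, 1]        # a point: w = 1
moved = mat_vec(T, p)
print(moved)            # [5, 0, 10, 1]
```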

Step 4 — feed the same matrix a direction (w = 0).

\begin{bmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} d_x \\ d_y \\ d_z \\ 0 \end{bmatrix} = \begin{bmatrix} d_x \\ d_y \\ d_z \\ 0 \end{bmatrix}.

Step 5 — read off the result. Now the translation column is multiplied by w = 0, so it contributes nothing — the direction comes out unchanged. That is exactly right: you can move a location, but "east" is east no matter where you stand. The single number w is what decides whether the translation column gets to act.
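
To see the w = 0 case numerically, the sketch below feeds one translation matrix both a point and a direction with the same x, y, z. The helper `mat_vec` and the sample values are assumptions made for illustration.

```python
def mat_vec(A, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-vector, row by row."""
    return [sum(A[r][c] * v[c] for c in range(4)) for r in range(4)]

# Pure translation by t = (4, -2, 7).
T = [
    [1, 0, 0, 4],
    [0, 1, 0, -2],
    [0, 0, 1, 7],
    [0, 0, 0, 1],
]

as_point = mat_vec(T, [1, 0, 0, 1])      # w = 1: translation column adds in full
as_direction = mat_vec(T, [1, 0, 0, 0])  # w = 0: translation column drops out

print(as_point)       # [5, -2, 7, 1]
print(as_direction)   # [1, 0, 0, 0]
```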

Step 6 — collect the payoff. Rotation and scale already live in that top-left block, and now translation lives in the last column of the same matrix shape. So one 4\times 4 can express R, S and T, and chaining transforms is just multiplying 4\times 4 matrices. The whole pipeline becomes a stack of 4\times 4 multiplies — which is precisely the operation GPUs are built to do by the million.
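
Chaining can be sketched as follows, assuming column vectors so that the rightmost matrix acts first. The helpers `mat_mul` and `mat_vec` are hypothetical, and the rotation is a quarter turn about z chosen so that every entry stays an exact integer. The second product shows that the order of the factors matters.

```python
def mat_mul(A, B):
    """Product of two 4x4 matrices stored as lists of rows."""
    return [[sum(A[r][k] * B[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def mat_vec(A, v):
    """Multiply a 4x4 matrix by a 4-vector."""
    return [sum(A[r][c] * v[c] for c in range(4)) for r in range(4)]

# Rotation by 90 degrees about the z axis (linear part only).
R = [
    [0, -1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

# Translation by (5, 0, 0).
T = [
    [1, 0, 0, 5],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

p = [1, 0, 0, 1]

rotate_then_translate = mat_mul(T, R)   # R acts first, then T
translate_then_rotate = mat_mul(R, T)   # T acts first, then R

print(mat_vec(rotate_then_translate, p))   # [5, 1, 0, 1]
print(mat_vec(translate_then_rotate, p))   # [0, 6, 0, 1]
```

Multiplying the matrices once and reusing the product gives the same result as applying them one after the other, which is why a whole chain can be collapsed into a single 4\times 4 before any vertex is touched.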

In summary: embed 3-D space in 4-D by appending a coordinate w. Then:

Every quantity is a 4-vector (x, y, z, w).
A point has w = 1 (it has a location); a direction has w = 0 (it does not).
One 4\times 4 matrix expresses rotation, scale and translation together — translation sits in the last column.
Translation only touches points: the last column is multiplied by w, so it adds for w = 1 and vanishes for w = 0.

The w = 0 rule is not pedantry; get it wrong and your lighting breaks. A surface normal is a direction: it says which way a face points, and translating the whole object must not drag the normal off into the distance. Tag it (n_x, n_y, n_z, 0) and the model matrix's translation column is multiplied by zero and drops out, so the normal rotates with the object but never slides.
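
A small sketch of what goes wrong. The model matrix here is an assumed example (a quarter turn about z plus a large translation), and `mat_vec` is a hypothetical helper. The same unit normal is transformed twice, once tagged correctly with w = 0 and once mistakenly with w = 1.

```python
def mat_vec(A, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-vector."""
    return [sum(A[r][c] * v[c] for c in range(4)) for r in range(4)]

# Assumed model matrix: rotate 90 degrees about z, then translate by (100, 0, 0).
model = [
    [0, -1, 0, 100],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

normal_ok = mat_vec(model, [1, 0, 0, 0])    # tagged as a direction
normal_bad = mat_vec(model, [1, 0, 0, 1])   # mistakenly tagged as a point

print(normal_ok)    # [0, 1, 0, 0]
print(normal_bad)   # [100, 1, 0, 1]
```

With w = 0 the normal is rotated and still has unit length. With w = 1 the translation leaks in, and the result points almost entirely along +x instead of +y, so any lighting computed from it would be wrong.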

(There is a subtlety waiting downstream: under a non-uniform scale a normal must be transformed by the inverse-transpose of M, not M itself, or it stops being perpendicular to the surface — a story for transforming normals. But the first rule, before any of that, is simply: a direction carries w = 0.)
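
For the curious, that subtlety can be previewed with a deliberately simple sketch. It assumes a diagonal (axis-aligned) scale, for which the inverse-transpose is just the reciprocal of each scale factor; a general M would need a real matrix inverse. All names and values below are illustrative assumptions, and the resulting normals are left unnormalized.

```python
def scale_vec(s, v):
    """Apply a diagonal 3x3 matrix with entries s to the 3-vector v."""
    return [s[i] * v[i] for i in range(3)]

def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))

scale = [2.0, 1.0, 1.0]                     # non-uniform: stretch x only
inv_transpose = [1.0 / s for s in scale]    # valid for a diagonal matrix only

tangent = [1.0, -1.0, 0.0]   # a direction lying in the surface
normal = [1.0, 1.0, 0.0]     # perpendicular to it before scaling

scaled_tangent = scale_vec(scale, tangent)
naive_normal = scale_vec(scale, normal)           # transformed by M itself
fixed_normal = scale_vec(inv_transpose, normal)   # transformed by inverse-transpose

print(dot(scaled_tangent, naive_normal))   # 3.0
print(dot(scaled_tangent, fixed_normal))   # 0.0
```

A dot product of zero means the normal is still perpendicular to the surface; only the inverse-transpose version preserves that here.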