Transforming Normals

A surface normal is the little arrow sticking straight out of a surface, perpendicular to it. Lighting lives or dies by it: how bright a face looks is set by the angle between its normal and the light. So when you transform a mesh — scale a character, stretch a crate — the normals must come along. The obvious move is to push them through the model matrix M, the same matrix that moves the vertices. Tempting, and wrong. Under a non-uniform scale or a shear, applying M to a normal tilts it off the surface, and your lighting goes subtly, maddeningly bad.

First, normals are directions, not places: they carry w = 0 in homogeneous coordinates, so translation never touches them. Only the linear part of M — rotation, scale, shear — can do damage. Let us find what does the job correctly, by demanding the one thing a normal must never lose: perpendicularity.
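To see the w = 0 claim in numbers, here is a minimal sketch in plain Python. The 4x4 matrix `M4`, the helper `mat_vec`, and the sample vectors are all made up for illustration; they are not from any particular engine.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# Hypothetical model matrix: scale x by 2, then translate by (5, -3, 7).
M4 = [
    [2.0, 0.0, 0.0,  5.0],
    [0.0, 1.0, 0.0, -3.0],
    [0.0, 0.0, 1.0,  7.0],
    [0.0, 0.0, 0.0,  1.0],
]

point     = [1.0, 1.0, 1.0, 1.0]  # w = 1: a place, picks up the translation
direction = [1.0, 1.0, 1.0, 0.0]  # w = 0: a direction, ignores the translation

moved_point = mat_vec(M4, point)
moved_direction = mat_vec(M4, direction)

print(moved_point)      # the translation column shows up here
print(moved_direction)  # only the linear part (the scale) acts here
```

The last column of the matrix is multiplied by w, so with w = 0 it contributes nothing and only the upper-left 3x3 block matters.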

Deriving the normal matrix

Fix a point on the surface. Let \vec{t} be a tangent there (an arrow lying along the surface) and \vec{n} the normal. Perpendicular means their dot product is zero. Writing the dot product as a row times a column:

\vec{n}^{\top}\vec{t} = 0.

Step 1 — see how a tangent transforms. A tangent is the difference of two nearby surface points, so it rides the vertices: under the linear part M it becomes

\vec{t}\,' = M\vec{t}.

Step 2 — demand the new normal still be perpendicular. Call the unknown, correctly transformed normal \vec{n}\,'. Whatever it is, it must satisfy the same perpendicularity against the new tangent:

(\vec{n}\,')^{\top}\,\vec{t}\,' = 0.

Step 3 — substitute the transformed tangent. Put \vec{t}\,' = M\vec{t} in:

(\vec{n}\,')^{\top} M \vec{t} = 0.

Step 4 — make the middle collapse to the identity. We already know \vec{n}^{\top}\vec{t} = 0. If (\vec{n}\,')^{\top} M could be turned into \vec{n}^{\top}, we'd be done. That happens if an M^{-1} lands directly to the left of M, because M^{-1}M = I (this requires M to be invertible). Suppose \vec{n}\,' = (M^{-1})^{\top}\vec{n}; then (\vec{n}\,')^{\top} = \vec{n}^{\top} M^{-1} (using (AB)^{\top} = B^{\top}A^{\top} and ((M^{-1})^{\top})^{\top} = M^{-1}), and so

(\vec{n}\,')^{\top} M \vec{t} = \vec{n}^{\top} M^{-1} M \vec{t} = \vec{n}^{\top} \vec{t} = 0. \;\checkmark

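Step 5 — read off the normal matrix. For an invertible M the transformed tangents still span a plane, and a plane has only one perpendicular direction, so up to length the choice made in Step 4 is the only one that works. The normal must be transformed not by M, but by the inverse-transpose of its linear part, the normal matrix: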

\vec{n}\,' = (M^{-1})^{\top}\,\vec{n} = M^{-\top}\vec{n}.
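The derivation can be checked numerically. The sketch below works in 2D to keep the inverse short, and assumes a surface tilted at 45 degrees with tangent (1, 1) and normal (-1, 1). The helper names and the two test matrices (a non-uniform stretch and a shear) are illustrative choices, not part of the text above.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def inverse_transpose_2x2(m):
    """Return (M^-1)^T for an invertible 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("M must be invertible")
    # M^-1 = [[d, -b], [-c, a]] / det; transposing swaps the off-diagonal entries.
    return [[d / det, -c / det], [-b / det, a / det]]

def perpendicularity(m, t, n):
    """Dot the transformed tangent with the wrong (M n) and right (M^-T n) normals."""
    t_new = mat_vec(m, t)
    n_wrong = mat_vec(m, n)
    n_right = mat_vec(inverse_transpose_2x2(m), n)
    return dot(n_wrong, t_new), dot(n_right, t_new)

t = [1.0, 1.0]    # tangent along the assumed 45-degree surface
n = [-1.0, 1.0]   # its normal: dot(n, t) == 0

stretch = [[2.0, 0.0], [0.0, 1.0]]  # non-uniform scale
shear   = [[1.0, 1.0], [0.0, 1.0]]  # shear along x

stretch_result = perpendicularity(stretch, t, n)
shear_result = perpendicularity(shear, t, n)
print("stretch (wrong, right):", stretch_result)
print("shear   (wrong, right):", shear_result)
```

In both cases the first number is nonzero (M applied to the normal has lost perpendicularity) and the second is zero (the inverse-transpose keeps it).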

Step 6 — check the case where nothing should change: a pure rotation. If M is a rotation R, it is orthogonal, so R^{-1} = R^{\top} and therefore R^{-\top} = (R^{\top})^{\top} = R. The normal matrix collapses back to M itself:

R^{-\top} = R \quad\Longrightarrow\quad \vec{n}\,' = R\,\vec{n}.

So a rotation (or any rigid move) transforms normals exactly like vertices — no special care needed. The normal matrix only earns its keep under non-uniform scale or shear, precisely the cases where M\vec{n} would have tilted off the surface. That is the whole story: rotate freely, but the moment you squash unevenly, switch to M^{-\top}.
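The rotation check is easy to confirm in floating point. This is a small sketch in 2D with an arbitrary 30-degree angle; `rotation` and `inverse_transpose_2x2` are hypothetical helper names.

```python
import math

def inverse_transpose_2x2(m):
    """Return (M^-1)^T for an invertible 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("M must be invertible")
    return [[d / det, -c / det], [-b / det, a / det]]

def rotation(theta):
    """Counter-clockwise 2D rotation by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

R = rotation(math.radians(30.0))
R_inv_T = inverse_transpose_2x2(R)

# Compare entry by entry, allowing for rounding error.
same = all(
    math.isclose(R[i][j], R_inv_T[i][j], abs_tol=1e-12)
    for i in range(2)
    for j in range(2)
)
print(same)
```

The comparison uses a tolerance because the determinant, cos^2 + sin^2, is only equal to 1 up to rounding.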

Let a mesh be transformed by the linear part M of its model matrix (normals are directions, w = 0, so translation is irrelevant).

Applying M directly to a normal (M\vec{n}) breaks under non-uniform scale or shear — the normal tilts off the surface.
Transform normals by the normal matrix (M^{-1})^{\top} = M^{-\top} instead.
It is derived by requiring \vec{n}^{\top}\vec{t} = 0 to survive: (\vec{n}\,')^{\top}\vec{t}\,' = (M^{-\top}\vec{n})^{\top}(M\vec{t}) = \vec{n}^{\top} M^{-1} M \vec{t} = \vec{n}^{\top}\vec{t} = 0, so perpendicularity is preserved.
For a pure rotation M^{-\top} = M, so normals transform like vertices; only non-uniform scale and shear need the normal matrix.

The classic graphics ghost story. Your sphere lights perfectly. You scale it flat into a coin, a non-uniform scale of, say, (1, 1, 0.2), and now the shading is off: highlights crawl to the wrong spots, edges that should catch the light go dim, the whole thing looks faintly plastic. The vertices are fine; the normals are lying. Pushing them through the same scale matrix M shrank their component along the flattened axis, tipping them toward the plane of the coin, when the true normals of the flattened surface swing the other way, toward that axis. So M\vec{n} no longer points straight out of the surface.

The standard one-line fix: upload (M^{-1})^{\top} as the normal matrix and multiply normals by that in the shader (renormalising afterward, since the length can change). Under the flatten, the inverse-transpose stretches the normal along the squashed axis by 1/0.2 = 5, exactly opposite to the scale, and perpendicularity is restored. Same mesh, same light, correct shading. And if your transform is only ever rotation and uniform scale by a factor s, you can skip the whole dance: M^{-\top} = M / s^2, which is just M up to a harmless scalar you normalise away.
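Here is a CPU-side sketch of that fix for the coin example, in plain Python rather than shader code. It assumes a sample point on the unit sphere with normal (0.6, 0, 0.8) and tangent (0.8, 0, -0.6). The function `normal_matrix` is a hypothetical helper that uses the identity (M^{-1})^T = cofactor(M) / det(M) for an invertible 3x3 matrix.

```python
import math

def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]

def normal_matrix(m):
    """(M^-1)^T of an invertible 3x3 matrix, as cofactor(M) / det(M)."""
    (a, b, c), (d, e, f), (g, h, i) = m
    cof = [
        [e * i - f * h, f * g - d * i, d * h - e * g],
        [c * h - b * i, a * i - c * g, b * g - a * h],
        [b * f - c * e, c * d - a * f, a * e - b * d],
    ]
    det = a * cof[0][0] + b * cof[0][1] + c * cof[0][2]
    if det == 0:
        raise ValueError("M must be invertible")
    return [[x / det for x in row] for row in cof]

n = [0.6, 0.0, 0.8]    # normal at the assumed sample point on the sphere
t = [0.8, 0.0, -0.6]   # a tangent there: dot(n, t) == 0

flatten = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.2]]
t_new = mat_vec(flatten, t)
n_wrong = normalize(mat_vec(flatten, n))                 # M n
n_right = normalize(mat_vec(normal_matrix(flatten), n))  # M^-T n, renormalised
print("wrong:", dot(n_wrong, t_new), "right:", dot(n_right, t_new))

# Uniform scale: both routes give the same unit normal, so M alone is enough.
uniform = [[3.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 3.0]]
u_wrong = normalize(mat_vec(uniform, n))
u_right = normalize(mat_vec(normal_matrix(uniform), n))
print(u_wrong, u_right)
```

The cofactor form avoids computing the inverse and then transposing it. Whether a real renderer builds the matrix this way or with library inverse and transpose calls is a design choice this sketch does not settle.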

Wrong vs right, side by side

A surface (the bar) carries one normal. Drag the vertical scale slider to squash the surface non-uniformly. Two arrows respond: the wrong one applies the model matrix straight to the normal (M\vec{n}) and visibly tilts off the surface; the right one uses the normal matrix ((M^{-1})^{\top}\vec{n}) and stays glued perpendicular. Slide back to a scale of 1 (uniform) and the two arrows coincide — the bug only appears once the scale goes non-uniform. The little angle read-out tells you how far the wrong normal has drifted from 90^{\circ}.
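The angle read-out can be approximated with a few lines of Python. This sketch assumes a surface tilted at 45 degrees to the scale axis (the text does not give the tilt) and a scale matrix diag(1, s); `angle_deg` and `angles_for_scale` are hypothetical names.

```python
import math

def angle_deg(u, v):
    """Angle between two 2D vectors, in degrees."""
    cos_angle = (u[0] * v[0] + u[1] * v[1]) / (math.hypot(*u) * math.hypot(*v))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding drift
    return math.degrees(math.acos(cos_angle))

def angles_for_scale(s):
    """Angles of the wrong and right normals to the surface for M = diag(1, s)."""
    t = (1.0, 1.0)    # tangent of the assumed 45-degree surface
    n = (-1.0, 1.0)   # its normal
    t_new = (t[0], s * t[1])     # M t
    n_wrong = (n[0], s * n[1])   # M n
    n_right = (n[0], n[1] / s)   # M^-T n, since M^-T = diag(1, 1/s)
    return angle_deg(n_wrong, t_new), angle_deg(n_right, t_new)

for s in (1.0, 0.75, 0.5, 0.25):
    wrong, right = angles_for_scale(s)
    print(f"scale {s:4.2f}: wrong normal {wrong:7.2f} deg, right normal {right:6.2f} deg")
```

At s = 1 both angles are 90 degrees. As s shrinks, the wrong normal drifts away from 90 degrees while the right one stays put.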