A surface normal is the little arrow sticking straight out of a surface,
perpendicular to it. Lighting lives or dies by it: how bright a face looks is set by the angle
between its normal and the light. So when you transform a mesh — scale a character, stretch a
crate — the normals must come along. The obvious move is to push them through the
model matrix M, the same matrix that moves the
vertices. Tempting, and wrong. Under a non-uniform scale or a shear, applying
M to a normal tilts it off the surface, and your lighting goes
subtly, maddeningly bad.
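To see the failure in numbers, here is a minimal sketch. The matrix, the vectors and the helper names (`mat_vec`, `dot`) are illustrative assumptions, not taken from any particular engine. It stretches a surface along x and pushes both the tangent and the normal through the same matrix.

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

# Non-uniform scale: stretch x by 2, leave y and z alone.
M = [[2.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

tangent = [1.0, -1.0, 0.0]  # an arrow lying along the surface
normal = [1.0, 1.0, 0.0]    # perpendicular to that tangent

before = dot(normal, tangent)                          # 0.0: perpendicular
after = dot(mat_vec(M, normal), mat_vec(M, tangent))   # 3.0: no longer perpendicular
print(before, after)
```

The tangent is transformed correctly (it rides the vertices), so the non-zero result is entirely the normal's fault.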
First, normals are directions, not places: they carry w = 0
in homogeneous coordinates, so translation never touches them. Only the linear part of
M — rotation, scale, shear — can do damage. Let us find what does
the job correctly, by demanding the one thing a normal must never lose: perpendicularity.
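The w = 0 claim is easy to check by hand or in a few lines. The sketch below assumes a column-vector convention with the translation in the last column of a 4x4 matrix; the names `T`, `point` and `direction` are made up for the illustration.

```python
def mat4_vec4(m, v):
    """Multiply a 4x4 matrix (list of rows) by a homogeneous 4-vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

# Pure translation by (5, -3, 2).
T = [[1.0, 0.0, 0.0, 5.0],
     [0.0, 1.0, 0.0, -3.0],
     [0.0, 0.0, 1.0, 2.0],
     [0.0, 0.0, 0.0, 1.0]]

point = [1.0, 2.0, 3.0, 1.0]      # w = 1: a place
direction = [0.0, 0.0, 1.0, 0.0]  # w = 0: a direction, such as a normal

moved_point = mat4_vec4(T, point)          # [6.0, -1.0, 5.0, 1.0]
moved_direction = mat4_vec4(T, direction)  # [0.0, 0.0, 1.0, 0.0], unchanged
```

The translation column is multiplied by w, so with w = 0 it contributes nothing.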
Deriving the normal matrix
Fix a point on the surface. Let \vec{t} be a
tangent there (an arrow lying along the surface) and
\vec{n} the normal. Perpendicular means their dot product is zero.
Writing the dot product as a row times a column:
\vec{n}^{\top}\vec{t} = 0.
Step 1 — see how a tangent transforms. A tangent is the difference of two
nearby surface points, so it rides the vertices: under the linear part
M it becomes
\vec{t}\,' = M\vec{t}.
Step 2 — demand the new normal still be perpendicular. Call the unknown,
correctly transformed normal \vec{n}\,'. Whatever it is, it must
satisfy the same perpendicularity against the new tangent:
(\vec{n}\,')^{\top}\,\vec{t}\,' = 0.
Step 3 — substitute the transformed tangent. Put
\vec{t}\,' = M\vec{t} in:
(\vec{n}\,')^{\top} M \vec{t} = 0.
Step 4 — engineer the middle to collapse to the identity. We already know
\vec{n}^{\top}\vec{t} = 0. If we could massage
(\vec{n}\,')^{\top} M into \vec{n}^{\top},
we'd be done. The clean way is to pick \vec{n}\,' so that an M^{-1} lands
right next to M and the pair cancels to M^{-1}M = I. Suppose
\vec{n}\,' = (M^{-1})^{\top}\vec{n}; then
(\vec{n}\,')^{\top} = \vec{n}^{\top} M^{-1}
(using (AB)^{\top} = B^{\top}A^{\top} and
((M^{-1})^{\top})^{\top} = M^{-1}), and so
(\vec{n}\,')^{\top} M \vec{t} = \vec{n}^{\top} M^{-1} M \vec{t} = \vec{n}^{\top} \vec{t} = 0. \;\checkmark
Step 5 — read off the normal matrix. Perpendicularity is preserved exactly
when the normal is transformed not by M, but by the
inverse-transpose of its linear part — the
normal matrix:
\vec{n}\,' = (M^{-1})^{\top}\,\vec{n} = M^{-\top}\vec{n}.
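The result can be checked numerically on a shear, the other troublemaker. This is a sketch under stated assumptions: the helper names and the adjugate-based `inverse` are illustrative choices for a self-contained 3x3 example, not how any given library does it.

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def transpose(m):
    return [[m[c][r] for c in range(3)] for r in range(3)]

def inverse(m):
    """Inverse of a 3x3 matrix via the adjugate; raises if singular."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("matrix is singular")
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]

# Shear: x picks up a copy of y.
M = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

normal = [1.0, 0.0, 0.0]
tangent = [0.0, 1.0, 0.0]            # perpendicular to the normal

new_tangent = mat_vec(M, tangent)    # [1.0, 1.0, 0.0]
normal_matrix = transpose(inverse(M))

naive = dot(mat_vec(M, normal), new_tangent)              # 1.0: broken
fixed = dot(mat_vec(normal_matrix, normal), new_tangent)  # 0.0: perpendicular again
```

Here the inverse-transpose sends the normal (1, 0, 0) to (1, -1, 0), which is perpendicular to the sheared tangent (1, 1, 0).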
Step 6 — check the case where nothing should change: a pure rotation. If
M is a rotation R, it is orthogonal, so
R^{-1} = R^{\top} and therefore
R^{-\top} = (R^{\top})^{\top} = R. The normal matrix collapses back
to M itself:
R^{-\top} = R \quad\Longrightarrow\quad \vec{n}\,' = R\,\vec{n}.
So a rotation (or any rigid move) transforms normals exactly like vertices — no special care
needed. The normal matrix only earns its keep under non-uniform scale or shear,
precisely the cases where M\vec{n} would have tilted off the surface.
That is the whole story: rotate freely, but the moment you squash unevenly, switch to
M^{-\top}.
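The rotation case can be confirmed the same way. The sketch below assumes a rotation of 30 degrees about the z axis (an arbitrary choice) and checks orthogonality, which is what lets the inverse be replaced by the transpose.

```python
import math

def transpose(m):
    return [[m[c][r] for c in range(3)] for r in range(3)]

def mat_mul(a, b):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

angle = math.radians(30.0)
cos_a, sin_a = math.cos(angle), math.sin(angle)
R = [[cos_a, -sin_a, 0.0],
     [sin_a,  cos_a, 0.0],
     [0.0,    0.0,   1.0]]

identity = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]

# R is orthogonal when R times its transpose is the identity (up to rounding),
# which means the inverse of R is its transpose.
product = mat_mul(R, transpose(R))
is_orthogonal = all(
    math.isclose(product[r][c], identity[r][c], abs_tol=1e-12)
    for r in range(3) for c in range(3)
)

# With the inverse equal to the transpose, the inverse-transpose is the
# transpose of the transpose, which is R itself.
normal_matrix = transpose(transpose(R))
same_as_R = normal_matrix == R
```

No inverse is ever computed here, which is the practical payoff: for rigid transforms the model matrix's linear part can be reused for normals as-is.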
Let a mesh be transformed by the linear part M of its model matrix
(normals are directions, w = 0, so translation is irrelevant).
- Applying M directly to a normal (M\vec{n}) breaks under non-uniform scale or shear: the normal tilts off the surface.
- Transform normals by the normal matrix (M^{-1})^{\top} = M^{-\top} instead.
- It is derived by requiring \vec{n}^{\top}\vec{t} = 0 to survive:
(\vec{n}\,')^{\top}\vec{t}\,' = (M^{-\top}\vec{n})^{\top}(M\vec{t}) = \vec{n}^{\top} M^{-1} M \vec{t} = \vec{n}^{\top}\vec{t} = 0,
so perpendicularity is preserved.
- For a pure rotation M^{-\top} = M, so normals transform like vertices; only non-uniform scale and shear need the normal matrix.
The classic graphics ghost story. Your sphere lights perfectly. You scale it flat into a
coin — non-uniform scale, say (1, 1, 0.2) — and now the shading is
off: highlights crawl to the wrong spots, edges that should catch the light go dim, the whole
thing looks faintly plastic. The vertices are fine; the normals are lying. Pushing
them through the same scale matrix M shrank their component along
the flattened axis, so they no longer point out of the surface:
M\vec{n} tilted them sideways, toward the coin's rim, when the flattened faces call for normals that lean further toward the axis.
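A worked instance of the coin makes the error concrete. The particular normal and tangent below are an assumed sample point (45 degrees up the sphere in the xz-plane, vectors left unnormalised for easy arithmetic), not values from the story itself.

```latex
\begin{aligned}
M &= \operatorname{diag}(1,\, 1,\, 0.2), \qquad
\vec{n} = (1,\, 0,\, 1)^{\top}, \qquad
\vec{t} = (1,\, 0,\, -1)^{\top}, \qquad
\vec{n}^{\top}\vec{t} = 1 - 1 = 0 \\
\vec{t}\,' &= M\vec{t} = (1,\, 0,\, -0.2)^{\top} \\
(M\vec{n})^{\top}\,\vec{t}\,' &= (1,\, 0,\, 0.2)\,(1,\, 0,\, -0.2)^{\top} = 1 - 0.04 = 0.96 \neq 0 \\
M^{-\top} &= \operatorname{diag}(1,\, 1,\, 5), \qquad
(M^{-\top}\vec{n})^{\top}\,\vec{t}\,' = (1,\, 0,\, 5)\,(1,\, 0,\, -0.2)^{\top} = 1 - 1 = 0
\end{aligned}
```

The naive normal (1, 0, 0.2) leans toward the rim, while the inverse-transpose gives (1, 0, 5), which leans toward the flattened axis and stays perpendicular to the transformed tangent.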
The standard one-line fix: upload (M^{-1})^{\top} as the
normal matrix and multiply normals by that in the shader (renormalising afterward, since the
length can change). Under the flatten, the inverse-transpose stretches the normal
back along the squashed axis — exactly opposite to the scale — and perpendicularity is
restored. Same mesh, same light, correct shading. And if your transform is only ever
rotation and uniform scale, you can skip the whole dance:
M^{-\top} is just M up to a harmless
scalar you normalise away.
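The shader-side step is small enough to sketch on the CPU. Everything named here (`transform_normal`, `normalize`, the hard-coded matrices) is a hypothetical illustration; a real engine would compute the normal matrix from its own model matrix and do the multiply in the shader.

```python
import math

def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

def normalize(v):
    """Scale a non-zero vector to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

def transform_normal(normal_matrix, n):
    """Apply a precomputed normal matrix, then renormalise."""
    return normalize(mat_vec(normal_matrix, n))

n = normalize([1.0, 0.0, 1.0])

# Flatten by (1, 1, 0.2): for a diagonal scale the inverse-transpose
# is simply the reciprocal scale, diag(1, 1, 5).
flatten_normal_matrix = [[1.0, 0.0, 0.0],
                         [0.0, 1.0, 0.0],
                         [0.0, 0.0, 5.0]]
coin_normal = transform_normal(flatten_normal_matrix, n)
coin_length = math.sqrt(sum(x * x for x in coin_normal))  # 1 up to rounding

# Uniform scale by 2: M = 2I and its inverse-transpose 0.5I differ only
# by a scalar, so after renormalising both give the same direction.
uniform_M = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
uniform_normal_matrix = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.5]]
via_M = transform_normal(uniform_M, n)
via_normal_matrix = transform_normal(uniform_normal_matrix, n)
```

Renormalising is what makes the uniform-scale shortcut safe: the two results agree only after the lengths are brought back to one.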