Translation, Rotation, Scale

Every object in a game has three things going for it: where it is, which way it faces, and how big it is. Those are governed by exactly three transforms — translation (move it), rotation (turn it) and scale (resize it). Master these three and you can pose anything, from a coin to a cathedral. We take them one at a time, watching each one act on a single point \vec{x} = (x, y).

Scale: stretch each axis

Step 1 — pick a stretch factor per axis. To make something twice as wide and half as tall you multiply x by s_x = 2 and y by s_y = \tfrac12. Independent stretches along the axes are exactly what a diagonal matrix does:

S = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix}, \qquad \text{in 3-D} \quad S = \operatorname{diag}(s_x, s_y, s_z).

Step 2 — apply it to the point. The off-diagonal zeros mean each output coordinate sees only its own input:

S\vec{x} = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} s_x\, x \\ s_y\, y \end{bmatrix}.

So (3, 4) under \operatorname{diag}(2, \tfrac12) lands at (6, 2). (A negative factor flips that axis — that is a reflection.)
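The two steps above can be sketched in a few lines of plain Python. The helper names `mat_vec` and `scale_matrix` are made up for this illustration and are not part of any engine or library; a matrix is assumed to be a list of rows.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def scale_matrix(sx, sy):
    """Build S = diag(sx, sy); the off-diagonal zeros keep the axes independent."""
    return [[sx, 0],
            [0, sy]]


# Twice as wide, half as tall: the worked example from the text.
print(mat_vec(scale_matrix(2, 0.5), [3, 4]))   # [6, 2.0]

# A negative factor flips that axis: here x is mirrored, y is untouched.
print(mat_vec(scale_matrix(-1, 1), [3, 4]))    # [-3, 4]
```

Writing the multiply out in full, rather than just returning `(sx * x, sy * y)`, makes it visible that the zeros are what stop x from leaking into y.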

Rotation: turn without warping

Step 3 — read off where the axes go. Turning the plane counterclockwise by \theta sends \hat{\imath} to (\cos\theta, \sin\theta) and \hat{\jmath} to (-\sin\theta, \cos\theta). Stacking those as columns gives the rotation matrix:

R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}.

Step 4 — apply it to the point.

R(\theta)\vec{x} = \begin{bmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \end{bmatrix}.

Step 5 — notice what it preserves. Its columns are perpendicular unit vectors, so R^{\top}R = I: R is an orthogonal matrix. That is precisely the algebra of "turn without warping" — lengths and angles survive, nothing stretches, and undoing a turn is just R(\theta)^{-1} = R(-\theta) = R(\theta)^{\top}.
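Steps 3 to 5 can be checked numerically. The sketch below uses only Python's `math` module; `rotation_matrix`, `mat_vec` and `transpose` are illustrative helper names, angles are assumed to be in radians, and comparisons use a tolerance because cosines and sines are floating-point values.

```python
import math


def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def rotation_matrix(theta):
    """R(theta): the columns are the images of i-hat and j-hat."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s,  c]]


theta = math.radians(90)
R = rotation_matrix(theta)

# A quarter turn carries (1, 0) to (0, 1), up to rounding.
p = mat_vec(R, [1.0, 0.0])
print(math.isclose(p[0], 0.0, abs_tol=1e-12), math.isclose(p[1], 1.0))  # True True

# Lengths survive: (3, 4) has length 5 before and after the turn.
print(math.isclose(math.hypot(*mat_vec(R, [3.0, 4.0])), 5.0))  # True

# Undoing the turn needs no matrix inversion: the transpose does it.
back = mat_vec(transpose(R), p)
print(math.isclose(back[0], 1.0), math.isclose(back[1], 0.0, abs_tol=1e-12))  # True True
```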

Translation: the one that needs a trick

Step 6 — try to slide with a matrix. A translation by \vec{t} = (t_x, t_y) should send \vec{x} to \vec{x} + \vec{t}. But every linear map fixes the origin, M\vec{0} = \vec{0}, so no plain 2\times 2 matrix can move (0,0) to a nonzero (t_x, t_y). Translation is therefore not a linear map; it simply adds \vec{t} to every point:

\vec{x} \;\longmapsto\; \vec{x} + \vec{t} = \begin{bmatrix} x + t_x \\ y + t_y \end{bmatrix}.
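The obstacle is easy to see by experiment. In this small sketch the sample matrices are arbitrary picks and `mat_vec` and `translate` are illustrative helpers: every 2×2 matrix leaves the origin where it is, while the slide does not.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def translate(v, t):
    """Slide the point v by the offset t (plain addition, no matrix)."""
    return [v[0] + t[0], v[1] + t[1]]


origin = [0, 0]

# A scale, a quarter turn, and an arbitrary matrix: all arbitrary samples.
samples = [
    [[2, 0], [0, 0.5]],
    [[0, -1], [1, 0]],
    [[1, 3], [-2, 7]],
]

# Every entry gets multiplied by zero, so the origin cannot move.
print(all(mat_vec(m, origin) == [0, 0] for m in samples))  # True

# The translation does move it, which is why it cannot be any of those maps.
print(translate(origin, [5, -2]))  # [5, -2]
```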

Step 7 — fix it with one extra coordinate. Write the point as (x, y, 1) and use a 3\times 3 matrix with the translation tucked into its last column — homogeneous coordinates, the subject of the next lesson:

\begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} x + t_x \\ y + t_y \\ 1 \end{bmatrix}.
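Here is the same trick as a minimal sketch, with `translation_matrix` as a made-up helper name and the offset (5, -2) chosen only for illustration. The point is written with a trailing 1, and that 1 is what picks up the last column.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def translation_matrix(tx, ty):
    """3x3 homogeneous translation: identity with (tx, ty) in the last column."""
    return [[1, 0, tx],
            [0, 1, ty],
            [0, 0, 1]]


T = translation_matrix(5, -2)

# The point (3, 4) is written as (3, 4, 1) and slides to (8, 2).
print(mat_vec(T, [3, 4, 1]))   # [8, 2, 1]

# The origin finally moves, which no 2x2 matrix could manage.
print(mat_vec(T, [0, 0, 1]))   # [5, -2, 1]
```

In both results the last coordinate comes back as 1, so the output is again a valid point in the same convention.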

With that one bookkeeping coordinate, the slide that no 2\times 2 matrix could express becomes an ordinary matrix multiply. Rotation and scale fit the same 3\times 3 form (place the 2\times 2 matrix in the upper-left block, 1 in the bottom-right corner, and zeros elsewhere), so all three live together. That unification is the whole point of what follows.
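To make "the same matrix type" concrete, the sketch below writes all three transforms as 3×3 matrices and applies each one on its own to the point (3, 4, 1). The helper names are illustrative, and the matrices are deliberately not multiplied together here, since combining them is left to the later lessons.

```python
import math


def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def scale_matrix3(sx, sy):
    """diag(sx, sy) padded to 3x3."""
    return [[sx, 0, 0],
            [0, sy, 0],
            [0, 0, 1]]


def rotation_matrix3(theta):
    """R(theta) padded to 3x3; theta is in radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0],
            [s,  c, 0],
            [0,  0, 1]]


def translation_matrix3(tx, ty):
    """Identity with (tx, ty) in the last column."""
    return [[1, 0, tx],
            [0, 1, ty],
            [0, 0, 1]]


p = [3, 4, 1]

scaled = mat_vec(scale_matrix3(2, 0.5), p)
print(scaled)   # [6, 2.0, 1]

moved = mat_vec(translation_matrix3(5, -2), p)
print(moved)    # [8, 2, 1]

# A quarter turn sends (3, 4) to roughly (-4, 3); compare with a tolerance.
turned = mat_vec(rotation_matrix3(math.radians(90)), p)
print(math.isclose(turned[0], -4.0), math.isclose(turned[1], 3.0))  # True True
```

All three calls go through the same `mat_vec`, and in each result the trailing coordinate is still 1.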

The three transforms that pose any object are translation, rotation, and scale.

These three are not an arbitrary list: they are the data every game object carries. A transform component in an engine (Unity's Transform, Unreal's FTransform) stores essentially this triple:

\text{transform} = \big(\underbrace{\vec{t}}_{\text{position}},\; \underbrace{R}_{\text{orientation}},\; \underbrace{\vec{s}}_{\text{size}}\big).

Where it is, which way it faces, how big it is: translation, rotation, scale. Nothing else is needed to place a rigid object in the world. In the lessons ahead we'll bake these three into a single model matrix, one multiply per vertex, and discover that the order in which you combine them matters.
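As a rough picture of that triple in code, here is a hypothetical 2-D `Transform2D` class. It is not Unity's or Unreal's API: the field names are invented, a single angle in radians stands in for R, and the order used in `apply` (scale, then rotate, then translate) is an assumption made for this sketch, since the text defers the question of order to later lessons.

```python
import math
from dataclasses import dataclass


@dataclass
class Transform2D:
    """Hypothetical container for (position, orientation, size) in 2-D."""
    position: tuple = (0.0, 0.0)   # translation t
    angle: float = 0.0             # orientation, in radians (stands in for R)
    scale: tuple = (1.0, 1.0)      # per-axis size s

    def apply(self, point):
        """Place a local point in the world.

        Assumed order for this sketch: scale, then rotate, then translate.
        """
        x, y = point
        x, y = self.scale[0] * x, self.scale[1] * y    # scale
        c, s = math.cos(self.angle), math.sin(self.angle)
        x, y = c * x - s * y, s * x + c * y            # rotate
        return (x + self.position[0], y + self.position[1])  # translate


# Doubled in size, turned a quarter turn, then moved 10 units along x.
t = Transform2D(position=(10.0, 0.0), angle=math.radians(90), scale=(2.0, 2.0))
wx, wy = t.apply((1.0, 0.0))
print(math.isclose(wx, 10.0), math.isclose(wy, 2.0))  # True True
```

The default field values make a freshly created `Transform2D()` the identity: points come back unchanged.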