2-D Transforms as Matrices

Open any drawing app, game editor or vector illustrator and you will spend your day doing four things to shapes: sliding them across the canvas, turning them, growing or shrinking them, and occasionally slanting them. On a computer every one of those moves is the same operation underneath — a shape is just a list of corner points, and moving the shape means running each corner through a little matrix. Learn the matrix and you have learned every 2-D transform at once.

A point in the plane is a column vector \begin{bmatrix}x\\y\end{bmatrix}. The engine of computer graphics is the fact that a matrix times a vector is a transformation: a 2\times2 matrix M sends every corner (x,y) to a new corner M\begin{bmatrix}x\\y\end{bmatrix}. Apply the same M to all of a shape's vertices and the whole shape moves as one.

M\begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}a & b\\ c & d\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}ax + by\\ cx + dy\end{bmatrix}.
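The formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production routine: the helper names `apply` and `transform_shape`, the unit square and the sample matrix are all assumptions chosen for the example.

```python
def apply(m, point):
    """Multiply a 2x2 matrix (row-major nested tuples) by a point (x, y)."""
    (a, b), (c, d) = m
    x, y = point
    return (a * x + b * y, c * x + d * y)


def transform_shape(m, vertices):
    """Run every vertex of a shape through the same matrix."""
    return [apply(m, v) for v in vertices]


# Hypothetical example data: a unit square and an arbitrary matrix M.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
M = ((2, 1),
     (0, 1))

moved = transform_shape(M, square)
print(moved)  # [(0, 0), (2, 0), (3, 1), (1, 1)]
```

Every vertex goes through the same `M`, so the square moves as one shape.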

The three matrices you use every day

Scale stretches the axes independently — factor s_x along x and s_y along y. Set them equal for a uniform zoom; a negative factor flips (reflects) that axis:

S = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix}.

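Rotate spins the plane about the origin by an angle \theta, counterclockwise for positive \theta when the y-axis points up (on a screen whose y-axis points down, the same matrix appears to turn clockwise). It is built straight out of sine and cosine, and it is the very same rotation matrix that spins every icon on your screen: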

R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}.

Shear slants a shape sideways, turning a square into a leaning parallelogram — the "italic" transform. A horizontal shear by factor k pushes points rightwards in proportion to their height:

H = \begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}.

Scale, rotate and shear all fix the origin — the point (0,0) maps to itself, because M\cdot\mathbf{0} = \mathbf{0} for any matrix. That is exactly why the fourth move, translation (sliding), is the odd one out.
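As a rough sketch, the three matrices can be written as small constructor functions. The names `scale`, `rotate`, `shear_x` and `apply` are illustrative assumptions, and angles are taken in radians.

```python
import math


def scale(sx, sy):
    """Scale by sx along x and sy along y."""
    return ((sx, 0.0), (0.0, sy))


def rotate(theta):
    """Rotate about the origin by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))


def shear_x(k):
    """Horizontal shear: x picks up k times y, y is unchanged."""
    return ((1.0, k), (0.0, 1.0))


def apply(m, point):
    """Multiply a 2x2 matrix by a point (x, y)."""
    (a, b), (c, d) = m
    x, y = point
    return (a * x + b * y, c * x + d * y)


corner = (1.0, 1.0)
scaled = apply(scale(2.0, 0.5), corner)        # (2.0, 0.5)
turned = apply(rotate(math.pi / 2), corner)    # approximately (-1.0, 1.0)
slanted = apply(shear_x(0.5), corner)          # (1.5, 1.0)

# All three send the origin to itself.
origin_images = [
    apply(m, (0.0, 0.0))
    for m in (scale(2.0, 0.5), rotate(math.pi / 2), shear_x(0.5))
]
```

The quarter turn comes out only approximately exact because `math.cos(math.pi / 2)` is a tiny non-zero float rather than 0.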

Translation: the one that breaks the pattern

Sliding a shape by (t_x, t_y) means adding the same offset to every point, (x, y) \mapsto (x + t_x,\; y + t_y). But no 2\times2 matrix can do this, because every 2\times2 matrix pins the origin in place, and a genuine slide moves the origin. The fix graphics programmers use is beautifully simple: bolt a constant 1 onto each point as a third coordinate and use a 3\times3 affine matrix, where the extra column carries the translation:

\begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix}x\\y\\1\end{bmatrix} = \begin{bmatrix}x + t_x\\ y + t_y\\ 1\end{bmatrix}.

Now translation, rotation, scale and shear all live in one uniform 3\times3 world and can be multiplied together — the trick that makes a transform pipeline possible. It is worth its own page: homogeneous coordinates.
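The homogeneous trick can be sketched as follows. The helper names `translate` and `apply3` and the sample offsets are assumptions for illustration; the point is extended with a constant 1, multiplied by the 3x3 matrix, and the first two components are read back.

```python
def translate(tx, ty):
    """3x3 affine matrix that slides points by (tx, ty)."""
    return ((1.0, 0.0, tx),
            (0.0, 1.0, ty),
            (0.0, 0.0, 1.0))


def apply3(m, point):
    """Apply a 3x3 affine matrix to a 2-D point via the column (x, y, 1)."""
    x, y = point
    v = (x, y, 1.0)
    out = tuple(sum(row[i] * v[i] for i in range(3)) for row in m)
    # For an affine matrix the third component is still 1, so drop it.
    return (out[0], out[1])


T = translate(5.0, -2.0)
moved = apply3(T, (3.0, 1.0))    # (8.0, -1.0)
origin = apply3(T, (0.0, 0.0))   # (5.0, -2.0): the origin really does move
```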

Grab the sliders

The dashed outline is the original little house; the solid one is the house after we scale it by s and then rotate it by \theta. That is, we apply the single matrix M = R(\theta)\,S to every one of its five corners. Slide the scale down past zero and watch the house turn upside down (a uniform negative scale in 2-D is a half-turn about the origin, not a mirror image, because the determinant s^2 stays positive); sweep the angle and watch it spin. The readout shows the live entries of M.

Worked example: transform one corner by hand

Take the corner (3, 1) and scale it by 2 in x and 3 in y:

\begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}\begin{bmatrix}3\\1\end{bmatrix} = \begin{bmatrix}2\cdot 3 + 0\cdot 1\\ 0\cdot 3 + 3\cdot 1\end{bmatrix} = \begin{bmatrix}6\\3\end{bmatrix}.

Now shear that result horizontally by k = 2. Only the x-coordinate changes, by k times the height:

\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}\begin{bmatrix}6\\3\end{bmatrix} = \begin{bmatrix}6 + 2\cdot 3\\ 3\end{bmatrix} = \begin{bmatrix}12\\3\end{bmatrix}.

The point that started at (3,1) now sits at (12,3). Do that to all four corners of a square and you get a stretched, slanted parallelogram — a whole shape transformed, one matrix-times-vector at a time.
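The same hand calculation can be checked in code, both step by step and with the two matrices multiplied together first. This is a small sketch; `matmul` and `apply` are illustrative helper names, not part of any library.

```python
def matmul(p, q):
    """Product of two 2x2 matrices stored as row-major nested tuples."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def apply(m, point):
    """Multiply a 2x2 matrix by a point (x, y)."""
    (a, b), (c, d) = m
    x, y = point
    return (a * x + b * y, c * x + d * y)


S = ((2, 0), (0, 3))   # scale by 2 in x and 3 in y
H = ((1, 2), (0, 1))   # horizontal shear with k = 2

step_by_step = apply(H, apply(S, (3, 1)))   # (12, 3)

combined = matmul(H, S)                     # ((2, 6), (0, 3))
in_one_go = apply(combined, (3, 1))         # (12, 3)
```

Scale is applied first, so it sits on the right of the product `H S`.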

Order matters, and area is the determinant

Doing scale-then-rotate is the matrix R\,S; doing rotate-then-scale is S\,R. Matrix multiplication does not generally commute, so in general R\,S \neq S\,R (a uniform scale, s_x = s_y, is the exception: it commutes with every rotation). Swapping the order of two transforms usually gives a different final shape. This trips up beginners constantly, so it is worth burning in: the transform written on the right is applied to the point first.
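A quick numerical check of the ordering rule, assuming a quarter-turn rotation (written with exact integers to avoid rounding) and a scale of 2 along x only. The helper names are illustrative.

```python
def matmul(p, q):
    """Product of two 2x2 matrices stored as row-major nested tuples."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def apply(m, point):
    """Multiply a 2x2 matrix by a point (x, y)."""
    (a, b), (c, d) = m
    x, y = point
    return (a * x + b * y, c * x + d * y)


R = ((0, -1), (1, 0))   # rotation by 90 degrees
S = ((2, 0), (0, 1))    # stretch x by 2, leave y alone

point = (1, 0)
scale_then_rotate = apply(matmul(R, S), point)   # (0, 2)
rotate_then_scale = apply(matmul(S, R), point)   # (0, 1)
```

In `R S` the point is stretched to (2, 0) and then turned; in `S R` it is turned to (0, 1) first, where the x-stretch no longer has anything to act on.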

There is a lovely bonus fact. The determinant of the matrix tells you exactly how much the transform scales area: every area is multiplied by |\det M|. A pure scale S=\begin{bmatrix}s_x&0\\0&s_y\end{bmatrix} multiplies areas by |s_x s_y|; a rotation has determinant 1 and so never changes area; a shear also has determinant 1, so a slanted square has the same area as the square it came from. If the determinant is negative, the shape has been flipped (its orientation reversed).
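The determinant claim can be tested by measuring the area of a transformed unit square with the shoelace formula. This is a sketch under assumed example matrices; `det`, `signed_area` and `apply` are illustrative names. The signed area is positive for counterclockwise vertex order and negative when the orientation has been reversed.

```python
def det(m):
    """Determinant of a 2x2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c


def apply(m, point):
    """Multiply a 2x2 matrix by a point (x, y)."""
    (a, b), (c, d) = m
    x, y = point
    return (a * x + b * y, c * x + d * y)


def signed_area(polygon):
    """Shoelace formula: positive when vertices run counterclockwise."""
    total = 0
    for i, (x0, y0) in enumerate(polygon):
        x1, y1 = polygon[(i + 1) % len(polygon)]
        total += x0 * y1 - x1 * y0
    return total / 2


square = [(0, 0), (1, 0), (1, 1), (0, 1)]   # area 1, counterclockwise

stretch_and_slant = ((2, 1), (0, 3))        # determinant 6
mirror = ((-1, 0), (0, 1))                  # determinant -1

area_after = signed_area([apply(stretch_and_slant, p) for p in square])  # 6.0
area_flipped = signed_area([apply(mirror, p) for p in square])           # -1.0
```

Because the original square has area 1, the signed area of its image equals the determinant itself, sign included.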

The classic bug: you rotate a sprite that is sitting far from the origin and it swings across the whole canvas instead of turning gently in place. That is because rotation and scale are always about the origin, not about the shape's own centre. A shape whose corners are near (400, 300) gets whipped around the point (0,0) on a radius of 500 pixels.

The fix is the standard "sandwich": translate the shape so its centre is at the origin, rotate (or scale), then translate it back. As matrices, T(+c)\,R(\theta)\,T(-c) — read right to left. This rotate-about-a-point pattern is behind every "spin in place" animation you have ever seen, and forgetting the two translations is one of the most common graphics mistakes of all.
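A minimal sketch of the sandwich using 3x3 affine matrices. The function names (`rotate_about` in particular) and the sample centre are assumptions for illustration; angles are in radians.

```python
import math


def matmul3(p, q):
    """Product of two 3x3 matrices stored as row-major nested tuples."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def translate(tx, ty):
    return ((1.0, 0.0, tx), (0.0, 1.0, ty), (0.0, 0.0, 1.0))


def rotate(theta):
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))


def apply3(m, point):
    """Apply a 3x3 affine matrix to a 2-D point (x, y)."""
    x, y = point
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


def rotate_about(theta, cx, cy):
    """T(+c) R(theta) T(-c): read right to left."""
    return matmul3(translate(cx, cy),
                   matmul3(rotate(theta), translate(-cx, -cy)))


centre = (400.0, 300.0)
corner = (410.0, 300.0)      # 10 pixels to the right of the centre
quarter = math.pi / 2

naive = apply3(rotate(quarter), corner)                     # about (-300, 410)
in_place = apply3(rotate_about(quarter, *centre), corner)   # about (400, 310)
fixed = apply3(rotate_about(quarter, *centre), centre)      # about (400, 300)
```

The naive rotation throws the corner to the far side of the origin, while the sandwich keeps the centre fixed and moves the corner a quarter turn around it.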

In a real 2-D graphics engine you almost never apply four separate matrices. Instead you multiply the scale, rotation, shear and translation matrices together once into a single combined 3\times3 matrix, then throw every vertex of the shape through that one matrix. Thousands of points, one matrix. When you write CSS like transform: translate(20px,5px) rotate(30deg) scale(1.5), the browser is silently collapsing all of that into a single matrix before it touches a single pixel. You can even read it back out with getComputedStyle as matrix(a, b, c, d, e, f). Those six numbers are the top two rows of the 3\times3 affine matrix on this page, listed column by column: a and b are the first column, c and d the second, and e and f the translation (t_x, t_y).
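To make the collapse concrete, here is a rough sketch that multiplies the three matrices for translate(20px,5px) rotate(30deg) scale(1.5) and lists the result in the matrix(a, b, c, d, e, f) order. It only mirrors the arithmetic; it does not call any browser API, and the helper names (`css_matrix` included) are hypothetical.

```python
import math


def matmul3(p, q):
    """Product of two 3x3 matrices stored as row-major nested tuples."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def translate(tx, ty):
    return ((1.0, 0.0, tx), (0.0, 1.0, ty), (0.0, 0.0, 1.0))


def rotate(theta):
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))


def scale(sx, sy):
    return ((sx, 0.0, 0.0), (0.0, sy, 0.0), (0.0, 0.0, 1.0))


def css_matrix(m):
    """Read the six numbers column by column: (a, b), (c, d), (e, f)."""
    return (m[0][0], m[1][0], m[0][1], m[1][1], m[0][2], m[1][2])


# The transform list is multiplied left to right, so scale (rightmost)
# is the first thing applied to each point.
combined = matmul3(translate(20.0, 5.0),
                   matmul3(rotate(math.radians(30)), scale(1.5, 1.5)))

a, b, c, d, e, f = css_matrix(combined)
print(f"matrix({a:.4f}, {b:.4f}, {c:.4f}, {d:.4f}, {e:.4f}, {f:.4f})")
```

Here a and d come out as 1.5 cos 30°, b and c as plus and minus 1.5 sin 30°, and e, f as the translation (20, 5). The exact number formatting a browser reports may differ from this printout.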