Rasterization

A modern GPU chews through billions of triangles every second, and for each one it must answer the same tiny question over and over, a few million times a frame: which pixels does this triangle cover, and what colour and depth do they get? Turning a clean geometric triangle — three corners in space, already flattened onto the screen by projection — into a solid block of lit-up pixels is called rasterization (or scan conversion). It is the beating heart of every real-time renderer, and the whole thing rests on one astonishingly simple piece of arithmetic: the edge function.

Unlike a vector drawing, which stores shapes as maths, a raster image is a grid of pixels. Rasterization is the bridge between the two worlds: it takes the maths of a triangle and decides, pixel by pixel, whether each little square is inside or outside.

The edge function: which side of a line am I on?

Take one directed edge of the triangle, running from a first corner (x_0,y_0) to a second corner (x_1,y_1). For any point (x,y) in the plane, define

E(x,y) = (x-x_0)(y_1-y_0) - (y-y_0)(x_1-x_0).

This is nothing more than a 2-D cross product of the edge vector with the vector from the edge start to the test point. Its sign is all that matters:

E(x,y) > 0 — the point lies on one side of the edge's line;
E(x,y) < 0 — it lies on the other side;
E(x,y) = 0 — it sits exactly on the line.

A triangle is just the region trapped between its three edges. So a point is inside the triangle exactly when it lies on the correct side of all three edges at once — that is, when the three edge functions E_0, E_1, E_2 all share the same sign. If even one of them disagrees, the point has slipped out past an edge and is outside. To fill a triangle, a rasterizer walks over every candidate pixel, evaluates the three edge functions at the pixel's centre (i+0.5,\,j+0.5), and lights the pixel if the signs agree. That's it — three multiplies, three subtracts, three sign checks per pixel.

Watch it fill

Below, one triangle corner is yours to move. Each shaded square is a pixel whose centre passed all three edge tests — precisely the set of pixels the GPU would switch on. Notice how the filled region never quite matches the smooth triangle: pixels are square, so edges come out as little staircases. That jaggedness is aliasing, and taming it is a whole subject of its own.

Worked example: is this pixel inside?

Take the triangle with corners A=(1,1), B=(5,1) and C=(3,5), and test the pixel whose centre is P=(3,2). We evaluate the edge function for each directed edge A\to B, B\to C, C\to A:

E_{AB} = (3-1)(1-1) - (2-1)(5-1) = 0 - 4 = -4,

E_{BC} = (3-5)(5-1) - (2-1)(3-5) = -8 + 2 = -6,

E_{CA} = (3-3)(1-5) - (2-5)(1-3) = 0 - 6 = -6.

All three are negative — the signs agree — so P=(3,2) is inside, and the pixel is filled. Now try a point just outside, Q=(1,4): you'd get E_{AB}=-12, E_{BC}=-10, but E_{CA}=+6. One sign flipped, so Q has crossed edge C\to A and lands outside. A single disagreeing sign is enough to reject a pixel.

Barycentric coverage: free interpolation

The three edge-function values aren't just sign flags — their magnitudes, divided by the triangle's total (twice its area), are the barycentric coordinates (\alpha,\beta,\gamma) of the pixel. For an interior point these are the three weights that always

\alpha + \beta + \gamma = 1, \qquad \alpha,\beta,\gamma \ge 0,

measuring how close the pixel is to each corner. This is a beautiful two-for-one deal: the very same arithmetic that decided whether a pixel is covered also tells us how to blend the three corners' data across the triangle. Colour, texture coordinates, surface normals — any per-vertex value v_A, v_B, v_C is smoothly interpolated as \alpha v_A + \beta v_B + \gamma v_C, for free, at every pixel. A red, green and blue corner melt into a smooth rainbow across the face without any extra work.

The z-buffer: who's in front?

Coverage tells us which pixels a triangle touches — but a scene has thousands of triangles, many stacked in front of one another. When two triangles both cover the same pixel, which one do we see? The nearer one. The z-buffer (depth buffer) solves this with breathtaking simplicity: alongside the colour of every pixel, store its depth — how far that surface is from the camera. Start every pixel's depth at +\infty (infinitely far). Then, as each triangle is rasterized, at every pixel it covers:

Compute the new fragment's depth z_{\text{new}} (interpolated from the corners with those same barycentric weights).
If z_{\text{new}} < z_{\text{stored}}, this fragment is nearer: overwrite both the colour and the stored depth.
Otherwise it's hidden behind something closer — discard it.

Triangles can arrive in any order and the picture still comes out right, because the buffer always remembers the closest surface seen so far. It is essentially a running \min over depth, computed independently at every pixel.

Worked example: a depth compare

Suppose a red triangle and a blue triangle both cover pixel (120, 80). The red fragment there has depth z_{\text{red}} = 0.80; the blue one has z_{\text{blue}} = 0.50 (smaller means nearer the camera). Rasterize red first: the buffer held +\infty, and 0.80 < \infty, so we store colour = red, depth = 0.80. Now blue arrives: 0.50 < 0.80 is true, so blue is nearer — overwrite with colour = blue, depth = 0.50. The final pixel is blue. Had they arrived in the opposite order, red would have failed its test (0.80 < 0.50 is false) and been discarded — same answer either way. The kept depth is simply \min(0.80, 0.50) = 0.50.

Two triangles that share an edge — as they do all over any mesh — both have a claim on the pixels sitting exactly on that boundary, where an edge function is precisely 0. If both triangles draw such a pixel, it gets rasterized twice (wasted work, and visible seams once you blend transparency). If a rounding quirk makes neither claim it, you get a one-pixel crack of background showing through.

The fix is the top-left fill rule: a boundary pixel (edge function exactly zero) is awarded to a triangle only if it lies on that triangle's top or left edge. This half-open convention makes the ownership of every shared edge unambiguous, so each pixel is drawn by exactly one triangle — no gaps, no doubles. It's a tiny rule with an outsized effect on picture quality.

Depth is stored with finite precision, and (because of the perspective divide) that precision is spread very unevenly — most of the bits are spent up close to the camera, leaving far-away surfaces with only a few crumbs of resolution. When two surfaces are almost coplanar and far off, their computed depths can round to the same value, so the z-buffer can't decide which is in front. The result is z-fighting: a shimmering, flickering speckle where the two surfaces swap winner from frame to frame. The classic cures are to pull the camera's near plane out (don't waste precision on nothing), use a smarter depth encoding, or simply not place two walls in the exact same spot.