Ray Tracing
Look at a glossy car in an animated film: the sky curves across its bonnet, the road glints back off
its doors, a soft shadow pools beneath it. None of that was painted by hand. It was
ray traced — computed by literally simulating how light travels. Where
projection
squashes a 3-D world onto the screen with a matrix, ray tracing turns the whole idea inside out: for
every single pixel, it shoots a straight line — a ray — out from the eye,
through that pixel, into the scene, and asks "what's the first thing this ray hits?"
Answer that, colour the pixel with whatever the ray found, and repeat for a few million pixels. Add
one more idea — when the ray hits something shiny, bounce it and keep going — and reflections,
shadows and glass fall out almost for free. That single, honest simulation of light is why films and
offline renderers reach for ray tracing when they want photorealism.
A ray is a point plus a direction
A ray is the simplest object in the whole subject: start at an origin O
(the eye), pick a direction D (through the pixel), and slide forward. Every
point on the ray is
P(t) = O + t\,D, \qquad t \ge 0.
The parameter t is distance along the ray (in units of
|D|). At t=0 you're at the eye; as
t grows you march out into the scene. We only care about
t \ge 0 — things behind the eye don't count. Rendering a whole
image is then just: for each pixel, build its ray, and find the smallest positive
t at which the ray strikes a surface. The smallest
t wins because that's the nearest object — everything behind it is hidden.
Ray meets sphere: it all comes down to a quadratic
Spheres are the "hello world" of ray tracing, because the intersection maths is exact and clean. A
sphere is every point P at distance r from a
centre C: |P - C|^2 = r^2. Substitute the ray
P(t)=O+tD and expand. Writing m = O - C:
|O + tD - C|^2 = r^2 \;\Longrightarrow\; (D\cdot D)\,t^2 + 2(D\cdot m)\,t + (m\cdot m - r^2) = 0.
That's just a quadratic a t^2 + b t + c = 0 in the unknown distance
t, with
a = D\cdot D, \qquad b = 2\,(D\cdot m), \qquad c = m\cdot m - r^2.
And whether the ray hits at all is decided entirely by the discriminant
\Delta = b^2 - 4ac, exactly as in school algebra:
- \Delta < 0 — no real roots: the ray misses the
sphere entirely.
- \Delta = 0 — one repeated root: the ray just grazes
the sphere (tangent).
- \Delta > 0 — two roots
t = \dfrac{-b \pm \sqrt{\Delta}}{2a}: the ray pierces the sphere,
entering and leaving. The nearer hit takes the - sign
(the smaller t).
Worked example: hit or miss?
Put the eye at O=(0,0), fire straight along
D=(1,0), and place a unit sphere (r=1) at
C=(3,0). Then m = O - C = (-3, 0), and
a = D\cdot D = 1, \quad b = 2(D\cdot m) = 2(-3) = -6, \quad c = m\cdot m - r^2 = 9 - 1 = 8.
\Delta = b^2 - 4ac = 36 - 32 = 4 > 0 \;\Rightarrow\; \text{HIT}.
Two roots: t = \frac{6 \pm 2}{2} = 4 or 2. The
nearer one is t=2, so the ray strikes the front of the sphere at
P(2) = (2,0) — dead on, exactly one radius in front of the centre. Now
nudge the aim upward to D=(1,2): recomputing gives
a=5, b=-6, c=8, so
\Delta = 36 - 160 = -124 < 0 — a miss, the ray sails past
above the sphere. The sign of one number, the discriminant, is the whole yes/no.
Watch a ray bounce
The eye sits on the left; the vertical line is the image plane, one point per pixel. Drag the pixel up
and down to aim the primary ray. When it strikes the sphere, you'll see the surface
normal (the outward direction at the hit) and the reflected ray springing
off it. Aim high or low enough and the ray misses completely — the discriminant has gone negative.
The bounce: reflecting about the normal
When the ray lands, the surface has an outward unit normal
N — for a sphere, simply the direction from the centre to the hit,
N = (P - C)/r. The law of reflection says the bounced ray
leaves at the same angle it arrived, measured from the normal: angle in equals angle out. In
vectors, the incoming direction D reflects to
R = D - 2\,(D\cdot N)\,N.
The term (D\cdot N)N is the part of D pointing
along the normal; subtracting it twice flips that component while leaving the sideways part
untouched — precisely a mirror bounce. Spawn a fresh ray from the hit point in direction
R, trace it, and you've captured a reflection. Do this
recursively — reflections of reflections — and mirrored hallways and chrome spheres render
themselves. This recursion is what puts the "tracing" in ray tracing.
Here's a trap everyone hits. You find an intersection at point P, then
shoot a new ray from P — a shadow ray toward a light, or a reflection ray.
But floating-point rounding means P is a hair inside the very
surface it came from. The new ray, starting at t=0, immediately reports an
intersection with that same surface at some microscopic t — so the surface
decides it is shadowing itself. The screen fills with a grimy black speckle called
shadow acne.
The standard cure is to start each secondary ray a tiny epsilon off the surface —
push the origin a smidgen along the normal, or ignore any hit with t < \varepsilon
for some small \varepsilon. One nudge, and the acne clears up.
Cost. The naive algorithm tests every ray against every object, so a frame is on the
order of (\text{pixels}) \times (\text{objects}) intersection tests — and
with reflection and shadow rays it multiplies again. A 4K frame is ~8 million primary rays before a
single bounce. That's why real-time engines long preferred
rasterization, which
touches each triangle once. The modern rescue is acceleration structures (like
bounding-volume hierarchies) that skip whole swathes of a scene at once, cutting the cost from linear
in the object count to roughly logarithmic — plus dedicated ray-tracing hardware now baked into GPUs.
Films, which render offline, have always been happy to pay the full price for the realism.