Directional Derivatives

A partial derivative answers a narrow question: how fast does f(x, y) change if you walk due east (f_x) or due north (f_y)? But you can set off in any direction. The directional derivative is the rate of change of f as you step off a point along a chosen unit direction \mathbf{u} = (u_1, u_2):

D_{\mathbf{u}} f(a, b) = \lim_{t \to 0} \frac{f(a + t u_1,\; b + t u_2) - f(a, b)}{t}.

The two partials are just the special cases \mathbf{u} = (1, 0) and \mathbf{u} = (0, 1). The headline is that, wherever f is differentiable, you rarely need this limit again: the directional derivative is a tidy combination of the two partials.
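To see the limit at work before replacing it with a formula, here is a minimal Python sketch that evaluates the difference quotient for shrinking t. The sample surface f(x, y) = x^2 + xy, the base point (1, 2), and the helper name difference_quotient are assumptions chosen for illustration; nothing about the definition depends on them.

```python
import math

def f(x, y):
    # Sample surface, chosen only for illustration.
    return x**2 + x * y

def difference_quotient(f, a, b, u, t):
    """Quotient (f(a + t*u1, b + t*u2) - f(a, b)) / t from the limit definition."""
    u1, u2 = u
    return (f(a + t * u1, b + t * u2) - f(a, b)) / t

# Unit direction at 45 degrees to the x-axis.
u = (math.cos(math.pi / 4), math.sin(math.pi / 4))

# The quotient settles towards a single number as t shrinks.
for t in (1e-1, 1e-2, 1e-3, 1e-4):
    print(t, difference_quotient(f, 1.0, 2.0, u, t))

# The axis directions give approximations to the two partials.
approx_fx = difference_quotient(f, 1.0, 2.0, (1.0, 0.0), 1e-6)
approx_fy = difference_quotient(f, 1.0, 2.0, (0.0, 1.0), 1e-6)
print(approx_fx, approx_fy)
```

For this particular f the quotient works out by hand to 5/\sqrt{2} + t, so the printed values approach 5/\sqrt{2} linearly in t.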

Deriving the formula from the chain rule

Walking along the direction \mathbf{u} traces a straight line. Restricting f to that line turns a two-variable function into an ordinary one-variable function of the step length t — and an ordinary derivative is exactly what the multivariable chain rule knows how to compute.

Step 1 — parametrise the line. Start at (a, b) and move with velocity \mathbf{u}. The position after a step t is

x(t) = a + t u_1, \qquad y(t) = b + t u_2.

Step 2 — define the one-variable restriction. Let g be f seen only along that line:

g(t) = f\big(x(t),\, y(t)\big).

By the very definition above, the directional derivative is the ordinary derivative of g at the start: D_{\mathbf{u}} f(a, b) = g'(0).

Step 3 — differentiate with the chain rule. Since g is f composed with the path (x(t), y(t)), the chain rule gives

g'(t) = f_x\big(x(t), y(t)\big)\, \frac{dx}{dt} + f_y\big(x(t), y(t)\big)\, \frac{dy}{dt}.

Step 4 — read off the velocities. Differentiating the parametrisation in Step 1, the line moves at constant velocity:

\frac{dx}{dt} = u_1, \qquad \frac{dy}{dt} = u_2.

Step 5 — evaluate at the start t = 0, where (x, y) = (a, b):

D_{\mathbf{u}} f(a, b) = g'(0) = f_x(a, b)\, u_1 + f_y(a, b)\, u_2.

Step 6 — recognise a dot product. That sum is exactly the dot product of the vector of partials with the direction. Collecting the partials into the gradient \nabla f = (f_x, f_y),

D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u} = f_x\, u_1 + f_y\, u_2.

No limit, no fuss: two partials and a dot product give the slope in any direction.
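The six steps can be checked numerically. The sketch below builds the restriction g(t) from Steps 1 and 2, estimates g'(0) with a central difference, and compares it with f_x u_1 + f_y u_2 from Step 5. The sample function, the point (1, 2), the direction (0.6, 0.8), and the helper names are illustrative assumptions.

```python
def f(x, y):
    # Sample surface, chosen only for illustration.
    return x**2 + x * y

def grad_f(x, y):
    # Hand-computed partials of the sample f: f_x = 2x + y, f_y = x.
    return (2 * x + y, x)

def restriction(f, a, b, u):
    """Return g(t) = f(a + t*u1, b + t*u2): f seen only along the line."""
    u1, u2 = u
    return lambda t: f(a + t * u1, b + t * u2)

def central_difference(g, t, h=1e-5):
    """Symmetric finite-difference estimate of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)

a, b = 1.0, 2.0
u = (0.6, 0.8)  # unit vector: 0.36 + 0.64 = 1

g = restriction(f, a, b, u)
numeric = central_difference(g, 0.0)   # g'(0), estimated numerically

fx, fy = grad_f(a, b)
formula = fx * u[0] + fy * u[1]        # gradient dotted with u

print(numeric, formula)
```

Both numbers should agree to many decimal places; the formula side is 4(0.6) + 1(0.8) = 3.2.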

Let f be differentiable at (a, b) and let \mathbf{u} = (u_1, u_2) be a unit vector (\|\mathbf{u}\| = 1). Then the directional derivative of f at (a, b) in the direction \mathbf{u} exists and equals the gradient dotted with the direction: D_{\mathbf{u}} f(a, b) = \nabla f(a, b) \cdot \mathbf{u} = f_x(a, b)\, u_1 + f_y(a, b)\, u_2. In particular D_{(1,0)} f = f_x and D_{(0,1)} f = f_y: the partials are directional derivatives along the axes.
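The differentiability hypothesis is doing real work. As a hedged illustration, the sketch below uses a standard counterexample that is not taken from the text above: f(x, y) = x^2 y / (x^2 + y^2) with f(0, 0) = 0. Both partials exist at the origin and equal 0, so the dot-product formula predicts 0 in every direction, yet the limit definition gives u_1^2 u_2, which is nonzero along the diagonal.

```python
import math

def f(x, y):
    # Standard counterexample (an assumption for illustration): the partials
    # exist at the origin, but f is not differentiable there.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x**2 * y / (x**2 + y**2)

def difference_quotient(f, a, b, u, t):
    """Quotient from the limit definition of the directional derivative."""
    return (f(a + t * u[0], b + t * u[1]) - f(a, b)) / t

u = (1 / math.sqrt(2), 1 / math.sqrt(2))  # unit diagonal direction
t = 1e-6

# f vanishes along both axes, so both partial quotients are exactly 0.
f_x = difference_quotient(f, 0.0, 0.0, (1.0, 0.0), t)
f_y = difference_quotient(f, 0.0, 0.0, (0.0, 1.0), t)

formula = f_x * u[0] + f_y * u[1]                    # what the dot product predicts
from_limit = difference_quotient(f, 0.0, 0.0, u, t)  # what the definition gives

print(formula, from_limit)
```

Here the two values disagree, which is consistent with the theorem: it promises the formula only where f is differentiable.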

A worked example

Take f(x, y) = x^2 + xy at the point (1, 2), heading in the direction that makes 45^\circ with the east axis.

Step 1 — the partials. f_x = 2x + y and f_y = x, so at (1, 2) the gradient is \nabla f(1, 2) = (2\cdot 1 + 2,\; 1) = (4, 1).

Step 2 — a unit direction. The 45^\circ heading is (\cos 45^\circ, \sin 45^\circ) = \left(\tfrac{1}{\sqrt 2}, \tfrac{1}{\sqrt 2}\right), which already has length 1.

Step 3 — dot them together.

D_{\mathbf{u}} f(1, 2) = (4, 1) \cdot \left(\tfrac{1}{\sqrt 2}, \tfrac{1}{\sqrt 2}\right) = \frac{4}{\sqrt 2} + \frac{1}{\sqrt 2} = \frac{5}{\sqrt 2} \approx 3.54.

So f climbs at about 3.54 units of height per unit of horizontal travel in that direction.
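The same three steps translate directly into a few lines of Python. This is a minimal sketch; the function name grad_f is an illustrative choice, and the partials are entered by hand rather than computed symbolically.

```python
import math

def grad_f(x, y):
    # Step 1: partials of f(x, y) = x^2 + xy, computed by hand.
    return (2 * x + y, x)

# Step 2: unit direction at 45 degrees to the east axis.
theta = math.radians(45)
u = (math.cos(theta), math.sin(theta))

# Step 3: dot the gradient at (1, 2) with the direction.
gx, gy = grad_f(1.0, 2.0)
slope = gx * u[0] + gy * u[1]

print(round(slope, 2))  # 3.54
```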

The formula D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u} is only a genuine rate when \mathbf{u} has length 1. Look back at Step 1 of the derivation: the parametrisation (a + t u_1,\, b + t u_2) covers a distance \|\mathbf{u}\|\, t in space as the parameter advances by t. If \|\mathbf{u}\| \neq 1, then a step of t = 1 in the parameter is not a step of one unit of distance, and you are measuring "rise per parameter" rather than "rise per metre walked".

Concretely, scaling the direction scales the answer: D_{c\mathbf{u}} f = \nabla f \cdot (c\mathbf{u}) = c\, (\nabla f \cdot \mathbf{u}). Doubling the vector doubles the "derivative" without any change to the surface — clearly not a property of the terrain itself. Insisting on \|\mathbf{u}\| = 1 pins down the one honest answer: the slope you actually feel underfoot. If someone hands you a non-unit direction \mathbf{v}, normalise it first: \mathbf{u} = \mathbf{v} / \|\mathbf{v}\|.
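A short sketch of the normalisation step, assuming the gradient (4, 1) from the worked example and a hypothetical non-unit direction \mathbf{v} = (3, 4) of length 5. The helper names normalise and dot are illustrative.

```python
import math

def normalise(v):
    """Scale v to length 1; the zero vector has no direction to keep."""
    norm = math.hypot(v[0], v[1])
    if norm == 0.0:
        raise ValueError("the zero vector has no direction")
    return (v[0] / norm, v[1] / norm)

def dot(p, q):
    return p[0] * q[0] + p[1] * q[1]

grad = (4.0, 1.0)   # gradient of x^2 + xy at (1, 2)
v = (3.0, 4.0)      # non-unit direction, length 5

raw = dot(grad, v)                # "rise per parameter": inflated by the length of v
slope = dot(grad, normalise(v))   # rise per unit distance walked

print(raw, slope)
```

The raw value is 5 times the true slope, matching the scaling rule with c = \|\mathbf{v}\| = 5.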

See the slope turn with the direction

The faint curves are level curves of f(x, y) = x^2 + xy, along which the height is constant. At the marked point the gradient is \nabla f = (2x + y,\, x). Swing the unit arrow \mathbf{u} with the slider and read D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u}: it peaks when the arrow lines up with the steepest uphill direction (the direction of \nabla f), vanishes when the arrow runs along a level curve, and goes negative when the arrow points downhill.
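Without the slider, the same behaviour can be reproduced by sweeping the angle of \mathbf{u} in code. The sketch below assumes the marked point is (1, 2), where the gradient is (4, 1); the one-degree sweep and the helper name slope_at are illustrative choices.

```python
import math

grad = (4.0, 1.0)  # gradient of x^2 + xy at the assumed point (1, 2)

def slope_at(theta):
    """Directional derivative along the unit vector (cos theta, sin theta)."""
    return grad[0] * math.cos(theta) + grad[1] * math.sin(theta)

# Sweep the arrow through a full turn in one-degree steps.
angles = [math.radians(d) for d in range(360)]
best = max(angles, key=slope_at)

# Angle of the gradient itself, for comparison with the sweep.
steepest = math.atan2(grad[1], grad[0])

print(math.degrees(best), math.degrees(steepest))
print(slope_at(steepest))                # equals the length of the gradient
print(slope_at(steepest + math.pi / 2))  # along the level curve: essentially 0
print(slope_at(steepest + math.pi))      # opposite the gradient: negative
```

The largest slope in the sweep occurs at the sampled angle nearest the gradient's direction, and its value there is \|\nabla f\| = \sqrt{17}.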