Directional Derivatives
A partial derivative
answers a narrow question: how fast does f(x, y) change if you
walk due east (f_x) or due north
(f_y)? But you can set off in any direction. The
directional derivative is the rate of change of f
as you step off a point along a chosen unit direction
\mathbf{u} = (u_1, u_2):
D_{\mathbf{u}} f(a, b) = \lim_{t \to 0} \frac{f(a + t u_1,\; b + t u_2) - f(a, b)}{t}.
The two partials are just the special cases \mathbf{u} = (1, 0)
and \mathbf{u} = (0, 1). The headline is that, when f is differentiable, you never need this
limit again: every directional derivative is a tidy combination of the two partials.
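To make the limit concrete, here is a minimal Python sketch that evaluates the difference quotient for shrinking steps t. The sample function f(x, y) = x^2 + xy, the point (1, 2), and the helper name difference_quotient are assumptions chosen for illustration only.

```python
def f(x, y):
    # Sample function, assumed for illustration.
    return x**2 + x * y

def difference_quotient(f, a, b, u, t):
    # The quotient inside the limit: (f(a + t*u1, b + t*u2) - f(a, b)) / t.
    u1, u2 = u
    return (f(a + t * u1, b + t * u2) - f(a, b)) / t

steps = (1e-1, 1e-3, 1e-5)

# u = (1, 0) should recover f_x; for this f the quotient is 4 + t algebraically.
east = [difference_quotient(f, 1.0, 2.0, (1.0, 0.0), t) for t in steps]

# u = (0, 1) should recover f_y; for this f the quotient is 1 for every t.
north = [difference_quotient(f, 1.0, 2.0, (0.0, 1.0), t) for t in steps]

for t, qe, qn in zip(steps, east, north):
    print(f"t={t:g}  east quotient={qe:.6f}  north quotient={qn:.6f}")
```

As t shrinks, the east quotients settle towards 4 and the north quotients towards 1, the two partials of this sample function at (1, 2).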
Deriving the formula from the chain rule
Walking along the direction \mathbf{u} traces a straight line.
Restricting f to that line turns a two-variable function into an
ordinary one-variable function of the step length t — and an
ordinary derivative is exactly what the multivariable chain rule
knows how to compute.
Step 1 — parametrise the line. Start at (a, b)
and move with velocity \mathbf{u}. The position after a step
t is
x(t) = a + t u_1, \qquad y(t) = b + t u_2.
Step 2 — define the one-variable restriction. Let
g be f seen only along that line:
g(t) = f\big(x(t),\, y(t)\big).
By the very definition above, the directional derivative is the ordinary derivative of
g at the start: D_{\mathbf{u}} f(a, b) = g'(0).
Step 3 — differentiate with the chain rule. Since
g is f composed with the path
(x(t), y(t)), the chain rule gives
g'(t) = f_x\big(x(t), y(t)\big)\, \frac{dx}{dt} + f_y\big(x(t), y(t)\big)\, \frac{dy}{dt}.
Step 4 — read off the velocities. Differentiating the parametrisation in
Step 1, the line moves at constant velocity:
\frac{dx}{dt} = u_1, \qquad \frac{dy}{dt} = u_2.
Step 5 — evaluate at the start t = 0, where
(x, y) = (a, b):
D_{\mathbf{u}} f(a, b) = g'(0) = f_x(a, b)\, u_1 + f_y(a, b)\, u_2.
Step 6 — recognise a dot product. That sum is exactly the dot product of
the vector of partials with the direction. Collecting the partials into the
gradient \nabla f = (f_x, f_y),
D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u} = f_x\, u_1 + f_y\, u_2.
No limit, no fuss: two partials and a dot product give the slope in any direction.
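The derivation can be sanity-checked numerically. The sketch below builds the restriction g(t) from Step 2, differentiates it at t = 0 with a central difference, and compares the result with f_x u_1 + f_y u_2. The sample function, the test point (0.5, -1.5), the 30-degree direction, and the helper names are all assumptions made for this illustration.

```python
import math

def f(x, y):
    # Sample function, assumed for illustration.
    return x**2 + x * y

def grad_f(x, y):
    # Hand-computed partials of the sample f: f_x = 2x + y, f_y = x.
    return (2 * x + y, x)

def g_prime_at_zero(a, b, u, h=1e-6):
    # Central-difference estimate of g'(0), where g(t) = f(a + t*u1, b + t*u2).
    u1, u2 = u
    def g(t):
        return f(a + t * u1, b + t * u2)
    return (g(h) - g(-h)) / (2 * h)

a, b = 0.5, -1.5
theta = math.radians(30)
u = (math.cos(theta), math.sin(theta))  # unit vector by construction

fx, fy = grad_f(a, b)
chain_rule_value = fx * u[0] + fy * u[1]   # Step 5 of the derivation
numeric_value = g_prime_at_zero(a, b, u)   # direct estimate of g'(0)

print(chain_rule_value, numeric_value)
```

The two printed numbers should agree to many decimal places; any small gap is floating-point rounding in the finite difference, not a flaw in the formula.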
Let f be differentiable at (a, b) and
let \mathbf{u} = (u_1, u_2) be a unit vector
(\|\mathbf{u}\| = 1). Then the directional derivative of
f at (a, b) in the direction
\mathbf{u} exists and equals the gradient dotted with the
direction:
D_{\mathbf{u}} f(a, b) = \nabla f(a, b) \cdot \mathbf{u} = f_x(a, b)\, u_1 + f_y(a, b)\, u_2.
In particular D_{(1,0)} f = f_x and
D_{(0,1)} f = f_y: the partials are directional derivatives along
the axes.
A worked example
Take f(x, y) = x^2 + xy at the point
(1, 2), heading in the direction that makes
45^\circ with the positive x-axis (due east), measured counterclockwise.
Step 1 — the partials.
f_x = 2x + y and f_y = x, so at
(1, 2) the gradient is
\nabla f(1, 2) = (2\cdot 1 + 2,\; 1) = (4, 1).
Step 2 — a unit direction. The 45^\circ heading
is (\cos 45^\circ, \sin 45^\circ) = \left(\tfrac{1}{\sqrt 2}, \tfrac{1}{\sqrt 2}\right),
which already has length 1.
Step 3 — dot them together.
D_{\mathbf{u}} f(1, 2) = (4, 1) \cdot \left(\tfrac{1}{\sqrt 2}, \tfrac{1}{\sqrt 2}\right) = \frac{4}{\sqrt 2} + \frac{1}{\sqrt 2} = \frac{5}{\sqrt 2} \approx 3.54.
So f climbs at about 3.54 units of
height per unit of horizontal travel in that direction.
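The same three steps can be reproduced in a few lines of Python. This is only a sketch of the arithmetic above; the function names gradient and directional_derivative are hypothetical helpers, not part of any library.

```python
import math

def gradient(x, y):
    # Partials of f(x, y) = x^2 + xy: f_x = 2x + y, f_y = x.
    return (2 * x + y, x)

def directional_derivative(grad, u):
    # Dot product of the gradient with a unit direction u.
    return grad[0] * u[0] + grad[1] * u[1]

grad = gradient(1, 2)                    # Step 1: (4, 1)
angle = math.radians(45)
u = (math.cos(angle), math.sin(angle))   # Step 2: unit direction at 45 degrees
slope = directional_derivative(grad, u)  # Step 3: 5 / sqrt(2)

print(grad)             # (4, 1)
print(round(slope, 2))  # 3.54
```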
The formula D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u} is
only a genuine rate when \mathbf{u} has length
1. Look back at Step 1 of the derivation: the parametrisation
(a + t u_1,\, b + t u_2) covers a distance
\|\mathbf{u}\|\, t in space as the parameter advances by
t. If \|\mathbf{u}\| \neq 1, then a
step of t = 1 in the parameter is not a step of one unit of
distance, and you are measuring "rise per parameter" rather than "rise per metre
walked".
Concretely, scaling the direction scales the answer:
D_{c\mathbf{u}} f = \nabla f \cdot (c\mathbf{u}) = c\, (\nabla f \cdot \mathbf{u}).
Doubling the vector doubles the "derivative" without any change to the surface — clearly
not a property of the terrain itself. Insisting on
\|\mathbf{u}\| = 1 pins down the one honest answer: the slope
you actually feel underfoot. If someone hands you a nonzero direction
\mathbf{v} that is not of unit length, normalise it first:
\mathbf{u} = \mathbf{v} / \|\mathbf{v}\|.
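A short sketch shows what goes wrong without normalising. It assumes the gradient (4, 1) of f(x, y) = x^2 + xy at (1, 2) and an illustrative non-unit direction v = (3, 4); the helper normalise is hypothetical.

```python
import math

def normalise(v):
    # Scale v to unit length; the zero vector has no direction to keep.
    length = math.hypot(v[0], v[1])
    if length == 0:
        raise ValueError("the zero vector has no direction")
    return (v[0] / length, v[1] / length)

def dot(p, q):
    return p[0] * q[0] + p[1] * q[1]

grad = (4.0, 1.0)   # gradient of x^2 + xy at (1, 2)
v = (3.0, 4.0)      # non-unit direction, length 5

u = normalise(v)    # approximately (0.6, 0.8)
raw = dot(grad, v)  # 16.0: "rise per parameter", inflated by the length of v
rate = dot(grad, u) # about 3.2: rise per unit of distance

print(raw, rate)
```

The raw value is five times the true rate because v is five units long, which is exactly the scaling D_{c\mathbf{u}} f = c (\nabla f \cdot \mathbf{u}) with c = 5.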
See the slope turn with the direction
The faint curves are level curves of
f(x, y) = x^2 + xy, joining points of equal height. At any point
the gradient is \nabla f = (2x + y,\, x). Swing the unit arrow
\mathbf{u} around the marked point and read
D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u}: it peaks when the
arrow lines up with the gradient (the steepest uphill direction), vanishes when the arrow runs along the level
curve through the point, and goes negative when it points downhill.
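The turning of the slope can also be tabulated without a picture. The sketch below sweeps the unit arrow through whole-degree angles and records the slope at each one. The marked point is assumed to be (1, 2), where the gradient is (4, 1); slope_at is a hypothetical helper.

```python
import math

grad = (4.0, 1.0)  # gradient of x^2 + xy at the assumed point (1, 2)

def slope_at(theta):
    # Directional derivative along the unit vector (cos theta, sin theta).
    return grad[0] * math.cos(theta) + grad[1] * math.sin(theta)

# Angle of the gradient itself, in degrees (about 14.04).
grad_angle = math.degrees(math.atan2(grad[1], grad[0]))

# Whole-degree angle with the largest slope.
best_deg = max(range(360), key=lambda d: slope_at(math.radians(d)))

peak = slope_at(math.radians(grad_angle))            # equals the gradient's length
along_level = slope_at(math.radians(grad_angle + 90))  # perpendicular to the gradient
downhill = slope_at(math.radians(grad_angle + 180))    # opposite to the gradient

print(best_deg, round(peak, 3), round(along_level, 3), round(downhill, 3))
```

The best whole-degree angle sits next to the gradient's own angle, the peak slope equals the length of the gradient (the square root of 17 here), the slope a quarter-turn away is zero up to rounding, and the opposite direction gives the negative of the peak.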