The Multivariable Chain Rule

The one-variable chain rule differentiates a composition by multiplying rates. In several variables a quantity can depend on time through several changing inputs at once, and each contributes its own rate. Suppose z = f(x, y) while x = x(t) and y = y(t) both ride along a parameter t. Then z is ultimately a function of t alone, and its rate of change is

\frac{dz}{dt} = \frac{\partial f}{\partial x}\,\frac{dx}{dt} + \frac{\partial f}{\partial y}\,\frac{dy}{dt}.

Two channels, each a "(slope of z in that direction) \times (speed of that input)", summed. The slogan is: when several roads lead from t to z, add the contributions of every road.
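The two-channel formula can be checked numerically. The sketch below is only an illustration: the function f(x, y) = x^2 y and the paths x = t^2, y = \sin t are hypothetical choices that do not appear in the text, and the helper names are made up for this example. It compares the chain-rule value with a central-difference estimate of dz/dt.

```python
import math

def chain_rule_rate(f_x, f_y, x, y, dx_dt, dy_dt, t):
    """dz/dt from the two-channel formula: f_x * dx/dt + f_y * dy/dt."""
    xt, yt = x(t), y(t)
    return f_x(xt, yt) * dx_dt(t) + f_y(xt, yt) * dy_dt(t)

def central_difference(g, t, h=1e-6):
    """Symmetric finite-difference estimate of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)

# Hypothetical example (not from the text): f(x, y) = x^2 * y
f = lambda x, y: x**2 * y
f_x = lambda x, y: 2 * x * y      # partial derivative in x
f_y = lambda x, y: x**2           # partial derivative in y

# Hypothetical paths: x(t) = t^2, y(t) = sin t, with their speeds
x = lambda t: t**2
y = math.sin
dx_dt = lambda t: 2 * t
dy_dt = math.cos

t0 = 0.7
analytic = chain_rule_rate(f_x, f_y, x, y, dx_dt, dy_dt, t0)
numeric = central_difference(lambda t: f(x(t), y(t)), t0)
print("chain rule:       ", analytic)
print("finite difference:", numeric)
```

The two printed values should agree to many digits; the small remaining gap comes from the finite step h, not from the formula.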

Deriving it from the total differential

The cleanest derivation reuses the total differential.

Step 1 — start from the total differential. A small change in z = f(x, y) from small changes in its inputs is, to first order,

dz = \frac{\partial f}{\partial x}\, dx + \frac{\partial f}{\partial y}\, dy.
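To see what "to first order" means in practice, the following sketch compares the actual change in z with the total-differential estimate for shrinking steps. The function f(x, y) = x^2 + 3xy, the base point (1, 2) and the step direction are assumptions chosen purely for illustration.

```python
# Hypothetical function (not from the text) and its partial derivatives
def f(x, y):
    return x**2 + 3 * x * y

def f_x(x, y):
    return 2 * x + 3 * y

def f_y(x, y):
    return 3 * x

x0, y0 = 1.0, 2.0
errors = []
for step in (1e-1, 1e-2, 1e-3):
    dx, dy = step, -2 * step                          # small changes in the inputs
    actual = f(x0 + dx, y0 + dy) - f(x0, y0)          # true change in z
    linear = f_x(x0, y0) * dx + f_y(x0, y0) * dy      # total differential dz
    errors.append(abs(actual - linear))
    print(f"step={step:g}  actual={actual:.8f}  dz={linear:.8f}  error={errors[-1]:.2e}")
```

Each tenfold reduction of the step shrinks the error by roughly a factor of one hundred, which is the sense in which the neglected terms are second order.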

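Step 2 — divide through by dt. Both x and y change only because t does, so divide the whole relation by the time increment dt. (More carefully: divide the increment relation \Delta z \approx f_x\,\Delta x + f_y\,\Delta y by \Delta t and let \Delta t \to 0, which is justified when f is differentiable at the point and x(t), y(t) are differentiable in t.)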

\frac{dz}{dt} = \frac{\partial f}{\partial x}\,\frac{dx}{dt} + \frac{\partial f}{\partial y}\,\frac{dy}{dt}.

That is the rule: a single line, falling straight out of the linear approximation.

A worked example, step by step

Let z = f(x, y) = x^2 + y^2 with the inputs running around the unit circle, x = \cos t and y = \sin t. Find dz/dt.

Step 1 — the partials of f:

\frac{\partial f}{\partial x} = 2x, \qquad \frac{\partial f}{\partial y} = 2y.

Step 2 — the input speeds:

\frac{dx}{dt} = -\sin t, \qquad \frac{dy}{dt} = \cos t.

Step 3 — assemble the chain rule:

\frac{dz}{dt} = 2x\,(-\sin t) + 2y\,(\cos t).

Step 4 — substitute the paths x = \cos t, y = \sin t:

\frac{dz}{dt} = 2\cos t\,(-\sin t) + 2\sin t\,(\cos t) = -2\sin t\cos t + 2\sin t\cos t = 0.

Step 5 — sanity check. On the unit circle x^2 + y^2 = 1 always, so z \equiv 1 is constant and of course dz/dt = 0. Geometrically the path runs along a level curve of f, so the height never changes — the two channels cancel exactly.
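The cancellation in the worked example can be confirmed by evaluating the chain rule at several points along the path. This is a minimal sketch using only the quantities from Steps 1 to 4. The ellipse path at the end is a hypothetical contrast case, not part of the text, included to show that the channels need not cancel when the path leaves the level curve.

```python
import math

# Partials of f(x, y) = x^2 + y^2 from Step 1
def f_x(x, y):
    return 2 * x

def f_y(x, y):
    return 2 * y

def dz_dt(path, velocity, t):
    """Chain rule: f_x * dx/dt + f_y * dy/dt, evaluated along the path."""
    x, y = path(t)
    vx, vy = velocity(t)
    return f_x(x, y) * vx + f_y(x, y) * vy

# The unit circle from the worked example and its input speeds (Step 2)
circle = lambda t: (math.cos(t), math.sin(t))
circle_velocity = lambda t: (-math.sin(t), math.cos(t))

# Hypothetical contrast path (an assumption, not from the text): an ellipse
ellipse = lambda t: (2 * math.cos(t), math.sin(t))
ellipse_velocity = lambda t: (-2 * math.sin(t), math.cos(t))

samples = [k * math.pi / 6 for k in range(12)]
circle_rates = [dz_dt(circle, circle_velocity, t) for t in samples]
ellipse_rates = [dz_dt(ellipse, ellipse_velocity, t) for t in samples]

print("largest |dz/dt| on the circle: ", max(abs(r) for r in circle_rates))
print("largest |dz/dt| on the ellipse:", max(abs(r) for r in ellipse_rates))
```

On the ellipse the same formula gives dz/dt = -6 \sin t \cos t, which is nonzero for most t because that path crosses level curves of f instead of following one.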

Implicit differentiation as a corollary

The chain rule quietly explains where the implicit-differentiation formula comes from. Suppose a curve is defined implicitly by F(x, y) = 0 — for instance the circle F = x^2 + y^2 - 1 = 0 — and we want dy/dx along it. Think of y as a function of x and differentiate the equation F(x, y(x)) = 0 with respect to x, using the chain rule on the left:

\frac{\partial F}{\partial x}\,\frac{dx}{dx} + \frac{\partial F}{\partial y}\,\frac{dy}{dx} = 0 \quad\Longrightarrow\quad F_x + F_y\,\frac{dy}{dx} = 0.

Solving for the slope gives the compact implicit function formula

\frac{dy}{dx} = -\frac{F_x}{F_y}.

For the circle F_x = 2x, F_y = 2y, so dy/dx = -x/y — the same answer the high-school "differentiate both sides" trick produces, now revealed as a one-line corollary of the multivariable chain rule (valid wherever F_y \ne 0).
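As a quick check of the formula on the circle, the sketch below compares -F_x/F_y with a finite-difference slope of the explicit upper branch y = \sqrt{1 - x^2}. The helper names and the sample point x = 0.6 are assumptions made for illustration.

```python
import math

def implicit_slope(F_x, F_y, x, y):
    """dy/dx = -F_x / F_y, valid only where F_y is nonzero."""
    fy = F_y(x, y)
    if fy == 0:
        raise ValueError("F_y = 0 here: the formula does not apply")
    return -F_x(x, y) / fy

# Partials of F(x, y) = x^2 + y^2 - 1
F_x = lambda x, y: 2 * x
F_y = lambda x, y: 2 * y

def upper_branch(x):
    """Explicit upper half of the circle, used only for comparison."""
    return math.sqrt(1 - x * x)

x0 = 0.6                      # hypothetical sample point
y0 = upper_branch(x0)         # approximately 0.8
slope_formula = implicit_slope(F_x, F_y, x0, y0)

h = 1e-6
slope_numeric = (upper_branch(x0 + h) - upper_branch(x0 - h)) / (2 * h)

print("formula -x/y:     ", slope_formula)
print("finite difference:", slope_numeric)
```

Both values should be close to -0.75. At the points (1, 0) and (-1, 0), where F_y = 0 and the tangent is vertical, the helper raises an error instead of returning a slope.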