Multivariable Optimization

In one variable, hills and valleys sit where the derivative vanishes. The same instinct works in two variables — but with a twist. At a peak or a pit of a surface, the ground is momentarily flat in every direction, so every directional derivative is zero. By the gradient identity D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u}, that forces the whole gradient to vanish:

\nabla f = (f_x,\, f_y) = (0,\, 0).

A point where \nabla f = \mathbf{0} is a critical point — the multivariable cousin of a critical point in single-variable calculus. The twist is the new third option: besides a local max or min, the surface can do something a curve never can — rise one way and fall the other, forming a saddle.
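As a quick numerical illustration of a critical point that is neither a max nor a min, the sketch below estimates a gradient by central differences. The surface g(x, y) = x^2 - y^2, the helper name `grad`, and the step size `h` are all assumptions chosen for illustration; none of them comes from the text above.

```python
def g(x, y):
    # Hypothetical example surface (not from the text): a textbook saddle.
    return x**2 - y**2

def grad(f, x, y, h=1e-5):
    # Central-difference estimate of (f_x, f_y); h is an assumed step size.
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy

gx, gy = grad(g, 0.0, 0.0)
# Both components vanish (numerically), so the origin is a critical point...
is_critical = abs(gx) < 1e-8 and abs(gy) < 1e-8

# ...yet g rises along x and falls along y, so it is neither a max nor a min.
rises_along_x = g(0.1, 0.0) > g(0.0, 0.0)
falls_along_y = g(0.0, 0.1) < g(0.0, 0.0)
print(is_critical, rises_along_x, falls_along_y)  # True True True
```

A finite-difference check like this only suggests that the gradient vanishes; solving \nabla f = \mathbf{0} by hand remains the reliable way to locate critical points.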

The second-derivative test

Finding \nabla f = \mathbf{0} locates the candidates; classifying them needs the second derivatives. They assemble into the Hessian, the matrix of second partials, and the test hinges on its determinant, the discriminant:

D = \det \begin{pmatrix} f_{xx} & f_{xy} \\ f_{xy} & f_{yy} \end{pmatrix} = f_{xx}\, f_{yy} - f_{xy}^{\,2}.

At a critical point, the sign of D says whether the surface curves the same way in every direction (a genuine extremum) or in opposite ways along two directions (a saddle); then the sign of f_{xx} tells max from min.

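Let (a, b) be a critical point of a twice-differentiable f with continuous second partials (so \nabla f(a, b) = \mathbf{0}), and write D = f_{xx} f_{yy} - f_{xy}^{\,2} evaluated at (a, b). Then:

If D > 0 and f_{xx}(a, b) > 0, then (a, b) is a local minimum.

If D > 0 and f_{xx}(a, b) < 0, then (a, b) is a local maximum.

If D < 0, then (a, b) is a saddle point.

If D = 0, the test is inconclusive: the point may be a max, a min, a saddle, or none of these.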
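The case analysis can be sketched as a small function. This is a minimal sketch assuming the standard statement of the test; the name `classify` and its argument order are hypothetical, and the inputs are taken to be the second partials already evaluated at a critical point.

```python
def classify(fxx, fyy, fxy):
    """Second-derivative test at a critical point (hypothetical helper).

    Arguments are the second partials evaluated at the critical point.
    """
    D = fxx * fyy - fxy**2
    if D > 0:
        # D > 0 forces fxx * fyy > 0, so fxx cannot be zero here.
        return "local minimum" if fxx > 0 else "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"  # D == 0: the test says nothing

print(classify(2, 2, 0))    # x^2 + y^2 at the origin    -> local minimum
print(classify(-2, -2, 0))  # -(x^2 + y^2) at the origin -> local maximum
print(classify(2, -2, 0))   # x^2 - y^2 at the origin    -> saddle point
print(classify(0, 0, 0))    # x^4 + y^4 at the origin    -> inconclusive
```

The comparison with zero is exact, which suits hand-computed integer or rational inputs; with floating-point second partials, a small tolerance around D = 0 would be a sensible adjustment.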

A worked classification

Classify every critical point of f(x, y) = x^3 - 3x + y^2.

Step 1 — the gradient. Differentiate in each variable:

f_x = 3x^2 - 3, \qquad f_y = 2y.

Step 2 — solve \nabla f = \mathbf{0}. Setting both to zero: 3x^2 - 3 = 0 \Rightarrow x = \pm 1, and 2y = 0 \Rightarrow y = 0. Two critical points:

(1, 0) \quad\text{and}\quad (-1, 0).

Step 3 — the second partials. Differentiate again:

f_{xx} = 6x, \qquad f_{yy} = 2, \qquad f_{xy} = 0.

Step 4 — the discriminant. D = f_{xx} f_{yy} - f_{xy}^{\,2} = (6x)(2) - 0^2 = 12x.

Step 5 — classify (1, 0). Here D = 12(1) = 12 > 0 and f_{xx} = 6(1) = 6 > 0 — so (1, 0) is a local minimum.

Step 6 — classify (-1, 0). Here D = 12(-1) = -12 < 0 — so (-1, 0) is a saddle point, regardless of f_{xx}. Along the x-axis the cubic x^3 - 3x peaks there, while along y the term y^2 climbs — up one way, down the other.
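The two verdicts can be sanity-checked without any second derivatives by sampling f on a small circle around each critical point and comparing with the value at the center. This is only a numerical probe, not a proof; the helper name `probe`, the radius `r`, and the sample count `n` are assumptions made for this sketch.

```python
import math

def f(x, y):
    # The function from the worked classification.
    return x**3 - 3*x + y**2

def probe(a, b, r=1e-2, n=360):
    # Sample f at n points on a circle of assumed radius r around (a, b)
    # and record how each sample compares with the value at the center.
    center = f(a, b)
    diffs = [
        f(a + r * math.cos(2 * math.pi * k / n),
          b + r * math.sin(2 * math.pi * k / n)) - center
        for k in range(n)
    ]
    if all(d > 0 for d in diffs):
        return "local minimum"   # every sampled neighbor is higher
    if all(d < 0 for d in diffs):
        return "local maximum"   # every sampled neighbor is lower
    return "saddle point"        # higher in some directions, lower in others

print(probe(1, 0))   # local minimum
print(probe(-1, 0))  # saddle point
```

Around (-1, 0) the samples along the x-direction come out lower than the center while those along the y-direction come out higher, which is the mixed behavior the discriminant predicted.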

A saddle is the genuinely two-dimensional phenomenon: a mountain pass, high relative to the valleys on either side yet low relative to the peaks fore and aft. A one-variable function can't manage it — there's only one axis to disagree along. It is exactly why \nabla f = \mathbf{0} is necessary but not sufficient for an extremum.

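The test above is local. To find the global maximum or minimum of f on a closed, bounded region, you must check two kinds of candidate and compare their heights:

Interior critical points, where \nabla f = \mathbf{0}.

Boundary points, where f restricted to the boundary of the region is largest or smallest.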

A continuous function on a closed, bounded region attains its global max and min (the extreme value theorem), so the global max and min are simply the largest and smallest heights among all the candidates, interior and boundary alike.
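The candidate comparison can be sketched for the worked example f(x, y) = x^3 - 3x + y^2. The region (the rectangle -2 \le x \le 2, -1 \le y \le 1) is an assumption chosen for illustration, and the boundary is handled here by dense sampling at an assumed resolution `N` rather than by the exact one-variable optimization one would carry out by hand.

```python
def f(x, y):
    return x**3 - 3*x + y**2

# Assumed region for illustration: -2 <= x <= 2, -1 <= y <= 1.
X0, X1, Y0, Y1 = -2.0, 2.0, -1.0, 1.0

# First kind of candidate: critical points inside the region
# (the two found in the worked classification).
interior = [(1.0, 0.0), (-1.0, 0.0)]

# Second kind of candidate: boundary points, sampled along all four edges.
N = 400  # assumed sampling resolution per edge
boundary = []
for i in range(N + 1):
    t = i / N
    x = X0 + t * (X1 - X0)
    y = Y0 + t * (Y1 - Y0)
    boundary += [(x, Y0), (x, Y1), (X0, y), (X1, y)]

# Compare heights across every candidate, interior and boundary alike.
candidates = interior + boundary
global_min = min(f(x, y) for x, y in candidates)
global_max = max(f(x, y) for x, y in candidates)
print(global_min, global_max)
```

On this rectangle the smallest height is -2, reached both at the interior local minimum (1, 0) and at the boundary point (-2, 0), while the largest height is 3, reached only on the boundary. The interior saddle at (-1, 0), with height 2, wins neither contest.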

Three kinds of flat point, one map

Consider the separable surface f(x, y) = (x^3 - 3x) + (y^3 - 3y). Setting f_x = 3x^2 - 3 and f_y = 3y^2 - 3 to zero gives four critical points, (\pm 1, \pm 1): a local min, a local max, and two saddles. The discriminant is D = f_{xx} f_{yy} - f_{xy}^2 = (6x)(6y) - 0^2 = 36xy, so D = 36 > 0 at (1, 1) and (-1, -1), which are the local min and local max respectively, and D = -36 < 0 at (1, -1) and (-1, 1), which are the saddles. On a contour map of the surface, notice that the contours close into loops around the min and max, but cross in an X at each saddle.
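The four verdicts can be reproduced with a short loop. This is a minimal sketch with the second partials f_{xx} = 6x, f_{yy} = 6y, f_{xy} = 0 derived by hand from the formula for f; the helper name `second_partials` and the dictionary `verdicts` are hypothetical.

```python
from itertools import product

def second_partials(x, y):
    # For f(x, y) = (x^3 - 3x) + (y^3 - 3y): f_xx = 6x, f_yy = 6y, f_xy = 0.
    return 6 * x, 6 * y, 0

# f_x = 3x^2 - 3 and f_y = 3y^2 - 3 vanish exactly when x = +-1 and y = +-1.
verdicts = {}
for x, y in product((1, -1), repeat=2):
    fxx, fyy, fxy = second_partials(x, y)
    D = fxx * fyy - fxy**2
    if D < 0:
        verdict = "saddle"
    elif D > 0:
        verdict = "local min" if fxx > 0 else "local max"
    else:
        verdict = "inconclusive"
    verdicts[(x, y)] = (D, verdict)
    print((x, y), D, verdict)
# (1, 1) 36 local min
# (1, -1) -36 saddle
# (-1, 1) -36 saddle
# (-1, -1) 36 local max
```

Because the surface is separable, the mixed partial is zero and the sign of D is just the sign of the product xy: the saddles are exactly the points where the x-part and the y-part curve in opposite ways.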