Deriving the Maximum Principle

The conditions on the previous page were stated, not proved. Here we derive them with the calculus of variations — the same machine that produced the Euler–Lagrange equation: perturb the optimum, demand the first variation vanish, and read off what that forces. Every condition — the costate equation, the stationarity in u, and the transversality condition — falls out of one integration by parts.

The derivation, step by step

Step 1 — form the augmented cost. Adjoin the dynamics with the costate \lambda(t) and bundle L + \lambda^\top f into the Hamiltonian H, leaving the lone -\lambda^\top \dot{x} term outside:

\bar{J} = \varphi\big(x(T)\big) + \int_0^T \Big[\, H(x, u, \lambda, t) - \lambda^\top \dot{x} \,\Big]\,dt.

On any trajectory obeying the dynamics, \bar{J} = J, so minimising one minimises the other.

Step 2 — perturb the optimum. Let (x, u) be the optimal pair and nudge both by a small multiple of admissible variations:

u \to u + \varepsilon\,\delta u, \qquad x \to x + \varepsilon\,\delta x, \qquad \delta x(0) = 0.

The initial state is fixed, so its variation vanishes, \delta x(0) = 0; the terminal state is free, so \delta x(T) is unrestricted. The costate \lambda is a multiplier we are free to choose, so we do not vary it here.

Step 3 — take the first variation. Differentiate \bar{J} with respect to \varepsilon at \varepsilon = 0 (the chain rule on H and on \varphi). At an optimum this must vanish:

\delta\bar{J} = \frac{\partial \varphi}{\partial x}^{\!\top}\!\delta x(T) + \int_0^T \left[\, \frac{\partial H}{\partial x}^{\!\top}\!\delta x + \frac{\partial H}{\partial u}^{\!\top}\!\delta u - \lambda^\top \delta\dot{x} \,\right]dt = 0.

Step 4 — integrate the \lambda^\top \delta\dot{x} term by parts. The variation \delta\dot{x} carries the derivative; integration by parts moves it off \delta x and onto the costate \lambda, producing a boundary term plus an integral in plain \delta x:

\int_0^T \lambda^\top \delta\dot{x}\,dt = \Big[\, \lambda^\top \delta x \,\Big]_0^T - \int_0^T \dot{\lambda}^\top \delta x\,dt.

With \delta x(0) = 0 the lower limit drops, so [\lambda^\top \delta x]_0^T = \lambda(T)^\top \delta x(T). Hence

-\int_0^T \lambda^\top \delta\dot{x}\,dt = -\lambda(T)^\top \delta x(T) + \int_0^T \dot{\lambda}^\top \delta x\,dt.
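
The identity in Step 4 can be sanity-checked numerically. The sketch below is only an illustration: the choices \lambda(t) = e^{-t} and \delta x(t) = \sin(\pi t/2) are hypothetical, picked so that \delta x(0) = 0 while \delta x(T) is nonzero, and the helper trapezoid is a plain composite trapezoid rule written for this example.

```python
import math

def trapezoid(f, a, b, n=2000):
    """Composite trapezoid rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

T = 1.0

# Hypothetical costate and state variation (assumptions for illustration only).
lam = lambda t: math.exp(-t)
lam_dot = lambda t: -math.exp(-t)
dx = lambda t: math.sin(0.5 * math.pi * t)          # dx(0) = 0, dx(T) = 1
dx_dot = lambda t: 0.5 * math.pi * math.cos(0.5 * math.pi * t)

# Left side: integral of lambda * d(dx)/dt.
lhs = trapezoid(lambda t: lam(t) * dx_dot(t), 0.0, T)

# Right side: boundary term minus integral of lambda_dot * dx.
boundary = lam(T) * dx(T) - lam(0.0) * dx(0.0)
rhs = boundary - trapezoid(lambda t: lam_dot(t) * dx(t), 0.0, T)

print(lhs, rhs)                 # the two sides agree up to quadrature error
print(boundary, lam(T) * dx(T)) # only the terminal contribution survives
```

Because \delta x(0) = 0, the boundary term reduces to \lambda(T)\,\delta x(T), which is the term that later becomes the transversality condition.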

Step 5 — collect like terms. Substitute back and gather the coefficients of \delta x (inside the integral), \delta u, and the terminal \delta x(T) (outside it):

\delta\bar{J} = \int_0^T \left[ \left(\frac{\partial H}{\partial x} + \dot{\lambda}\right)^{\!\top}\!\delta x + \frac{\partial H}{\partial u}^{\!\top}\!\delta u \right]dt + \left(\frac{\partial \varphi}{\partial x} - \lambda(T)\right)^{\!\top}\!\delta x(T) = 0.

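Step 6 — make each coefficient vanish. Strictly, \delta x is not independent of \delta u: it is driven by \delta u through the linearised dynamics. This is where the freedom to choose \lambda is spent. Choose \lambda so that the coefficients of \delta x and \delta x(T) vanish identically; what remains is \int_0^T (\partial H/\partial u)^\top \delta u\,dt = 0 for arbitrary \delta u, and the fundamental lemma of the calculus of variations then forces \partial H/\partial u = 0. The three brackets therefore give:

\dot{\lambda} = -\frac{\partial H}{\partial x} \quad \text{(costate equation)}, \qquad \frac{\partial H}{\partial u} = 0 \quad \text{(stationarity in } u\text{)}, \qquad \lambda(T) = \frac{\partial \varphi}{\partial x}\big(x(T)\big) \quad \text{(transversality)}.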

The fourth condition, the state equation \dot{x} = \partial H/\partial\lambda = f, comes from the same principle applied to a variation in \lambda — which simply re-imposes the dynamics. All four conditions of the maximum principle are now derived.
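
One way to see these conditions at work is to use them as a gradient recipe. For any control, not only the optimal one, if \lambda satisfies the costate equation and the transversality condition, the collected first variation reduces to \int_0^T (\partial H/\partial u)^\top \delta u\,dt. The sketch below checks a discrete version of that statement against finite differences. Everything specific in it is an assumption made for illustration: the scalar system \dot{x} = -x + u, the running cost L = x^2 + u^2, the terminal cost \varphi = x(T)^2, the forward-Euler discretisation, and the function names.

```python
import math

def simulate(u, x0, h):
    """Forward-Euler rollout of the assumed dynamics x' = -x + u."""
    xs = [x0]
    for uk in u:
        xs.append(xs[-1] + h * (-xs[-1] + uk))
    return xs

def cost(u, x0, h):
    """Discrete cost: phi(x_N) + h * sum of L(x_k, u_k), with L = x^2 + u^2."""
    xs = simulate(u, x0, h)
    running = sum(xk * xk + uk * uk for xk, uk in zip(xs[:-1], u))
    return xs[-1] ** 2 + h * running

def adjoint_gradient(u, x0, h):
    """Gradient of the discrete cost with respect to each u_k via the costate."""
    xs = simulate(u, x0, h)
    n = len(u)
    lam = 2.0 * xs[-1]                       # transversality: lambda(T) = dphi/dx
    grad = [0.0] * n
    for k in range(n - 1, -1, -1):
        grad[k] = h * (2.0 * u[k] + lam)     # h * dH/du, using lambda_{k+1}
        lam = lam + h * (2.0 * xs[k] - lam)  # backward step of lambda' = -dH/dx
    return grad

N, T, x0 = 200, 1.0, 1.0
h = T / N
u = [math.sin(3.0 * k * h) for k in range(N)]  # arbitrary, non-optimal control
grad = adjoint_gradient(u, x0, h)

def finite_difference(k, eps=1e-6):
    """Central difference of the cost with respect to u_k."""
    up = list(u); up[k] += eps
    um = list(u); um[k] -= eps
    return (cost(up, x0, h) - cost(um, x0, h)) / (2.0 * eps)

max_err = max(abs(grad[k] - finite_difference(k)) for k in (0, N // 2, N - 1))
print(max_err)  # small: the costate-based gradient matches finite differences
```

At an optimum this gradient is zero for every k, which is the discrete counterpart of \partial H/\partial u = 0.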

The exact echo of Euler–Lagrange

Every move here has a twin in the Euler–Lagrange derivation. There the perturbation was \eta(x) pinned at both ends; here it is \delta x(t) pinned only at the start. There a single integration by parts moved the derivative off \eta' and the pinned-endpoint boundary term vanished; here the same integration by parts moves the derivative off \delta\dot{x}, and because the terminal endpoint is free the boundary term survives — and that survivor is exactly the transversality condition. The maximum principle is the Euler–Lagrange equation generalised to a system with a control input and an open end.

See the first variation vanish

Take the minimum-effort example J = \int_0^1 u^2\,dt, \dot{x} = u, from x(0) = 0 to x(1) = 1, whose optimum is the straight line x^*(t) = t. Add a perturbation \varepsilon\,\delta x with \delta x = \sin(\pi t), which vanishes at both ends (\delta x(0) = \delta x(1) = 0) because both endpoints are pinned in this example. The control becomes u = \dot{x} = 1 + \varepsilon\pi\cos(\pi t). Squaring and integrating, the cross term 2\varepsilon\pi\int_0^1 \cos(\pi t)\,dt vanishes and \int_0^1 \cos^2(\pi t)\,dt = \tfrac{1}{2}, so the cost is J(\varepsilon) = 1 + \tfrac{\pi^2}{2}\varepsilon^2, which is minimal and flat at \varepsilon = 0. That flatness, in every direction \delta x, is \delta\bar{J} = 0.
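
The closed form for J(\varepsilon) can be confirmed numerically. This is a minimal sketch that integrates u^2 with a composite trapezoid rule; the function name J and the step count n are choices made for this example.

```python
import math

def J(eps, n=2000):
    """Cost of the perturbed path x(t) = t + eps*sin(pi t), i.e. integral of u^2 on [0, 1]."""
    h = 1.0 / n

    def integrand(t):
        u = 1.0 + eps * math.pi * math.cos(math.pi * t)  # u = dx/dt
        return u * u

    total = 0.5 * (integrand(0.0) + integrand(1.0))
    total += sum(integrand(i * h) for i in range(1, n))
    return total * h

def J_closed_form(eps):
    return 1.0 + 0.5 * math.pi ** 2 * eps ** 2

# Compare quadrature with the closed form at a few perturbation sizes.
for eps in (-0.5, 0.0, 0.3):
    print(eps, J(eps), J_closed_form(eps))

# Central-difference slope at eps = 0: the first variation.
slope_at_zero = (J(1e-4) - J(-1e-4)) / 2e-4
print(slope_at_zero)
```

The slope at \varepsilon = 0 comes out at zero up to rounding, while any nonzero \varepsilon raises the cost above 1.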

Whether the boundary term produces a condition on \lambda(T) depends on one thing only: whether \delta x(T) is free. Pin the terminal state (x(T) prescribed) and \delta x(T) = 0 kills the boundary term, leaving \lambda(T) to be fixed instead by the state constraint. That is exactly the minimum-effort example, where \lambda(T) = -2x_f/T was set by x(T) = x_f. Leave the terminal state free and the boundary term must vanish on its own, giving \lambda(T) = \partial\varphi/\partial x. Same algebra, two kinds of endpoint: the transversality condition simply records which one applies.
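
For the pinned case, the value of \lambda can be traced in a few lines. This is a sketch assuming the minimum-effort setup used above, with running cost u^2, dynamics \dot{x} = u and x(0) = 0:

```latex
\begin{aligned}
H &= u^2 + \lambda u, \\
\frac{\partial H}{\partial u} &= 2u + \lambda = 0 \;\Rightarrow\; u = -\tfrac{1}{2}\lambda, \\
\dot{\lambda} &= -\frac{\partial H}{\partial x} = 0 \;\Rightarrow\; \lambda \text{ is constant}, \\
x(T) &= uT = x_f \;\Rightarrow\; u = \frac{x_f}{T}, \qquad \lambda = -\frac{2x_f}{T}.
\end{aligned}
```

With T = 1 and x_f = 1 this gives u = 1 and \lambda = -2, consistent with the straight-line optimum x^*(t) = t.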