The Maximum Principle

With the Hamiltonian H = L + \lambda^\top f in hand, Pontryagin's maximum principle states the conditions an optimal trajectory must satisfy. It converts "minimise J over all admissible controls" — an infinite-dimensional search — into a set of differential equations and a pointwise minimisation of H. This page states those conditions and works the smallest possible example; the next page derives them.

Pontryagin's necessary conditions

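Suppose (x^\*, u^\*) minimises J = \varphi(x(T)) + \int_0^T L\,dt subject to \dot{x} = f(x, u, t), x(0) = x_0, with the terminal state free. Then there is a costate \lambda(t) for which the optimal pair satisfies all four of the following.

State equation: \dot{x} = \partial H/\partial\lambda = f(x, u, t), with x(0) = x_0.

Costate equation: \dot{\lambda} = -\partial H/\partial x, evaluated along the optimal pair.

Minimum condition: at every instant t, u^\*(t) minimises H(x^\*(t), u, \lambda(t), t) over the admissible controls u.

Transversality condition: \lambda(T) = \partial\varphi/\partial x, evaluated at x^\*(T).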

The name is a quirk of sign convention. Pontryagin worked with a Hamiltonian of the opposite sign, \mathcal{H} = -L + \psi^\top f with costate \psi = -\lambda, so that \mathcal{H} = -H. Minimising our H is therefore the same as maximising his \mathcal{H}, hence "maximum principle". The condition is the same either way: pick the control that makes the Hamiltonian most favourable at every instant.

A two-point boundary value problem

Look at where the data lives. The state condition is given at the start: x(0) = x_0. The costate condition is given at the end: \lambda(T) = \partial\varphi/\partial x. The two ODEs are coupled — \dot{x} needs u^\*, which the minimum condition reads off from \lambda, while \dot{\lambda} needs x:

\begin{aligned} \dot{x} &= \tfrac{\partial H}{\partial \lambda}, & x(0) &= x_0, \\ \dot{\lambda} &= -\tfrac{\partial H}{\partial x}, & \lambda(T) &= \tfrac{\partial \varphi}{\partial x}. \end{aligned}

Because half the conditions sit at t = 0 and half at t = T, this is a two-point boundary value problem, not a plain initial-value problem you can integrate straight through. The state flows forward from x_0; the costate flows backward from \lambda(T); and they must be consistent in between. Solving the two together is the computational heart of optimal control.
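One common way to handle the split boundary data is single shooting: guess the missing initial costate \lambda(0), integrate both ODEs forward, and adjust the guess until the terminal condition on \lambda(T) holds. The sketch below is only an illustration under assumptions that are not part of this page: a hypothetical scalar test problem with L = u^2 + x^2, f = u and no terminal cost (so \lambda(T) = 0), for which the minimum condition gives u = -\lambda/2. The function names rhs, integrate and shoot are made up for the example.

```python
import math

def rhs(x, lam):
    # Hypothetical test problem (an assumption, not from the text):
    # L = u^2 + x^2, f = u, phi = 0, so H = u^2 + x^2 + lam*u.
    # Minimum condition dH/du = 2u + lam = 0 gives u = -lam/2.
    u = -0.5 * lam
    return u, -2.0 * x  # (dx/dt = dH/dlam, dlam/dt = -dH/dx)

def integrate(x0, lam0, T, steps=2000):
    """Classical RK4 integration of (x, lam) forward from t = 0 to t = T."""
    h = T / steps
    x, lam = x0, lam0
    for _ in range(steps):
        k1x, k1l = rhs(x, lam)
        k2x, k2l = rhs(x + 0.5 * h * k1x, lam + 0.5 * h * k1l)
        k3x, k3l = rhs(x + 0.5 * h * k2x, lam + 0.5 * h * k2l)
        k4x, k4l = rhs(x + h * k3x, lam + h * k3l)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        lam += h * (k1l + 2 * k2l + 2 * k3l + k4l) / 6
    return x, lam

def shoot(x0, T, guess_a=0.0, guess_b=1.0, tol=1e-10, max_iter=50):
    """Secant iteration on lam(0) until the terminal residual lam(T) vanishes."""
    res_a = integrate(x0, guess_a, T)[1]
    res_b = integrate(x0, guess_b, T)[1]
    for _ in range(max_iter):
        if abs(res_b) < tol or res_b == res_a:
            break
        new_guess = guess_b - res_b * (guess_b - guess_a) / (res_b - res_a)
        guess_a, res_a = guess_b, res_b
        guess_b = new_guess
        res_b = integrate(x0, guess_b, T)[1]
    return guess_b

lam0 = shoot(x0=1.0, T=1.0)
print("lam(0) found by shooting:", lam0)
print("closed form 2*x0*tanh(T):", 2.0 * math.tanh(1.0))
```

For this particular linear test problem the closed form \lambda(0) = 2 x_0 \tanh T is available, which gives a check on the iteration. Single shooting is chosen here for brevity; it can be ill-conditioned on long horizons or stiff problems, where collocation-based boundary value solvers are often preferred.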

The smallest worked example

Minimise the control effort J = \int_0^T u^2\,dt for the scalar system \dot{x} = u driven from x(0) = 0 to a fixed target x(T) = x_f. Here L = u^2, f = u, and there is no terminal cost.

Step 1 — write the Hamiltonian.

H = L + \lambda f = u^2 + \lambda u.

Step 2 — minimum condition. H is a smooth function of u, so set \partial H/\partial u = 0:

\frac{\partial H}{\partial u} = 2u + \lambda = 0 \quad\Longrightarrow\quad u = -\tfrac12 \lambda, \qquad \frac{\partial^2 H}{\partial u^2} = 2 > 0 \ \text{(so this stationary point is a minimum)}.

Step 3 — costate equation. H contains no bare x, so \partial H/\partial x = 0 and

\dot{\lambda} = -\frac{\partial H}{\partial x} = 0 \quad\Longrightarrow\quad \lambda = \text{const}.

Step 4 — therefore the control is constant. A constant \lambda makes u = -\tfrac12\lambda constant too. The state then grows at a constant rate:

\dot{x} = u = \text{const}, \quad x(0) = 0 \quad\Longrightarrow\quad x(t) = u\,t.

Step 5 — fit the boundary condition. Require x(T) = u\,T = x_f, so

u = \frac{x_f}{T}, \qquad x(t) = \frac{x_f}{T}\,t, \qquad \lambda = -2u = -\frac{2 x_f}{T}.

The minimum-effort path to the target is a straight line travelled at constant speed — exactly what intuition expects, and a reassuring first check of the machine. Here the endpoint is pinned, so \lambda(T) is set by the boundary condition x(T) = x_f rather than by transversality; the free-endpoint case would instead impose \lambda(T) = \partial\varphi/\partial x.
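The result can be sanity-checked numerically. The sketch below compares the constant control u = x_f/T with a one-parameter family of perturbed controls that still reach the target; the cosine perturbation, the values x_f = 2 and T = 4, and the helper names effort and perturbed are all assumptions chosen for illustration. It tests only this one family, so it is a plausibility check rather than a proof of optimality.

```python
import math

def effort(control, T, steps=20000):
    """Midpoint-rule estimates of J = integral of u(t)^2 and x(T) = integral of u(t)."""
    h = T / steps
    J = 0.0
    x = 0.0
    for k in range(steps):
        u = control((k + 0.5) * h)
        J += u * u * h
        x += u * h
    return J, x

x_f, T = 2.0, 4.0  # illustrative values, not from the text

def perturbed(a):
    # Constant control x_f/T plus a cosine wiggle of amplitude a.
    # The cosine integrates to zero over [0, T], so x(T) = x_f for every a.
    return lambda t: x_f / T + a * math.cos(2.0 * math.pi * t / T)

for a in (0.0, 0.25, 0.5):
    J, x_T = effort(perturbed(a), T)
    print(f"a={a:4.2f}  x(T)={x_T:.6f}  J={J:.6f}")
```

Within this family the cost works out to x_f^2/T + a^2 T/2, so any nonzero amplitude a costs extra effort while arriving at the same endpoint.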

State and costate, side by side

Consider the optimal x(t) and \lambda(t) for the example as the target x_f and the horizon T vary: the state rises in a straight line to its target, and the costate sits at the constant value -2 x_f / T. Reaching further (larger x_f) or moving faster (smaller T) steepens the line and pushes the costate further from zero: a bigger shadow price for a harder transfer.
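To make the dependence on x_f and T concrete, the closed-form result of the worked example can be tabulated for a few settings. This is a minimal sketch; the function name optimal_pair and the sample values are hypothetical.

```python
def optimal_pair(x_f, T):
    """Closed-form solution of the worked example: returns (u, lam).

    u is the constant control (the slope of x(t) = u*t) and
    lam = -2*u = -2*x_f/T is the constant costate.
    """
    if T <= 0:
        raise ValueError("horizon T must be positive")
    u = x_f / T
    lam = -2.0 * u
    return u, lam

# Sample (x_f, T) pairs chosen only for illustration.
for x_f, T in [(1.0, 1.0), (2.0, 1.0), (1.0, 0.5), (1.0, 2.0)]:
    u, lam = optimal_pair(x_f, T)
    print(f"x_f={x_f:4.1f}  T={T:4.1f}  slope u={u:5.2f}  costate lambda={lam:6.2f}")
```

Doubling x_f or halving T both double the slope and the magnitude of the costate, while a longer horizon shrinks them.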

These four conditions are not assertions to memorise — each one falls out of making the augmented cost \bar{J} = \varphi + \int(H - \lambda^\top \dot{x})\,dt stationary, exactly as the Euler–Lagrange equation falls out of making a functional stationary. The state equation is stationarity in \lambda, the costate equation is stationarity in x (after an integration by parts), the minimum condition is stationarity in u, and the transversality condition is the leftover boundary term. The next page carries out that derivation in full.
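As a compact preview of that statement, and only a sketch under the definitions above, write \delta x, \delta u, \delta\lambda for small variations (notation assumed here), with \delta x(0) = 0 because the initial state is fixed. After integrating the \lambda^\top \delta\dot{x} term by parts, the first variation of \bar{J} reads

\delta\bar{J} = \Big(\tfrac{\partial\varphi}{\partial x} - \lambda\Big)^{\!\top} \delta x \,\Big|_{t=T} + \int_0^T \Big[ \Big(\tfrac{\partial H}{\partial x} + \dot{\lambda}\Big)^{\!\top} \delta x + \Big(\tfrac{\partial H}{\partial u}\Big)^{\!\top} \delta u + \Big(\tfrac{\partial H}{\partial \lambda} - \dot{x}\Big)^{\!\top} \delta\lambda \Big]\,dt.

Setting each bracket to zero gives, in order, the transversality condition, the costate equation, the stationary form of the minimum condition, and the state equation.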