The Maximum Principle
With the Hamiltonian
H = L + \lambda^\top f in hand, Pontryagin's
maximum principle states the conditions an optimal trajectory must satisfy.
It converts "minimise J over all admissible controls" — an
infinite-dimensional search — into a set of differential equations and a pointwise
minimisation of H. This page states those conditions and works the
smallest possible example; the next page derives them.
Pontryagin's necessary conditions
Suppose (x^*, u^*) minimises
J = \varphi(x(T)) + \int_0^T L\,dt subject to
\dot{x} = f(x, u, t), x(0) = x_0, with the
terminal state free. Then there is a costate \lambda(t) for which
the optimal pair satisfies all of the following.
- State equation: recovers the dynamics.
  \dot{x} = \frac{\partial H}{\partial \lambda} = f(x, u, t), \qquad x(0) = x_0.
- Costate equation: runs backward in time.
  \dot{\lambda} = -\frac{\partial H}{\partial x}.
- Minimum condition: at each instant the optimal control globally minimises the Hamiltonian over the admissible set \mathcal{U}:
  u^*(t) = \arg\min_{u \in \mathcal{U}} H\big(x^*, u, \lambda, t\big).
  When the minimiser is interior and H is smooth, this reduces to \partial H/\partial u = 0.
- Transversality: fixes the costate at the free end.
  \lambda(T) = \frac{\partial \varphi}{\partial x}\big(x(T)\big).
The name is a quirk of sign convention. Pontryagin's Hamiltonian flips the sign of both the
running cost and the costate: \mathcal{H} = -L + p^\top f with p = -\lambda, so
\mathcal{H} = -H. Minimising our H is therefore the same as maximising his
\mathcal{H}, hence "maximum principle". The condition is the same
either way: pick the control that makes the Hamiltonian most favourable at every instant.
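To make the difference between the global minimum and the stationarity shortcut concrete, here is a minimal Python sketch. It assumes an illustrative scalar Hamiltonian H = u^2 + \lambda u and a hypothetical admissible set |u| \le u_max; the names `hamiltonian`, `pointwise_min`, `brute_force_min` and `u_max` are invented for this example. Because this H is convex in u, clipping the stationary point to the interval gives the global minimiser, and the grid search is only a cross-check.

```python
def hamiltonian(u, lam):
    """Illustrative scalar Hamiltonian H = u^2 + lam*u (an assumption for this sketch)."""
    return u * u + lam * u


def pointwise_min(lam, u_max):
    """Minimise H over the hypothetical admissible set |u| <= u_max."""
    u_interior = -0.5 * lam  # stationary point from dH/du = 2u + lam = 0
    # H is convex in u, so the constrained minimiser is the clipped stationary point.
    return max(-u_max, min(u_max, u_interior))


def brute_force_min(lam, u_max, n=4001):
    """Cross-check: evaluate H on a uniform grid over [-u_max, u_max]."""
    grid = [-u_max + 2.0 * u_max * i / (n - 1) for i in range(n)]
    return min(grid, key=lambda u: hamiltonian(u, lam))


if __name__ == "__main__":
    for lam in (1.0, 6.0):
        print(lam, pointwise_min(lam, 2.0), brute_force_min(lam, 2.0))
```

With lam = 1 the minimiser is interior and coincides with the stationary point. With lam = 6 the stationary point lies outside the admissible set, so the control saturates at the boundary, where \partial H/\partial u is not zero.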
A two-point boundary value problem
Look at where the data lives. The state condition is given at the
start: x(0) = x_0. The costate condition is given
at the end: \lambda(T) = \partial\varphi/\partial x.
The two ODEs are coupled — \dot{x} needs
u^*, which the minimum condition reads off from
\lambda, while \dot{\lambda} needs
x:
\begin{aligned} \dot{x} &= \tfrac{\partial H}{\partial \lambda}, & x(0) &= x_0, \\ \dot{\lambda} &= -\tfrac{\partial H}{\partial x}, & \lambda(T) &= \tfrac{\partial \varphi}{\partial x}. \end{aligned}
Because half the conditions sit at t = 0 and half at
t = T, this is a two-point boundary value problem,
not a plain initial-value problem you can integrate straight through. The state flows forward
from x_0; the costate flows backward from
\lambda(T); and they must be consistent in between. Solving the two
together is the computational heart of optimal control.
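One common way to attack such a problem is single shooting: guess the missing initial costate, integrate both ODEs forward, and adjust the guess until the terminal condition holds. The sketch below tries this on a hypothetical test problem that does not appear in the text: \dot{x} = -x + u, L = u^2, \varphi = x(T)^2, for which H = u^2 + \lambda(-x + u), u = -\lambda/2, \dot{\lambda} = \lambda and \lambda(T) = 2x(T). The function names, step count and secant iteration are illustrative choices, not a recommended solver.

```python
import math


def integrate(lam0, x0=1.0, T=1.0, steps=200):
    """RK4-integrate state and costate forward from a guessed lam(0)."""
    def rhs(x, lam):
        u = -0.5 * lam       # minimum condition: dH/du = 2u + lam = 0
        return -x + u, lam   # x' = dH/dlam,  lam' = -dH/dx = lam

    h = T / steps
    x, lam = x0, lam0
    for _ in range(steps):
        k1 = rhs(x, lam)
        k2 = rhs(x + 0.5 * h * k1[0], lam + 0.5 * h * k1[1])
        k3 = rhs(x + 0.5 * h * k2[0], lam + 0.5 * h * k2[1])
        k4 = rhs(x + h * k3[0], lam + h * k3[1])
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        lam += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x, lam


def residual(lam0, x0=1.0, T=1.0):
    """Mismatch in the transversality condition lam(T) = dphi/dx = 2 x(T)."""
    xT, lamT = integrate(lam0, x0=x0, T=T)
    return lamT - 2.0 * xT


def shoot(x0=1.0, T=1.0, tol=1e-12, max_iter=50):
    """Secant iteration on lam(0) until the terminal residual vanishes."""
    a, b = 0.0, 1.0
    ra, rb = residual(a, x0, T), residual(b, x0, T)
    for _ in range(max_iter):
        if abs(rb) < tol or rb == ra:
            break
        a, b, ra = b, b - rb * (b - a) / (rb - ra), rb
        rb = residual(b, x0, T)
    return b


def analytic_lam0(x0=1.0, T=1.0):
    """Closed-form lam(0) for this linear test problem."""
    return 2.0 * x0 * math.exp(-T) / (math.exp(T) + math.sinh(T))


if __name__ == "__main__":
    print(shoot(), analytic_lam0())
```

Because this test problem is linear, the residual is linear in \lambda(0) and the secant iteration converges in essentially one step. Nonlinear problems typically need more iterations and a sensible initial guess.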
The smallest worked example
Minimise the control effort
J = \int_0^T u^2\,dt for the scalar system
\dot{x} = u driven from x(0) = 0 to a
fixed target x(T) = x_f. Here
L = u^2, f = u, and there is no terminal
cost.
Step 1 — write the Hamiltonian.
H = L + \lambda f = u^2 + \lambda u.
Step 2 — minimum condition. H is a smooth
function of u, so set \partial H/\partial u = 0:
\frac{\partial H}{\partial u} = 2u + \lambda = 0 \quad\Longrightarrow\quad u = -\tfrac12 \lambda.
Step 3 — costate equation. H contains no bare
x, so \partial H/\partial x = 0 and
\dot{\lambda} = -\frac{\partial H}{\partial x} = 0 \quad\Longrightarrow\quad \lambda = \text{const}.
Step 4 — therefore the control is constant. A constant
\lambda makes u = -\tfrac12\lambda
constant too. Starting from x(0) = 0, the state then changes at a constant rate:
\dot{x} = u = \text{const} \quad\Longrightarrow\quad x(t) = u\,t.
Step 5 — fit the boundary condition. Require
x(T) = u\,T = x_f, so
u = \frac{x_f}{T}, \qquad x(t) = \frac{x_f}{T}\,t, \qquad \lambda = -2u = -\frac{2 x_f}{T}.
The minimum-effort path to the target is a straight line travelled at
constant speed, exactly what intuition expects and a reassuring first check of the machinery.
Note that the conditions above were stated for a free terminal state, whereas here the endpoint
is pinned. The boundary condition x(T) = x_f replaces transversality, and \lambda(T) is left
free to take whatever value meets it; the free-endpoint
case would instead impose \lambda(T) = \partial\varphi/\partial x.
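As a numerical sanity check on this result, the sketch below compares the effort of the constant control with that of a perturbed control that still reaches the target. The perturbation eps*cos(2*pi*t/T) is a hypothetical choice made because it integrates to zero over [0, T] and so leaves x(T) unchanged; the function name and the default values of x_f, T and n are likewise illustrative.

```python
import math


def effort_and_endpoint(eps, x_f=1.0, T=2.0, n=2000):
    """Midpoint-rule estimates of J = int u^2 dt and x(T) for the control
    u(t) = x_f/T + eps*cos(2*pi*t/T), starting from x(0) = 0."""
    h = T / n
    J = 0.0
    x = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        u = x_f / T + eps * math.cos(2.0 * math.pi * t / T)
        J += u * u * h   # accumulate the running cost
        x += u * h       # accumulate the state, since x' = u
    return J, x


if __name__ == "__main__":
    for eps in (0.0, 0.3):
        J, xT = effort_and_endpoint(eps)
        print(f"eps={eps}: J={J:.6f}, x(T)={xT:.6f}")
```

For this family of controls the effort works out to J = x_f^2/T + eps^2 T/2, so any nonzero perturbation costs more than the constant control while arriving at the same endpoint.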
State and costate, side by side
For the example, the optimal x(t) runs in a straight line from 0 to its
target x_f over the horizon T, while the costate \lambda(t) sits at the
constant value -2 x_f / T. A more distant target or a shorter horizon steepens the line and
pushes the costate further from zero: a bigger shadow price for a harder transfer.
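The closed-form solution can be tabulated directly for different targets and horizons. The helper below is a minimal sketch; `optimal_trajectory` and its sample count `n` are names introduced here for illustration.

```python
def optimal_trajectory(x_f, T, n=5):
    """Return (t, x(t), lam(t)) samples for the worked example:
    x(t) = (x_f/T) t and lam(t) = -2 x_f / T."""
    u = x_f / T       # constant optimal control
    lam = -2.0 * u    # constant costate, from u = -lam/2
    return [(T * i / n, u * (T * i / n), lam) for i in range(n + 1)]


if __name__ == "__main__":
    for x_f, T in ((1.0, 2.0), (2.0, 2.0), (1.0, 0.5)):
        print(f"x_f={x_f}, T={T}")
        for t, x, lam in optimal_trajectory(x_f, T):
            print(f"  t={t:.2f}  x={x:.3f}  lam={lam:.3f}")
```

Doubling x_f or halving T doubles the slope of x(t) and doubles the magnitude of the costate.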
These four conditions are not assertions to memorise — each one falls out of making the
augmented cost \bar{J} = \varphi + \int(H - \lambda^\top \dot{x})\,dt
stationary, exactly as the
Euler–Lagrange
equation falls out of making a functional stationary. The state equation is
stationarity in \lambda, the costate equation is stationarity in
x (after an integration by parts), the minimum condition in its interior form \partial H/\partial u = 0 is
stationarity in u, and the transversality condition is the leftover
boundary term. The next page carries out that derivation in full.
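As a preview of how the pieces line up, the first variation of \bar{J} can be sketched as follows. The sketch assumes smooth, unconstrained variations with \delta x(0) = 0 because the initial state is fixed; subscripts denote partial derivatives, a shorthand not used elsewhere on this page.

```latex
\delta\bar{J}
  = \big(\varphi_x - \lambda\big)^{\top}\Big|_{t=T}\,\delta x(T)
  + \int_0^T \Big[ (H_x + \dot{\lambda})^{\top}\,\delta x
                 + H_u^{\top}\,\delta u
                 + (H_\lambda - \dot{x})^{\top}\,\delta\lambda \Big]\,dt
```

Requiring each coefficient to vanish for arbitrary variations gives, in order, transversality, the costate equation, the interior form of the minimum condition, and the state equation.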