The bridge: \lambda(t) = \nabla_x V(x^\*(t), t)
The claim is that, evaluated along the optimal trajectory, the maximum principle's costate is
exactly the gradient of the HJB value function. Let us define it that way and watch the costate
equation fall out of HJB.
Step 1 — define the costate as the value gradient. Along the optimal path
x^\*(t), set
\lambda(t) := \nabla_x V\big(x^\*(t), t\big).
Step 2 — differentiate it in time. By the chain rule, with
\dot{x}^\* = f,
\dot{\lambda} = \nabla_x V_t + \big(\nabla_x^2 V\big)\,\dot{x}^\* = \nabla_x V_t + \big(\nabla_x^2 V\big) f,
where \nabla_x^2 V is the Hessian of the value function, assumed to exist and be continuous so that \nabla_x and \partial_t can be interchanged.
Step 3 — differentiate HJB in x. Take
-V_t = H(x, u^\*, \nabla_x V) and apply
\nabla_x. Because u^\* is the minimiser
of H, the term through \partial H/\partial u
vanishes (it is zero at the minimum — the envelope theorem), leaving the explicit
x-dependence and the dependence through
\lambda = \nabla_x V:
-\nabla_x V_t = \frac{\partial H}{\partial x} + \big(\nabla_x^2 V\big)\frac{\partial H}{\partial \lambda}.
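The envelope step can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the derivation. It borrows the scalar problem \dot{x} = u with running cost x^2 + u^2, so that H = x^2 + u^2 + \lambda u, and it swaps in a hypothetical trial function W(x) = x^2 + x^3/3 for V, since the identity is a chain-rule fact that holds for any smooth function. It compares a finite-difference total derivative of H(x, u^\*, W'(x)), with u^\* re-optimised at every x, against \partial H/\partial x + W''\,\partial H/\partial\lambda.

```python
# Illustrative check of the envelope step on a scalar problem.
# Assumed setup (not a general solver): dynamics x' = u, running cost x^2 + u^2.

def hamiltonian(x, u, lam):
    """H = running cost + lam * f, with f(x, u) = u."""
    return x**2 + u**2 + lam * u

def u_star(lam):
    """Unconstrained minimiser of H over u, from dH/du = 2u + lam = 0."""
    return -lam / 2.0

# Hypothetical smooth trial function standing in for V: W(x) = x^2 + x^3/3.
def W_x(x):
    return 2.0 * x + x**2

def W_xx(x):
    return 2.0 + 2.0 * x

def minimised_hamiltonian(x):
    """H evaluated at lam = W'(x) with the control re-optimised for that lam."""
    lam = W_x(x)
    return hamiltonian(x, u_star(lam), lam)

def total_derivative(x, h=1e-5):
    """Central difference of the minimised Hamiltonian with respect to x."""
    return (minimised_hamiltonian(x + h) - minimised_hamiltonian(x - h)) / (2.0 * h)

def envelope_rhs(x):
    """dH/dx + W'' * dH/dlam, with no dH/du term at all."""
    lam = W_x(x)
    dH_dx = 2.0 * x          # explicit x-dependence of H
    dH_dlam = u_star(lam)    # dH/dlam = f = u, evaluated at the minimiser
    return dH_dx + W_xx(x) * dH_dlam

for x in (-1.0, 0.3, 0.7):
    print(x, total_derivative(x), envelope_rhs(x))
```

The two columns agree to finite-difference accuracy even though the right-hand side ignores how u^\* moves with x. With W replaced by the actual value function V(x) = x^2 of this scalar problem, both sides are identically zero, which is the stationary HJB equation itself.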
Step 4 — use \partial H/\partial \lambda = f. The
state equation reads \dot{x} = \partial H/\partial\lambda = f, because H is the running cost plus \lambda^\top f, so
-\nabla_x V_t = \frac{\partial H}{\partial x} + \big(\nabla_x^2 V\big) f.
Step 5 — combine. Substitute
\nabla_x V_t = -\partial H/\partial x - (\nabla_x^2 V) f into Step 2;
the Hessian terms cancel:
\dot{\lambda} = \Big(-\frac{\partial H}{\partial x} - (\nabla_x^2 V) f\Big) + (\nabla_x^2 V) f = -\frac{\partial H}{\partial x}.
This is precisely the costate equation of the maximum principle. Pontryagin's
adjoint dynamics are HJB differentiated along the optimal path — the costate
\lambda(t) is the gradient
\nabla_x V riding along the trajectory, and its "shadow price"
meaning is now literal: it is how the optimal cost changes as the state is nudged.
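The five steps can be traced numerically on a case where every term is non-zero. In the scalar example used in this section V does not depend on t, so the \nabla_x V_t term of Step 2 vanishes. The sketch below therefore uses a finite-horizon variant that is an assumption of this illustration, not taken from the text: the same \dot{x} = u and running cost x^2 + u^2 on [0, T] with no terminal cost, for which V(x, t) = \tanh(T - t)\,x^2 satisfies HJB. The horizon T = 2 and all function names are hypothetical.

```python
import math

T = 2.0  # hypothetical horizon for the finite-horizon variant

def p(t):
    """V(x, t) = p(t) * x^2 with p(t) = tanh(T - t); solves -p' = 1 - p^2, p(T) = 0."""
    return math.tanh(T - t)

def p_dot(t):
    return -1.0 / math.cosh(T - t) ** 2

def x_opt(t, x0):
    """Solution of x' = u* = -p(t) * x, i.e. x0 * cosh(T - t) / cosh(T)."""
    return x0 * math.cosh(T - t) / math.cosh(T)

def costate(t, x0):
    """Step 1: lambda(t) = dV/dx at (x*(t), t) = 2 * p(t) * x*(t)."""
    return 2.0 * p(t) * x_opt(t, x0)

def rate_chain_rule(t, x0):
    """Step 2: d(lambda)/dt = grad_x V_t + Hessian * f along the optimal path."""
    x = x_opt(t, x0)
    grad_Vt = 2.0 * p_dot(t) * x   # d/dx of V_t = p'(t) * x^2
    hessian = 2.0 * p(t)           # d^2 V / dx^2
    f = -p(t) * x                  # f = u* = -(dV/dx) / 2
    return grad_Vt + hessian * f

def rate_pmp(t, x0):
    """Step 5: the costate equation, -dH/dx = -2 * x for H = x^2 + u^2 + lam * u."""
    return -2.0 * x_opt(t, x0)

def rate_finite_difference(t, x0, h=1e-5):
    """Direct central difference of lambda(t), independent of the derivation."""
    return (costate(t + h, x0) - costate(t - h, x0)) / (2.0 * h)

x0 = 3.0  # hypothetical initial state
for t in (0.0, 0.5, 1.0, 1.5):
    print(t, rate_chain_rule(t, x0), rate_pmp(t, x0), rate_finite_difference(t, x0))
```

The three rates agree at each sampled time: the chain-rule expression from Step 2, the costate equation from Step 5, and a direct finite difference of \lambda(t).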
The costate riding the value landscape
Take the worked HJB example again: dynamics \dot{x} = u and cost
\int(x^2 + u^2)\,dt, with value function
V(x) = x^2 and feedback u^\* = -x. The
optimal trajectory is x^\*(t) = x_0 e^{-t}, gliding to the origin.
As time t increases, the state slides down the value
landscape V(x) = x^2; the tangent drawn at it has slope
V'(x) = 2x, and that slope is the costate
\lambda(t) = 2x^\*(t) = 2x_0 e^{-t}. One Pontryagin trajectory, read straight off
the gradient of the dynamic-programming value function.
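The reading-off can be tabulated directly. The sketch below is a minimal illustration of this example only: it evaluates x^\*(t) = x_0 e^{-t}, takes the slope V'(x) = 2x there as the costate, recovers the control as the minimiser of H = x^2 + u^2 + \lambda u, and compares a finite-difference d\lambda/dt with -\partial H/\partial x = -2x. The initial state x_0 = 3 and the function names are hypothetical.

```python
import math

def x_opt(t, x0):
    """Optimal trajectory x*(t) = x0 * exp(-t) under the feedback u* = -x."""
    return x0 * math.exp(-t)

def value_slope(x):
    """V'(x) for V(x) = x^2."""
    return 2.0 * x

def costate(t, x0):
    """lambda(t) = V'(x*(t)): the tangent slope at the current state."""
    return value_slope(x_opt(t, x0))

def costate_rate(t, x0, h=1e-5):
    """Central finite difference of lambda(t)."""
    return (costate(t + h, x0) - costate(t - h, x0)) / (2.0 * h)

def minimising_control(lam):
    """Minimiser over u of H = x^2 + u^2 + lam * u, from 2u + lam = 0."""
    return -lam / 2.0

x0 = 3.0  # hypothetical initial state
for t in (0.0, 0.5, 1.0, 2.0):
    x = x_opt(t, x0)
    lam = costate(t, x0)
    print(f"t={t:.1f}  x*={x:.4f}  lambda={lam:.4f}  "
          f"u*={minimising_control(lam):.4f}  "
          f"dlambda/dt={costate_rate(t, x0):.4f}  -dH/dx={-2.0 * x:.4f}")
```

In each row the control column equals -x^\* and the last two columns match, so the slope of the landscape obeys the costate equation along the trajectory.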
Think of V(x, t) as a landscape over all states and times. HJB
surveys the whole landscape and reads the steepest-descent feedback at every point. The
maximum principle instead follows the single streamline that the optimal start launches —
and along that streamline the costate it carries is nothing other than the local gradient of
the very same landscape. The maximum principle is necessary, open-loop, and an ODE; HJB is sufficient, closed-loop, and a PDE:
two cross-sections of one optimum.
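The two views can be run side by side on the scalar example \dot{x} = u with running cost x^2 + u^2 and V(x) = x^2. The sketch below is an illustration, not a general solver: the closed-loop run applies the HJB feedback u = -x at every step, while the open-loop run integrates the state and costate ODEs of the maximum principle, \dot{x} = -\lambda/2 and \dot{\lambda} = -2x, launched from \lambda(0) = V'(x_0). The hand-written RK4 stepper, the step size, the end time, and x_0 = 3 are all assumptions made for the example.

```python
import math

def rk4_step(f, y, h):
    """One classical Runge-Kutta step for y' = f(y), with y a list of floats."""
    k1 = f(y)
    k2 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
    k3 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
    k4 = f([yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def closed_loop(y):
    """HJB view: state only, control read from the feedback u = -V'(x)/2 = -x."""
    (x,) = y
    return [-x]

def open_loop(y):
    """Maximum-principle view: state and costate, control u = -lam/2."""
    x, lam = y
    u = -lam / 2.0           # minimiser of H = x^2 + u^2 + lam * u
    return [u, -2.0 * x]     # x' = dH/dlam, lam' = -dH/dx

def simulate(f, y0, t_end, h=0.01):
    y = list(y0)
    for _ in range(round(t_end / h)):
        y = rk4_step(f, y, h)
    return y

x0 = 3.0     # hypothetical initial state
t_end = 5.0  # hypothetical end time
x_feedback = simulate(closed_loop, [x0], t_end)[0]
x_costate, lam_end = simulate(open_loop, [x0, 2.0 * x0], t_end)

print("closed-loop x(t_end):", x_feedback)
print("open-loop   x(t_end):", x_costate)
print("lambda(t_end) and V'(x(t_end)):", lam_end, 2.0 * x_costate)
```

Both runs trace the same trajectory, and the costate carried by the open-loop run stays equal to V'(x). The forward integration behaves well here only because the initial costate is known exactly from V; without the value function, the maximum principle leaves a two-point boundary-value problem for \lambda(0).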