The Cost Functional
Every optimal control problem turns on a single scalar score: the cost
functional. It is a functional: a machine that eats an entire control history u(\cdot) (and the
trajectory it produces) and returns one number to be minimised. Its general form has two
pieces:
J[u] = \underbrace{\phi\big(x(T)\big)}_{\text{terminal cost}} + \int_0^T \underbrace{L\big(x(t), u(t), t\big)}_{\text{running cost}} \, dt.
The terminal cost \phi scores only where we end
up — how close the rocket is to the landing pad at the final instant. The running
cost L accumulates a penalty at every moment along the
way: fuel burned, error sustained, energy spent. Their sum is the single number the
controller is trying to make small.
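To make the two pieces concrete, here is a minimal sketch that evaluates a Bolza cost numerically for a sampled trajectory using the trapezoidal rule. The helper name bolza_cost and the scalar example (x(t) = e^{-t}, u(t) = -e^{-t}, \phi = x(T)^2, L = x^2 + u^2 on [0, 1]) are illustrative assumptions, not taken from the text; for this example the exact cost works out to 1.

```python
import math

def bolza_cost(phi, running, xs, us, ts):
    """Approximate J = phi(x(T)) + integral of running(x, u, t) dt (trapezoidal rule)."""
    integral = 0.0
    for k in range(len(ts) - 1):
        dt = ts[k + 1] - ts[k]
        left = running(xs[k], us[k], ts[k])
        right = running(xs[k + 1], us[k + 1], ts[k + 1])
        integral += 0.5 * (left + right) * dt
    return phi(xs[-1]) + integral

# Hypothetical scalar example (an assumption for illustration only).
T, n = 1.0, 1000
ts = [T * k / n for k in range(n + 1)]
xs = [math.exp(-t) for t in ts]       # sampled state history
us = [-math.exp(-t) for t in ts]      # sampled control history

terminal = lambda x_T: x_T ** 2               # terminal cost phi
running = lambda x, u, t: x ** 2 + u ** 2     # running cost L

J = bolza_cost(terminal, running, xs, us, ts)
print(f"J is approximately {J:.6f}")
```

The function takes whole histories as input and returns a single number, which is exactly the sense in which J is a functional.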
Three classical forms
Which of the two pieces you keep gives the problem its traditional name. All three are
equivalent — a clever change of variables converts any one into another — but each is natural
for different problems.
- Mayer form (terminal cost only),
J = \phi(x(T)). Natural when all that matters is the final
condition: maximise a spacecraft's final altitude, hit a target.
- Lagrange form (running cost only),
J = \int_0^T L\,dt. Natural for accumulated quantities: total
fuel, total time (L = 1 gives
J = T, a minimum-time problem), total tracking error.
- Bolza form (both together),
J = \phi(x(T)) + \int_0^T L\,dt. The general case, and the one
we will usually write.
That they are interconvertible is a small but useful fact: adjoin a new state
x_{n+1} with \dot{x}_{n+1} = L and
x_{n+1}(0) = 0, and the running integral becomes the terminal value
x_{n+1}(T) — turning a Lagrange problem into a Mayer one.
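The augmentation trick can be checked numerically. The sketch below assumes a hypothetical scalar system \dot{x} = u with the open-loop control u(t) = -e^{-t}, x(0) = 1, and running cost L = x^2 + u^2 (none of these come from the text). It integrates the augmented state (x, x_{n+1}) with a hand-written fourth-order Runge-Kutta step and reads the accumulated cost off the extra state at time T.

```python
import math

def u(t):
    # Assumed open-loop control, chosen so that x(t) = exp(-t).
    return -math.exp(-t)

def running_cost(x, u_val, t):
    return x * x + u_val * u_val

def augmented_rhs(t, state):
    """Right-hand side for (x, z), where z plays the role of x_{n+1}."""
    x, z = state
    u_val = u(t)
    return [u_val, running_cost(x, u_val, t)]   # x' = u, z' = L

def rk4(rhs, state, t0, t1, steps):
    """Classical fixed-step fourth-order Runge-Kutta integrator."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = rhs(t, state)
        k2 = rhs(t + h / 2, [s + h / 2 * k for s, k in zip(state, k1)])
        k3 = rhs(t + h / 2, [s + h / 2 * k for s, k in zip(state, k2)])
        k4 = rhs(t + h, [s + h * k for s, k in zip(state, k3)])
        state = [s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
        t += h
    return state

# z(0) = 0, so z(T) equals the running integral over [0, T].
x_T, z_T = rk4(augmented_rhs, [1.0, 0.0], 0.0, 1.0, 200)
exact_integral = 1.0 - math.exp(-2.0)   # integral of 2 exp(-2t) over [0, 1]
print(f"z(T) = {z_T:.8f}, exact integral = {exact_integral:.8f}")
```

In this form the Lagrange cost is simply the terminal value z(T), so a solver written only for Mayer problems could handle it with \phi chosen to return the last component of the augmented state.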
The quadratic cost
One running cost dominates applications because, paired with linear dynamics, it is one of
the few that can be solved in closed form, and because it captures the universal engineering
trade-off: hit the target without thrashing the actuator. It is quadratic in the state and the
control:
J = \int_0^T \Big( x^{\mathsf{T}} Q\, x + u^{\mathsf{T}} R\, u \Big)\, dt + x(T)^{\mathsf{T}} S\, x(T).
Each term is a
quadratic form.
The x^{\mathsf{T}} Q x term penalises the state for straying from
zero (tracking error); u^{\mathsf{T}} R u penalises control effort
(fuel, energy); and x(T)^{\mathsf{T}} S x(T) penalises the final
miss. The weight matrices are symmetric, and their definiteness is exactly
what makes the problem well-posed:
- Q \succeq 0 and S \succeq 0
(positive-semi-definite):
a state penalty is never negative, so wandering from the target can only cost you, never pay
you.
- R \succ 0 (strictly
positive-definite): every non-zero control incurs a genuine, strictly
positive penalty.
Why R must be strictly positive-definite. Suppose
instead some non-zero control direction u_\star had
u_\star^{\mathsf{T}} R\, u_\star = 0. Then control along
u_\star is free — the optimiser could pour unlimited effort
in that direction at no charge, driving the state to zero with an infinite, impulsive
control. The minimisation has no finite, well-defined solution. Requiring
R \succ 0 — equivalently, all eigenvalues of
R strictly positive — closes that loophole: every bit of control
costs something, so the optimal control stays finite and unique. It is the positive-definite
R that makes the cost genuinely bowl-shaped in
u, with a single bottom to roll to.
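The loophole is easy to exhibit with a small example. The sketch below tests positive-definiteness by attempting a Cholesky factorisation, which for a symmetric matrix succeeds exactly when the matrix is positive-definite (up to the numerical tolerance used here). The helper names and the two 2-by-2 matrices are hypothetical, chosen only for illustration.

```python
import math

def is_positive_definite(M, tol=1e-12):
    """Try a Cholesky factorisation of symmetric M; fail if a pivot is not > tol."""
    n = len(M)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False          # zero or negative pivot: not definite
                chol[i][i] = math.sqrt(s)
            else:
                chol[i][j] = s / chol[j][j]
    return True

def quad_form(M, v):
    """Evaluate v^T M v."""
    n = len(v)
    return sum(v[i] * M[i][j] * v[j] for i in range(n) for j in range(n))

R_good = [[2.0, 0.0], [0.0, 1.0]]   # both eigenvalues strictly positive
R_bad = [[1.0, 0.0], [0.0, 0.0]]    # only semi-definite: second channel is free
u_star = [0.0, 1.0]                 # the unpenalised direction for R_bad

print(is_positive_definite(R_good), quad_form(R_good, u_star))
print(is_positive_definite(R_bad), quad_form(R_bad, u_star))
```

With R_bad, scaling u_star by any factor leaves the control penalty at zero, which is the free direction described above.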
- The general (Bolza) cost is
J[u] = \phi(x(T)) + \int_0^T L(x, u, t)\,dt: a terminal plus a
running cost.
- Mayer keeps only \phi(x(T));
Lagrange keeps only the integral; all three are interconvertible.
- The quadratic cost
\int_0^T (x^{\mathsf{T}}Qx + u^{\mathsf{T}}Ru)\,dt + x(T)^{\mathsf{T}}Sx(T)
needs Q, S \succeq 0 and
R \succ 0.
- R \succ 0 is essential: it makes every control penalised, so
the optimum is finite and unique.
Trade state error against effort
Fix a single run: a state that decays as x(t) = 2e^{-t} while a
steady control u(t) = -1 is applied over
[0, 3]. The scalar running cost is
J = \int_0^3 \big(q\,x(t)^2 + r\,u(t)^2\big)\,dt, which splits into a
state part and a control part. Vary the weights q and
r and the two contributions, and their total, shift with them. Crank
q up and the cost is dominated by state error — the controller
would work harder to crush x; crank r up
and effort dominates — it would rather let the state drift than spend control. That balance is
the whole design knob of quadratic control.
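For this particular run both parts have closed forms: the state part is q \int_0^3 4e^{-2t}\,dt = 2q(1 - e^{-6}), roughly 1.995 q, and the control part is r \int_0^3 1\,dt = 3r. A small sketch that tabulates the split for a few weight pairs follows; the function name cost_split and the chosen weight pairs are illustrative assumptions.

```python
import math

def cost_split(q, r, T=3.0):
    """State part, control part and total for x(t) = 2 exp(-t), u(t) = -1 on [0, T]."""
    state_part = q * 2.0 * (1.0 - math.exp(-2.0 * T))   # q * integral of 4 exp(-2t)
    control_part = r * T                                # r * integral of 1
    return state_part, control_part, state_part + control_part

# A few hypothetical weight choices to compare.
for q, r in [(1.0, 1.0), (10.0, 1.0), (1.0, 10.0)]:
    state_part, control_part, total = cost_split(q, r)
    print(f"q={q:g} r={r:g}  state={state_part:.3f}  "
          f"control={control_part:.3f}  total={total:.3f}")
```

Because each part is linear in its own weight, only the ratio of q to r decides which term dominates; scaling both by the same factor rescales J without changing the balance.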
There is no “correct” cost functional handed down by physics — choosing
Q, R and S is
an act of engineering judgement, and it is where the designer's intent enters the
mathematics. A heavy R yields a gentle, fuel-sipping controller
that responds slowly; a heavy Q yields an aggressive one that
tracks tightly but burns effort and may saturate the actuator. Tuning these weights — often
by trial, simulation and taste — is the daily craft of control engineering, and the reason
two engineers handed the same plant can build very different controllers, each “optimal”
for its own choice of weights.