The Cost Functional

Every optimal control problem turns on a single scalar score: the cost functional. It is a functional — a machine that eats an entire control history u(\cdot) (and the trajectory it produces) and returns one number to be minimised. Its general form has two pieces:

J[u] = \underbrace{\phi\big(x(T)\big)}_{\text{terminal cost}} + \int_0^T \underbrace{L\big(x(t), u(t), t\big)}_{\text{running cost}} \, dt.

The terminal cost \phi scores only where we end up: how close the rocket is to the landing pad at the final instant. The running cost L accumulates a penalty at every moment along the way: fuel burned, error sustained, energy spent. Their sum is the single quantity the controller is designed to make small.
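As a rough numerical illustration of the two pieces, the sketch below approximates J on a time grid with the trapezoidal rule. The function name `cost_functional`, the sample trajectory x(t) = 1 - t, the control u(t) = -1, and the particular choices of L and \phi are all assumptions made for this example, not part of the text above.

```python
def cost_functional(ts, xs, us, running_cost, terminal_cost):
    """Approximate J[u] = phi(x(T)) + integral_0^T L(x, u, t) dt (trapezoidal rule)."""
    # Running cost sampled at every grid point.
    L_vals = [running_cost(x, u, t) for x, u, t in zip(xs, us, ts)]
    # Trapezoidal rule: average of neighbouring samples times the step width.
    integral = sum(
        0.5 * (L_vals[k] + L_vals[k + 1]) * (ts[k + 1] - ts[k])
        for k in range(len(ts) - 1)
    )
    # Terminal cost depends only on the final state.
    return terminal_cost(xs[-1]) + integral


# Hypothetical scalar example: x(t) = 1 - t under constant control u(t) = -1 on [0, 1].
N = 1000
ts = [k / N for k in range(N + 1)]
xs = [1.0 - t for t in ts]
us = [-1.0] * (N + 1)

J = cost_functional(
    ts, xs, us,
    running_cost=lambda x, u, t: x**2 + u**2,  # assumed L = x^2 + u^2
    terminal_cost=lambda xT: 10.0 * xT**2,     # assumed phi = 10 x(T)^2
)
print(J)  # close to the exact value 1/3 + 1 = 4/3
```

Here the trajectory ends exactly at zero, so the terminal cost contributes nothing and the whole score comes from the running integral.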

Three classical forms

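Which of the two pieces you keep gives the problem its traditional name. Keeping only the terminal cost \phi gives a Mayer problem; keeping only the running cost L gives a Lagrange problem; keeping both gives a Bolza problem. All three are equivalent, since a change of variables converts any one into another, but each is natural for different problems.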

That they are interconvertible is a small but useful fact: adjoin a new state x_{n+1} with \dot{x}_{n+1} = L and x_{n+1}(0) = 0, and the running integral becomes the terminal value x_{n+1}(T) — turning a Lagrange problem into a Mayer one.
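The augmentation trick can be checked numerically. The sketch below assumes a hypothetical scalar plant \dot{x} = -x + u with x(0) = 1, a constant control u = -1, and running cost L = x^2 + u^2; it integrates the augmented state with a hand-written Runge-Kutta step and compares the terminal value of the extra state with the integral worked out by hand for this example.

```python
import math


def augmented_rhs(t, z, u):
    """Right-hand side of the augmented system z = (x, x_aug)."""
    x, _ = z
    dx = -x + u(t)              # hypothetical plant: x' = -x + u
    dx_aug = x**2 + u(t)**2     # x_aug' = L(x, u, t), the running cost
    return (dx, dx_aug)


def rk4(rhs, z0, T, n_steps, u):
    """Classical fourth-order Runge-Kutta integration from t = 0 to t = T."""
    h = T / n_steps
    t, z = 0.0, z0
    for _ in range(n_steps):
        k1 = rhs(t, z, u)
        k2 = rhs(t + h / 2, tuple(zi + h / 2 * ki for zi, ki in zip(z, k1)), u)
        k3 = rhs(t + h / 2, tuple(zi + h / 2 * ki for zi, ki in zip(z, k2)), u)
        k4 = rhs(t + h, tuple(zi + h * ki for zi, ki in zip(z, k3)), u)
        z = tuple(
            zi + h / 6 * (a + 2 * b + 2 * c + d)
            for zi, a, b, c, d in zip(z, k1, k2, k3, k4)
        )
        t += h
    return z


u = lambda t: -1.0   # assumed constant control
T = 1.0

# Start the extra state at zero, as in the construction above.
x_T, mayer_cost = rk4(augmented_rhs, (1.0, 0.0), T, 1000, u)

# For this example x(t) = -1 + 2 e^{-t}, so the Lagrange integral has a closed form.
lagrange_cost = 2 * (1 - math.exp(-2 * T)) - 4 * (1 - math.exp(-T)) + 2 * T

print(mayer_cost, lagrange_cost)  # the two values should agree closely
```

The terminal value of the added state reproduces the running integral, so the same number can be read off as a pure terminal (Mayer) cost.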

The quadratic cost

One running cost dominates applications, both because, paired with linear dynamics, it is one of the very few that can be solved in closed form and because it captures the universal engineering trade-off: hit the target without thrashing the actuator. It is quadratic in the state and the control:

J = \int_0^T \Big( x^{\mathsf{T}} Q\, x + u^{\mathsf{T}} R\, u \Big)\, dt + x(T)^{\mathsf{T}} S\, x(T).

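Each term is a quadratic form. The x^{\mathsf{T}} Q x term penalises the state for straying from zero (tracking error); u^{\mathsf{T}} R u penalises control effort (fuel, energy); and x(T)^{\mathsf{T}} S x(T) penalises the final miss. The weight matrices are symmetric, and their definiteness is exactly what makes the problem well-posed. The standard requirements are Q \succeq 0 and S \succeq 0 (positive semidefinite, so state error is never rewarded) and R \succ 0 (strictly positive-definite, so every control direction costs something).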

Why R must be strictly positive-definite. Suppose instead some non-zero control direction u_\star had u_\star^{\mathsf{T}} R\, u_\star = 0. Then control along u_\star is free — the optimiser could pour unlimited effort in that direction at no charge, driving the state to zero with an infinite, impulsive control. The minimisation has no finite, well-defined solution. Requiring R \succ 0 — equivalently, all eigenvalues of R strictly positive — closes that loophole: every bit of control costs something, so the optimal control stays finite and unique. It is the positive-definite R that makes the bowl genuinely bowl-shaped in u, with a single bottom to roll to.
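A small check makes the loophole concrete. The sketch below tests positive-definiteness by attempting a Cholesky factorisation, which succeeds for a symmetric matrix exactly when it is positive-definite. The helper names and the two sample matrices `R_good` and `R_bad` are hypothetical, chosen only to illustrate the argument.

```python
import math


def quadratic_form(M, v):
    """Return v^T M v for a square matrix M given as nested lists."""
    n = len(v)
    return sum(v[i] * M[i][j] * v[j] for i in range(n) for j in range(n))


def is_positive_definite(M, tol=1e-12):
    """Test a symmetric matrix by attempting a Cholesky factorisation M = G G^T."""
    n = len(M)
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(G[i][k] * G[j][k] for k in range(j))
            if i == j:
                if s <= tol:      # non-positive pivot: not positive-definite
                    return False
                G[i][i] = math.sqrt(s)
            else:
                G[i][j] = s / G[j][j]
    return True


R_good = [[2.0, 0.5], [0.5, 1.0]]   # hypothetical weight, strictly positive-definite
R_bad = [[1.0, 1.0], [1.0, 1.0]]    # hypothetical weight, only positive semidefinite
u_star = [1.0, -1.0]                # non-zero direction that R_bad does not charge for

print(is_positive_definite(R_good), quadratic_form(R_good, u_star))
print(is_positive_definite(R_bad), quadratic_form(R_bad, u_star))
```

With `R_bad`, any multiple of `u_star` has zero effort cost no matter how large it is, which is the free direction described above; with `R_good` the same direction is charged a strictly positive amount.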

Trade state error against effort

Fix a single run: a state that decays as x(t) = 2e^{-t} while a steady control u(t) = -1 is applied over [0, 3]. The scalar cost is J = \int_0^3 \big(q\,x(t)^2 + r\,u(t)^2\big)\,dt, which splits into a state part, q \int_0^3 4e^{-2t}\,dt = 2q\,(1 - e^{-6}) \approx 1.995\,q, and a control part, r \int_0^3 1\,dt = 3r. Vary the weights q and r and the two contributions, and their total, change accordingly. Crank q up and the cost is dominated by state error, so the controller would work harder to crush x; crank r up and effort dominates, so it would rather let the state drift than spend control. That balance is the whole design knob of quadratic control.
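The trade-off for this run can be tabulated directly. The sketch below uses the integrals of x(t)^2 = 4e^{-2t} and u(t)^2 = 1 over [0, T], evaluated by hand; the function name `cost_split` and the three weight pairs are illustrative assumptions.

```python
import math


def cost_split(q, r, T=3.0):
    """State and control parts of J for x(t) = 2 e^{-t}, u(t) = -1 on [0, T]."""
    state_part = q * 2.0 * (1.0 - math.exp(-2.0 * T))  # q * integral of 4 e^{-2t}
    control_part = r * T                               # r * integral of 1
    return state_part, control_part


# Balanced weights, heavy state weight, heavy control weight.
for q, r in [(1.0, 1.0), (10.0, 1.0), (1.0, 10.0)]:
    s, c = cost_split(q, r)
    print(f"q={q:5.1f} r={r:5.1f}  state={s:7.3f}  control={c:7.3f}  J={s + c:7.3f}")
```

With q = r = 1 the control part (3) already outweighs the state part (about 1.995); raising q tenfold makes state error dominate, and raising r tenfold makes effort dominate.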

There is no “correct” cost functional handed down by physics. Choosing Q, R and S is an act of engineering judgement, and it is where the designer's intent enters the mathematics. A heavy R yields a gentle, fuel-sipping controller that responds slowly; a heavy Q yields an aggressive one that tracks tightly but burns effort and may saturate the actuator. Tuning these weights, often by trial, simulation and taste, is the daily craft of control engineering, and the reason two engineers handed the same plant can build very different controllers, each “optimal” for its own choice of weights.