The LQ Problem
We now meet the one optimal-control problem that yields completely to analysis: the
Linear-Quadratic (LQ) problem. Its name is its recipe: linear
dynamics and a quadratic cost. Everything earlier (the cost functional, the maximum principle, the
Pontryagin–dynamic-programming comparison) was the general theory. Here that theory crystallises
into formulas you can compute.
Linear dynamics, quadratic cost
The state evolves under a linear law, driven by the control through a constant
matrix:
\dot{x} = A\,x + B\,u, \qquad x(0) = x_0.
Here x \in \mathbb{R}^n is the state, u \in \mathbb{R}^m
the control, A the n\times n system matrix and
B the n\times m input matrix. This is the linear
system whose free response (u = 0) is x(t) = e^{At} x_0, with e^{At} the matrix exponential.
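As a minimal sketch of these dynamics, the snippet below integrates \dot{x} = A x + B u numerically and compares the free response with the matrix exponential. The double-integrator matrices, the initial state, the helper name `simulate` and the RK4 integrator are all illustrative assumptions, not taken from the text. The example is chosen because A @ A = 0 for this A, so e^{At} = I + A t holds exactly and no matrix-exponential routine is needed.

```python
import numpy as np

# Hypothetical example system: a double integrator (position, velocity)
# driven by a scalar force. These matrices are illustrative only.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])

def simulate(A, B, x0, u_of_t, T, steps):
    """Integrate x' = A x + B u(t) from t = 0 to t = T with classical RK4."""
    dt = T / steps
    x = np.array(x0, dtype=float)

    def f(t, x):
        return A @ x + B @ u_of_t(t)

    for k in range(steps):
        t = k * dt
        k1 = f(t, x)
        k2 = f(t + dt / 2, x + dt / 2 * k1)
        k3 = f(t + dt / 2, x + dt / 2 * k2)
        k4 = f(t + dt, x + dt * k3)
        x = x + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

x0 = [1.0, 0.5]
T = 2.0

# Free response (u = 0): should match e^{AT} x0 = (I + A T) x0 for this A.
x_free = simulate(A, B, x0, lambda t: np.zeros(1), T, steps=200)
x_exact = (np.eye(2) + A * T) @ np.array(x0)

# Forced response under a constant unit control, for comparison.
x_forced = simulate(A, B, x0, lambda t: np.ones(1), T, steps=200)
```

With u = 0 the state simply drifts at its initial velocity. The constant control adds the familiar t^2/2 displacement of uniform acceleration.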
The cost is a quadratic form in the state and the control, integrated over the
horizon and capped by a terminal penalty:
J = \tfrac12 \int_0^T \Big( x^{\mathsf{T}} Q\,x + u^{\mathsf{T}} R\,u \Big)\,dt \;+\; \tfrac12\, x(T)^{\mathsf{T}} S\, x(T).
The factor \tfrac12 is a convention that keeps later derivatives tidy.
Each term is a quadratic form: x^{\mathsf{T}} Q x charges the state for drifting from
zero, u^{\mathsf{T}} R u charges control effort, and
x(T)^{\mathsf{T}} S x(T) charges the final miss.
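To make the cost concrete, here is a hedged sketch that approximates J from sampled trajectories using the trapezoidal rule. The function name `lq_cost`, the sampling layout and the scalar test trajectory x(t) = e^{-t} with u = 0 are assumptions made for illustration. They correspond to the uncontrolled scalar system \dot{x} = -x, for which J can also be computed by hand.

```python
import numpy as np

def lq_cost(xs, us, dt, Q, R, S):
    """Approximate J = 1/2 * integral(x'Qx + u'Ru) dt + 1/2 * x(T)' S x(T).

    xs: array of shape (N+1, n), state samples at t = 0, dt, ..., T
    us: array of shape (N+1, m), control samples at the same times
    The integral is evaluated with the trapezoidal rule.
    """
    # Running cost x'Qx + u'Ru at every sample time.
    running = (np.einsum("ti,ij,tj->t", xs, Q, xs)
               + np.einsum("ti,ij,tj->t", us, R, us))
    integral = dt * (running[0] / 2 + running[1:-1].sum() + running[-1] / 2)
    terminal = xs[-1] @ S @ xs[-1]
    return 0.5 * integral + 0.5 * terminal

# Illustrative scalar data: x(t) = exp(-t), u(t) = 0 on [0, 1].
T, N = 1.0, 1000
ts = np.linspace(0.0, T, N + 1)
dt = T / N
xs = np.exp(-ts)[:, None]
us = np.zeros((N + 1, 1))
Q = np.array([[2.0]])
R = np.array([[1.0]])
S = np.array([[4.0]])

J = lq_cost(xs, us, dt, Q, R, S)

# Hand computation: 1/2 * int_0^1 2 e^{-2t} dt + 1/2 * 4 * e^{-2}
J_exact = 0.5 + 1.5 * np.exp(-2.0)
```

The numerical value agrees with the hand computation up to the discretisation error of the trapezoidal rule.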
The conditions on the weights
The weight matrices are symmetric, and their definiteness is exactly what makes
the problem well-posed.
- Q \succeq 0 and S \succeq 0 (positive-semi-definite): a state penalty is never negative, so straying from the target can never lower the cost. Semi-definite is enough: some directions in the state may simply not be penalised.
- R \succ 0 (strictly positive-definite): every non-zero control incurs a genuine, strictly positive penalty.
Why R must be strictly positive-definite. If some
non-zero control direction u_\star had
u_\star^{\mathsf{T}} R\, u_\star = 0, then effort along
u_\star would be free: the optimiser could pour unlimited,
impulsive control in that direction at no charge, and the minimisation would in general have no
finite solution. Demanding R \succ 0, that is, every eigenvalue of
R strictly positive, closes that loophole, so the optimal control stays
finite and unique. It is the positive-definite R that makes the cost a
genuine bowl in u, with a single bottom to roll to. And
because a positive-definite matrix is invertible, R^{-1} exists, a fact
the optimal control law will lean on directly.
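The definiteness conditions can be checked numerically from the eigenvalues of each symmetric weight. The sketch below is one way to do it. The helper names `is_psd` and `is_pd`, the tolerance and the example matrices are assumptions chosen for illustration. `R_bad` is deliberately singular, so it exhibits a free control direction u_\star of the kind described above.

```python
import numpy as np

def is_symmetric(M, tol=1e-10):
    return np.allclose(M, M.T, atol=tol)

def is_psd(M, tol=1e-10):
    """Symmetric with every eigenvalue >= 0 (up to a small tolerance)."""
    return is_symmetric(M) and np.linalg.eigvalsh(M).min() >= -tol

def is_pd(M, tol=1e-10):
    """Symmetric with every eigenvalue strictly positive (beyond tol)."""
    return is_symmetric(M) and np.linalg.eigvalsh(M).min() > tol

# Illustrative weights (hypothetical values).
Q = np.diag([1.0, 0.0])            # penalises the first state only: semi-definite
R_good = np.diag([2.0, 0.5])       # strictly positive-definite
R_bad = np.array([[1.0, 1.0],
                  [1.0, 1.0]])     # eigenvalues 0 and 2: only semi-definite

# A non-zero control direction that R_bad does not charge for at all.
u_star = np.array([1.0, -1.0])
free_effort = u_star @ R_bad @ u_star

# R_good is invertible, as positive-definiteness guarantees.
R_good_inv = np.linalg.inv(R_good)
```

Here `free_effort` is zero even though `u_star` is not, which is exactly the loophole that R \succ 0 rules out.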
- Dynamics are linear: \dot{x} = A x + B u, x(0) = x_0.
- Cost is quadratic: J = \tfrac12\int_0^T (x^{\mathsf{T}} Q x + u^{\mathsf{T}} R u)\,dt + \tfrac12 x(T)^{\mathsf{T}} S x(T).
- Weights are symmetric with Q \succeq 0, S \succeq 0 and R \succ 0; the strict positive-definiteness of R guarantees a finite, unique optimum and an invertible R.
The harmonic oscillator of control theory
Why does this one problem deserve a whole stage? Because it plays the role in control that the
simple harmonic oscillator plays in physics: the one nontrivial case solvable in
closed form, and the local model of everything else.
- It is exactly solvable. A quadratic cost minimised over linear dynamics has a value function that is itself quadratic, and the optimal control turns out to be a simple linear feedback. No other interesting class collapses so cleanly; the rest of this stage derives that collapse, term by term.
- It is the universal local approximation. Take any smooth nonlinear problem and sit at an operating point. Linearise the dynamics there (the Jacobian gives A and B) and quadraticise the cost (the Hessian gives Q and R), and the LQ problem is what you are left with. So the LQ solution is the leading-order controller for almost any system you will ever meet, which is why we invest in solving it exactly.
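The linearise-and-quadraticise step can be sketched numerically. The example below is an assumption-laden illustration: the pendulum dynamics `f`, the smooth penalty `cost`, the operating point at the origin and the finite-difference helpers `jacobian` and `hessian` are all hypothetical choices, not part of the text. Central differences stand in for analytic derivatives.

```python
import numpy as np

def f(x, u):
    """Hypothetical nonlinear dynamics: a unit pendulum with torque input."""
    theta, omega = x
    return np.array([omega, -np.sin(theta) + u[0]])

def cost(x, u):
    """Hypothetical smooth running penalty with its minimum at the origin."""
    theta, omega = x
    return (1.0 - np.cos(theta)) + 0.5 * omega**2 + 0.5 * u[0]**2

def jacobian(fun, z, eps=1e-6):
    """Central-difference Jacobian of a vector-valued function at z."""
    z = np.asarray(z, dtype=float)
    cols = []
    for i in range(z.size):
        dz = np.zeros_like(z)
        dz[i] = eps
        cols.append((fun(z + dz) - fun(z - dz)) / (2 * eps))
    return np.stack(cols, axis=1)

def hessian(fun, z, eps=1e-4):
    """Central-difference Hessian of a scalar-valued function at z."""
    z = np.asarray(z, dtype=float)
    n = z.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            di = np.zeros(n)
            dj = np.zeros(n)
            di[i] = eps
            dj[j] = eps
            H[i, j] = (fun(z + di + dj) - fun(z + di - dj)
                       - fun(z - di + dj) + fun(z - di - dj)) / (4 * eps**2)
    return H

# Operating point: pendulum hanging at rest, no torque.
x_op = np.zeros(2)
u_op = np.zeros(1)

A = jacobian(lambda x: f(x, u_op), x_op)     # df/dx
B = jacobian(lambda u: f(x_op, u), u_op)     # df/du
Q = hessian(lambda x: cost(x, u_op), x_op)   # d^2 cost / dx^2
R = hessian(lambda u: cost(x_op, u), u_op)   # d^2 cost / du^2
```

With the \tfrac12 convention used in J, the Hessians are the weights themselves: near the operating point the penalty is approximately \tfrac12 x^{\mathsf{T}} Q x + \tfrac12 u^{\mathsf{T}} R u. This particular penalty has no state-control cross term, so that term is not computed here.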
Steepening the bowl
The quadratic cost is, in each variable, a bowl. Take the scalar slice
\tfrac12 q\,x^2 for the state and
\tfrac12 r\,u^2 for the control. Increase the weights
q and r and each parabola steepens: a
larger weight makes that bowl narrower, so the optimiser is punished harder for the slightest
deviation. With q, r > 0 both are genuine bowls curving up from the
origin, which is the geometric picture behind “positive-definite”. The balance between the two
steepnesses is the whole design knob of quadratic control.
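As a small worked equation, the narrowing can be quantified for the state slice. The symbol c, a cost budget, is introduced here only for illustration:

```latex
\tfrac12\, q\, x^2 \le c \iff |x| \le \sqrt{\frac{2c}{q}},
\qquad
\frac{d^2}{dx^2}\Big(\tfrac12\, q\, x^2\Big) = q .
```

The weight q is the curvature of the bowl, and the deviation allowed within a fixed budget shrinks like 1/\sqrt{q}: quadrupling q halves the width. The same holds for r and u.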
The pairing of linear dynamics with a quadratic cost is not an accident of convenience: it is the
richest pairing that still keeps the door to closed-form solution open. Linear dynamics are the
first nontrivial term of the Taylor expansion of the true dynamics about an operating point; a
quadratic cost is the first nontrivial term of any smooth penalty (the constant and linear terms
vanish at a well-chosen operating point). So “linear plus quadratic” is the leading-order local
model of the problem, and the next page shows that solving it reduces the entire problem to a
single matrix differential equation.