The Algebraic Riccati Equation

The differential Riccati equation tracks a time-varying matrix P(t) backward from the terminal time. But its scalar picture revealed something: run the horizon long enough and P(t) stops moving, settling onto a constant plateau well before t = 0. Send the horizon to infinity and that plateau is all that remains. Solving for it directly turns a differential equation into an algebraic one.

From differential to algebraic

Consider the infinite-horizon LQ problem, T \to \infty, with no terminal cost. For a time-invariant system the cost-to-go from a given state no longer depends on when you start, so the value matrix is constant: P(t) \equiv P and therefore \dot{P} = 0. Drop the \dot{P} term from the differential Riccati equation and the time derivative vanishes, leaving a purely algebraic matrix equation — the Algebraic Riccati Equation (ARE):

A^{\mathsf{T}} P + P A - P B R^{-1} B^{\mathsf{T}} P + Q = 0.

It is the steady state of the Riccati flow. A symmetric matrix equation may have several solutions, but the one we want is the unique symmetric positive-semi-definite P \succeq 0 — the stabilising solution, the one that makes the closed loop stable. Its existence and uniqueness are guaranteed when the pair (A, B) is controllable (more precisely, stabilisable): you must be able to steer every unstable mode in order to pay a finite cost for it.

A fully worked scalar example

With scalar a, b, q, r the matrices are numbers and the transposes disappear. The ARE A^{\mathsf{T}}P + PA - PBR^{-1}B^{\mathsf{T}}P + Q = 0 becomes

2 a p - \frac{b^2}{r}\, p^2 + q = 0,

a quadratic in p. The 2ap is A^{\mathsf{T}}P + PA = 2ap; the b^2 p^2 / r is P B R^{-1} B^{\mathsf{T}} P.

Step 1 — put it in standard form. Multiply by -1 and order by powers of p:

\frac{b^2}{r}\, p^2 - 2 a p - q = 0.

Step 2 — apply the quadratic formula. With leading coefficient b^2/r,

p = \frac{2a \pm \sqrt{4a^2 + 4\,(b^2/r)\,q}}{2\,(b^2/r)} = \frac{a + \sqrt{a^2 + b^2 q / r}}{b^2 / r},

taking the + root. Step 3 — pick the stabilising root. The two roots have opposite signs (their product is -qr/b^2 \le 0); only the positive one gives p > 0, the genuine positive-definite cost matrix — the negative root would make the value function a downward parabola, not a cost. So we keep p_+ > 0.

Step 4 — read off the gain. The feedback gain is K = R^{-1} B^{\mathsf{T}} P = b p / r.

Plug in numbers. Take a = 0, b = 1, q = 1, r = 1. Then b^2/r = 1 and

p = \frac{0 + \sqrt{0 + 1}}{1} = 1, \qquad K = \frac{b\,p}{r} = \frac{1 \cdot 1}{1} = 1.

A clean p = 1 and gain K = 1 — exactly the u^\* = -x that the HJB worked example produced for \dot{x} = u, \int(x^2 + u^2). The general LQ machinery reproduces the hand-solved special case.

For the infinite-horizon problem \dot{P} = 0, and the differential Riccati equation becomes the ARE A^{\mathsf{T}} P + P A - P B R^{-1} B^{\mathsf{T}} P + Q = 0.
The relevant solution is the unique symmetric P \succeq 0 (the stabilising root), guaranteed when (A, B) is controllable/stabilisable.
Scalar case: (b^2/r)p^2 - 2ap - q = 0, take the positive root p_+ = \dfrac{a + \sqrt{a^2 + b^2 q/r}}{b^2/r}, gain K = bp_+/r.

The quadratic and its stabilising root

Fix b = 1 and plot the left-hand side g(p) = 2a\,p - p^2/r + q against p. It is a downward parabola, and the ARE asks where it crosses zero. Slide a, q and r: there are always two crossings of opposite sign, and the optimal cost is the positive one (the green marker). The readout gives that root p_+ and the corresponding gain K = p_+/r. Notice how raising q (caring more about the state) pushes the root — and the gain — up.

A matrix ARE is not solved by a quadratic formula — there is no such thing for matrices. The standard route is the Hamiltonian matrix \begin{bmatrix} A & -BR^{-1}B^{\mathsf{T}} \\ -Q & -A^{\mathsf{T}} \end{bmatrix}: its stable eigenvectors span an invariant subspace, and P is recovered from that subspace by a matrix division — itself an inverse. Every numerical library (MATLAB's care, SciPy's solve_continuous_are) does exactly this, returning the stabilising P in microseconds. The scalar quadratic here is just the one-dimensional shadow of that computation.