LQG and the Separation Principle
This is the summit of the stage. We have built two optimal machines independently: the LQR optimal feedback law u = -Kx, which knows what to do if it can see the state, and the Kalman filter, which reconstructs the state from noisy partial measurements. The obvious question is whether they fit together, and the answer is one of the most elegant results in all of control theory.
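The setting is LQG: Linear dynamics, Quadratic cost, Gaussian noise. It is the stochastic version of LQR in which the controller no longer sees the state, only noisy measurements of it:
\dot{x} = Ax + Bu + w, \qquad y = Cx + v,
with white process noise w (covariance Q_n) and white measurement noise v (covariance R_n), and the time-averaged quadratic cost J = \lim_{T\to\infty} \frac{1}{T}\,\mathbb{E}\big[\int_0^T (x^{\mathsf{T}} Q x + u^{\mathsf{T}} R u)\,dt\big]. (The time average is needed because persistent noise keeps the state from settling at zero, so the un-normalised integral over an infinite horizon would diverge.)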
The separation principle
Here is the punchline. The optimal LQG controller is simply the LQR feedback gain
K applied to the Kalman-filtered state estimate
\hat{x}:
\boxed{\;u^* = -K\,\hat{x}.\;}
You do not co-design the controller and the estimator in some tangled joint optimisation. You design
them separately — each optimal in its own right — and bolt them together. The result
is globally optimal. Concretely:
- Design the controller as if there were no noise. Solve the LQR problem for the gain K = R^{-1} B^{\mathsf{T}} P_c, where P_c solves the control Riccati equation. This is exactly the deterministic LQR design, blind to the measurement noise.
- Design the estimator as if there were no control objective. Build the Kalman gain K_f = P_e C^{\mathsf{T}} R_n^{-1}, where P_e solves the estimation Riccati equation. This is exactly the Kalman design, blind to the cost weights Q, R.
- Plug the estimate into the gain. Run the Kalman filter to produce \hat{x}, and feed back u^* = -K\hat{x}. This combination minimises the expected quadratic cost: no better controller exists.
Estimation and control decouple. Two problems that looked hopelessly entangled —
how should my steering depend on how well I can see? — turn out to be solvable one at a time.
- The optimal LQG control is u^* = -K\hat{x}: the LQR gain K applied to the Kalman estimate \hat{x}.
- The optimal controller and optimal estimator are designed separately and are each optimal for their own subproblem; their combination is globally optimal.
- Two independent Riccati equations run side by side, the control Riccati (for K) and the estimation Riccati (for K_f), neither depending on the other.
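The design steps above can be sketched numerically. This is a minimal sketch, assuming SciPy's `solve_continuous_are` as the Riccati solver and a hypothetical double-integrator plant with position measurement; the matrices, cost weights, and noise covariances are illustrative choices, not values from the text.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical plant (an assumption for illustration): a double integrator
# in which only the position is measured.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

# Control design data: cost weights only (assumed values).
Q = np.eye(2)
R = np.array([[1.0]])

# Estimation design data: noise covariances only (assumed values).
Q_n = B @ B.T            # process noise enters through the input channel
R_n = np.array([[0.1]])

# Step 1: control Riccati  A'P + PA - P B R^-1 B' P + Q = 0, then K = R^-1 B' P_c.
P_c = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P_c)

# Step 2: estimation Riccati  A P + P A' - P C' R_n^-1 C P + Q_n = 0,
# solved as the dual problem by passing A' and C' to the same solver.
P_e = solve_continuous_are(A.T, C.T, Q_n, R_n)
K_f = P_e @ C.T @ np.linalg.inv(R_n)

# Step 3: the LQG law is u = -K @ x_hat, with x_hat supplied by the filter.
print("K   =", K.ravel())
print("K_f =", K_f.ravel())
```

Note that step 1 never touches `Q_n` or `R_n`, and step 2 never touches `Q` or `R`: the two solver calls could be run in either order, or by different people.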
Certainty equivalence
There is a second, equivalent way to state the miracle, called certainty
equivalence: act as if your best estimate were the true state. The optimal
controller uses the very same gain K it would use with perfect
state knowledge — it simply substitutes \hat{x} for the unavailable
x and proceeds with full confidence, as though the estimate carried no
uncertainty at all.
What makes this remarkable is what the controller is allowed to ignore. The estimation-error
covariance P_e — how uncertain the estimate is — does not
change the gain. A foggier sensor produces a worse \hat{x}, and therefore a
higher expected cost, but the controller's response to the estimate it has is unchanged. The
feedback law does not become timid in proportion to its uncertainty; certainty equivalence says the
optimal policy treats the estimate as the truth. (This clean decoupling is special to the linear
Gaussian world; for nonlinear systems the controller generally should hedge against its
uncertainty, and separation fails.)
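Certainty equivalence can be checked on a scalar example, where both Riccati equations have closed-form solutions. The sketch below assumes a scalar plant \dot{x} = ax + bu + w, y = cx + v with hypothetical values for b, c and the weights; the function names are illustrative. It sweeps the sensor noise r_n and prints the error covariance P_e next to the feedback gain K.

```python
import math

def lqr_gain(a, b, q, r):
    """Scalar LQR gain K = b*p/r, with p the positive root of
    2*a*p - (b*p)**2 / r + q = 0."""
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    return b * p / r

def kalman_design(a, c, q_n, r_n):
    """Scalar Kalman gain K_f = p*c/r_n and error covariance p, with p the
    positive root of 2*a*p - (c*p)**2 / r_n + q_n = 0."""
    p = r_n * (a + math.sqrt(a * a + c * c * q_n / r_n)) / (c * c)
    return p * c / r_n, p

a, b, c = 0.3, 1.0, 1.0       # a = 0.3 matches the scalar plant; b, c are assumed
q, r, q_n = 1.0, 1.0, 1.0     # assumed cost weights and process-noise covariance

K = lqr_gain(a, b, q, r)      # computed once: it has no access to r_n
for r_n in (0.01, 0.1, 1.0, 10.0):
    K_f, P_e = kalman_design(a, c, q_n, r_n)
    print(f"r_n={r_n:5.2f}  P_e={P_e:.3f}  K_f={K_f:.3f}  K={K:.3f}")
```

As the sensor gets foggier, P_e grows and K_f shrinks (the filter trusts the measurement less), while K stays put because the noise covariances never enter its formula.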
The two Riccati equations make the decoupling vivid. In the finite-horizon problem the control Riccati equation is integrated backward in time from the terminal cost to produce K, while the estimation Riccati equation is integrated forward from the initial covariance to produce K_f. Neither solution appears in the other equation. The controller designer and the filter designer can work in separate rooms and the assembled system is still optimal: the transpose duality we first met for the feedback law closes into a complete, optimal output-feedback controller.
The closed loop, assembled
Putting the pieces in series gives the full LQG compensator. The plant is driven by
u; its noisy output y feeds the Kalman filter,
which emits the estimate \hat{x}; the gain produces
u = -K\hat{x}, which drives the plant — a closed loop of estimate and
action:
\text{plant} \;\xrightarrow{\;y = Cx + v\;}\; \text{Kalman filter} \;\xrightarrow{\;\hat{x}\;}\; (-K) \;\xrightarrow{\;u = -K\hat{x}\;}\; \text{plant}.
Internally the estimator carries its own dynamics,
\dot{\hat{x}} = A\hat{x} + Bu + K_f(y - C\hat{x}), while the true state
obeys \dot{x} = Ax + Bu + w. Written in terms of the estimation error e = x - \hat{x}, the loop becomes \dot{x} = (A - BK)x + BKe + w and \dot{e} = (A - K_f C)e + w - K_f v, which is block triangular. The combined closed loop therefore has eigenvalues that
split cleanly into the controller eigenvalues of A - BK
and the estimator eigenvalues of A - K_f C: the separation
principle visible in the spectrum itself. Each set is fixed by its own design, independently of the other.
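The eigenvalue split can be verified numerically by stacking the plant state and the estimate into one vector (x, \hat{x}) and comparing spectra. This is a minimal sketch that reuses a hypothetical double-integrator plant with assumed weights; `same_spectrum` is an illustrative helper, not a library function.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical plant and design data (assumptions for illustration).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Q, R = np.eye(2), np.array([[1.0]])
Q_n, R_n = B @ B.T, np.array([[0.1]])

# Independent designs: LQR gain K and Kalman gain K_f.
K = np.linalg.solve(R, B.T @ solve_continuous_are(A, B, Q, R))
K_f = solve_continuous_are(A.T, C.T, Q_n, R_n) @ C.T @ np.linalg.inv(R_n)

# Noise-free closed loop in the coordinates (x, x_hat):
#   x_dot     = A x - B K x_hat
#   x_hat_dot = K_f C x + (A - B K - K_f C) x_hat
A_cl = np.block([[A, -B @ K],
                 [K_f @ C, A - B @ K - K_f @ C]])

def same_spectrum(e1, e2, tol=1e-6):
    """Greedy one-to-one matching of two eigenvalue lists within tol."""
    remaining = list(e2)
    for lam in e1:
        dists = [abs(lam - mu) for mu in remaining]
        j = int(np.argmin(dists))
        if dists[j] > tol:
            return False
        remaining.pop(j)
    return not remaining

eig_cl = np.linalg.eigvals(A_cl)
eig_sep = np.concatenate([np.linalg.eigvals(A - B @ K),
                          np.linalg.eigvals(A - K_f @ C)])
print("closed loop  :", np.round(eig_cl, 4))
print("ctrl U estim :", np.round(eig_sep, 4))
print("match:", same_spectrum(eig_cl, eig_sep))
```

The matching helper is used instead of sorting because complex-conjugate pairs can come back in a different order from the two eigenvalue computations.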
The separation principle is not just pretty — it is why optimal output-feedback control was
tractable enough to fly. The Apollo guidance and navigation system was, in essence, a Kalman filter
fusing star-tracker and accelerometer data into a state estimate, feeding a control law that steered
the spacecraft — estimator and controller designed apart and joined, exactly as separation permits.
The same architecture runs underneath modern aerospace: aircraft autopilots, missile guidance,
satellite attitude control, and the inertial-navigation systems in every airliner are LQG
controllers in spirit, an optimal filter handing a clean state to an optimal regulator. Rudolf
Kálmán's filter and the LQR regulator, separately optimal, became jointly indispensable.
Watching estimate and control close the loop
Below, the scalar plant dx = (ax + u)\,dt + \sigma\,dW (with
a = 0.3, an unstable plant) is steered by
u = -K\hat{x}, where \hat{x} comes from a Kalman
observer \dot{\hat{x}} = a\hat{x} + u + K_f(x - \hat{x}). The true state
starts at x(0) = 2; the estimate starts wrong at
\hat{x}(0) = 0. Watch \hat{x} lock onto
x (the estimator working) while the feedback drives both toward
zero (the controller working). Raise the noise \sigma and the pair jitters
more, but the loop keeps the unstable state regulated — separately designed gains, jointly holding
the system.
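The experiment described above can be reproduced offline with an Euler-Maruyama integration. This is a minimal sketch: a = 0.3 and the initial conditions come from the text, and the observer uses the innovation x - \hat{x} exactly as written there, but the design weights q, r, q_n, r_n, the step size, and the function name `simulate` are assumptions made for illustration.

```python
import numpy as np

def simulate(sigma=0.2, T=10.0, dt=1e-3, seed=0):
    """Simulate dx = (a x + u) dt + sigma dW with u = -K x_hat and the
    observer x_hat_dot = a x_hat + u + K_f (x - x_hat)."""
    a = 0.3                        # unstable scalar plant from the text
    q, r = 1.0, 1.0                # assumed LQR weights
    q_n, r_n = 1.0, 0.1            # assumed noise covariances for the filter design
    K = a + np.sqrt(a**2 + q / r)          # scalar LQR gain (b = 1)
    K_f = a + np.sqrt(a**2 + q_n / r_n)    # scalar Kalman gain (c = 1)

    rng = np.random.default_rng(seed)
    n = int(round(T / dt))
    x = np.empty(n + 1)
    x_hat = np.empty(n + 1)
    x[0], x_hat[0] = 2.0, 0.0      # true state and deliberately wrong estimate

    for k in range(n):
        u = -K * x_hat[k]                            # certainty-equivalent control
        dW = np.sqrt(dt) * rng.standard_normal()     # Brownian increment
        x[k + 1] = x[k] + (a * x[k] + u) * dt + sigma * dW
        x_hat[k + 1] = x_hat[k] + (a * x_hat[k] + u
                                   + K_f * (x[k] - x_hat[k])) * dt
    t = np.linspace(0.0, T, n + 1)
    return t, x, x_hat

for sigma in (0.0, 0.2, 1.0):
    t, x, x_hat = simulate(sigma=sigma)
    print(f"sigma={sigma:.1f}  final x={x[-1]: .4f}  "
          f"final |x - x_hat|={abs(x[-1] - x_hat[-1]):.4f}")
```

The gains are computed once from the assumed design weights and are not retuned as sigma changes, so raising sigma only changes how much the pair jitters, not the feedback law.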