State Estimation
Every controller so far has been allowed to read the full state
x and feed back u = -Kx. Real hardware is
not so lucky: a sensor returns a few noisy numbers, not the whole state. A pendulum cart might
measure the cart position but not the pole's angular velocity; a satellite's star tracker reports
attitude but not angular rate. We see a measurement
y = Cx + (\text{noise}),
where C picks out (a combination of) the states we can actually sense.
Before we can use the feedback law u = -Kx we must first
reconstruct x from this partial, noisy stream. That is
the job of a state estimator, and the cleanest one is the
Luenberger observer.
The naive idea, and why it fails
The first instinct is to run a copy of the model in software. We know the dynamics
\dot{x} = Ax + Bu and we know the input u we
are applying, so simulate
\dot{\hat{x}} = A\hat{x} + Bu
and call \hat{x} our estimate. Define the
estimation error e = x - \hat{x}. Subtracting the two
equations,
\dot{e} = \dot{x} - \dot{\hat{x}} = (Ax + Bu) - (A\hat{x} + Bu) = A e.
The error obeys the open-loop dynamics \dot{e} = Ae, and nothing in that equation is ours to choose. If
A has any unstable eigenvalue (exactly the case for the
inverted
pendulum) the error grows, and a wrong initial guess
\hat{x}(0) never recovers. Even when A is stable, the error decays only as fast as the plant itself.
The open-loop copy ignores the one thing that carries fresh information: the measurement.
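To see the failure numerically, here is a minimal sketch that evaluates e(t) = e^{At} e(0) for an unstable plant. The matrix A and the initial error below are illustrative assumptions (a normalised linearised pendulum with state [angle, angular rate]), not values from the text.

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical unstable plant: angle'' = angle, state x = [angle, angular rate].
# Its eigenvalues are +1 and -1, so one mode is unstable.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])

# Small initial estimation error e(0) = x(0) - xhat(0) (assumed value).
e0 = np.array([0.1, 0.0])

# The open-loop copy gives e' = A e, hence e(t) = expm(A t) e(0).
times = [0.0, 1.0, 2.0, 5.0]
error_norms = [float(np.linalg.norm(expm(A * t) @ e0)) for t in times]

for t, n in zip(times, error_norms):
    print(f"t = {t:3.1f}   |e(t)| = {n:.3f}")
```

The printed norms increase with t: the input u never appears in the error equation, so no choice of input can repair a bad initial guess.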
The Luenberger observer
Fix it by correcting the copy with the measurement. Our model predicts the output
\hat{y} = C\hat{x}; the sensor reports y = Cx.
Their difference, the residual (or innovation)
r = y - C\hat{x},
is everything the measurement knows that the model does not. Feed it back through an
observer gain L:
\dot{\hat{x}} = A\hat{x} + Bu + L\big(y - C\hat{x}\big).
When the estimate is perfect the residual is zero and the observer reduces to the plain model copy;
when the estimate is wrong the residual nudges \hat{x} back toward the
truth. Now redo the error subtraction.
Step 1 — subtract the observer from the true dynamics. With
the sensor noise set aside, so that y = Cx,
\dot{e} = \dot{x} - \dot{\hat{x}} = \big(Ax + Bu\big) - \big(A\hat{x} + Bu + L(Cx - C\hat{x})\big).
Step 2 — cancel the common input and collect. The Bu
terms cancel exactly (the observer is driven by the same u), and
grouping the rest in x - \hat{x} = e,
\dot{e} = A(x - \hat{x}) - LC(x - \hat{x}) = (A - LC)\,e.
Step 3 — read off the error dynamics. The estimation error is governed by
\boxed{\;\dot{e} = (A - LC)\,e.\;}
We no longer suffer the bare A. The matrix
A - LC is ours to shape: choose the observer gain
L so that every eigenvalue of A - LC sits in
the left half-plane, and e(t) \to 0: the estimate converges to the
true state, no matter how wrong the initial guess. The farther left we place the eigenvalues, the
faster the error decays, at the price of a larger gain L that also amplifies the sensor noise.
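As a small check of the boxed result, the sketch below compares the eigenvalues of A with those of A - LC. The plant (a double integrator with position measured) and the gain values are assumptions chosen for illustration; for this plant the characteristic polynomial of A - LC works out to s^2 + l_1 s + l_2, so l_1 = 5, l_2 = 6 should give (s + 2)(s + 3).

```python
import numpy as np

# Hypothetical plant: double integrator, x = [position, velocity], position measured.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
C = np.array([[1.0, 0.0]])

# Candidate observer gain L = [l1, l2]^T (assumed values).
L = np.array([[5.0],
              [6.0]])

# Eigenvalues of the naive copy's error dynamics (A) versus the observer's (A - LC).
open_loop_eigs = np.linalg.eigvals(A)
observer_eigs = np.sort(np.linalg.eigvals(A - L @ C).real)

print("eig(A)      =", open_loop_eigs)
print("eig(A - LC) =", observer_eigs)
```

The bare A has both eigenvalues at the origin, so the naive copy never forgets its initial error; with the gain in place both error eigenvalues are strictly negative.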
- Run a corrected model copy \dot{\hat{x}} = A\hat{x} + Bu + L(y - C\hat{x}), driven by the measurement residual y - C\hat{x}.
- The estimation error e = x - \hat{x} obeys \dot{e} = (A - LC)e.
- The error is driven to zero by choosing L to place the eigenvalues of A - LC in the left half-plane. They can be placed anywhere we like exactly when the pair (A, C) is observable.
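The three points above can be sketched as a simulation that runs the plant and the observer side by side. Everything numeric here is an assumption for illustration: the double-integrator plant, the gain L, the test input u(t) = sin t, the initial conditions, and the noise-free measurement.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical plant: double integrator with a force input, position measured.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
L = np.array([[5.0], [6.0]])   # assumed gain; puts eig(A - LC) at -2 and -3

def u_of_t(t):
    """Arbitrary known input applied to both plant and observer."""
    return np.array([np.sin(t)])

def dynamics(t, z):
    """Stacked state z = [x, xhat]: true plant plus Luenberger observer."""
    x, xhat = z[:2], z[2:]
    u = u_of_t(t)
    y = C @ x                                       # noise-free measurement
    dx = A @ x + B @ u                              # true dynamics
    dxhat = A @ xhat + B @ u + L @ (y - C @ xhat)   # corrected model copy
    return np.concatenate([dx, dxhat])

z0 = np.array([1.0, -0.5, 0.0, 0.0])   # true state, then a deliberately wrong estimate
sol = solve_ivp(dynamics, (0.0, 6.0), z0,
                t_eval=np.linspace(0.0, 6.0, 61), rtol=1e-8, atol=1e-10)

# Norm of the estimation error e = x - xhat at each sample time.
err = np.linalg.norm(sol.y[:2] - sol.y[2:], axis=0)
print("initial error:", err[0], " final error:", err[-1])
```

The true state keeps moving under the input, yet the error shrinks regardless, because Bu cancels out of the error dynamics.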
Why observability is the condition
Being able to place the eigenvalues of A - LC anywhere we like is not
automatic — it requires that the measurement y = Cx actually
sees every mode of the system. That is precisely
observability:
the pair (A, C) is observable iff the observability matrix has full rank,
\mathcal{O} = \begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \\ CA^{n-1} \end{bmatrix}, \qquad \operatorname{rank}\mathcal{O} = n.
When this holds, an L exists for any desired set of observer
eigenvalues. A mode the output cannot see is a mode the residual cannot correct — its error term in
\dot{e} = (A - LC)e is untouchable by L, and if
that mode is unstable the estimate diverges. Observability is to estimation exactly what
controllability is to control.
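The rank test is easy to run numerically. The sketch below builds the observability matrix for an assumed double integrator under two hypothetical sensor choices; `observability_matrix` is a helper written for this example, not a library function.

```python
import numpy as np

def observability_matrix(A, C):
    """Stack C, CA, ..., CA^(n-1) row-wise."""
    n = A.shape[0]
    blocks = [C]
    for _ in range(n - 1):
        blocks.append(blocks[-1] @ A)
    return np.vstack(blocks)

# Hypothetical plant: double integrator, x = [position, velocity].
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
C_position = np.array([[1.0, 0.0]])   # sensor reads position
C_velocity = np.array([[0.0, 1.0]])   # sensor reads velocity only

rank_position = np.linalg.matrix_rank(observability_matrix(A, C_position))
rank_velocity = np.linalg.matrix_rank(observability_matrix(A, C_velocity))

print("position sensor: rank", rank_position, "of", A.shape[0])
print("velocity sensor: rank", rank_velocity, "of", A.shape[0])
```

Measuring position lets the observer infer velocity from how the position changes, so the rank is full. Measuring only velocity leaves the absolute position invisible: no gain L can correct an error in it.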
The duality with pole placement
Look at the two design problems side by side. For the controller we shape
A - BK by choosing K; for the observer we
shape A - LC by choosing L. They are the
same problem transposed. Taking the transpose of the observer matrix,
(A - LC)^{\mathsf{T}} = A^{\mathsf{T}} - C^{\mathsf{T}} L^{\mathsf{T}},
which has the same form as A - BK under the dictionary
A \;\leftrightarrow\; A^{\mathsf{T}}, \qquad B \;\leftrightarrow\; C^{\mathsf{T}}, \qquad K \;\leftrightarrow\; L^{\mathsf{T}}.
So an observer is just a controller designed for the dual system
(A^{\mathsf{T}}, C^{\mathsf{T}}), and observability of
(A, C) is exactly controllability of
(A^{\mathsf{T}}, C^{\mathsf{T}}). Eigenvalues are unchanged by transpose,
so placing the eigenvalues of A - LC is placing the eigenvalues
of the dual A^{\mathsf{T}} - C^{\mathsf{T}}L^{\mathsf{T}}. This duality is
the seed of the whole stage: when we make the design optimal rather than just stable, the controller
Riccati equation will reappear, transposed, as the estimator Riccati equation of the
Kalman
filter.
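In practice the duality means a controller pole-placement routine doubles as an observer design tool. The sketch below uses `scipy.signal.place_poles` on the dual pair (A^T, C^T) and transposes the resulting gain; the plant and the target eigenvalues are assumptions for illustration. The targets are kept distinct because `place_poles` does not accept a pole repeated more times than the rank of its input matrix.

```python
import numpy as np
from scipy.signal import place_poles

# Hypothetical plant: double integrator, x = [position, velocity], position measured.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
C = np.array([[1.0, 0.0]])

# Desired observer eigenvalues (assumed, distinct).
desired = np.array([-2.0, -3.0])

# Design a "controller" K for the dual system (A^T, C^T), so that
# A^T - C^T K has the desired eigenvalues, then transpose: L = K^T.
dual = place_poles(A.T, C.T, desired)
L = dual.gain_matrix.T

observer_eigs = np.sort(np.linalg.eigvals(A - L @ C).real)
print("L =", L.ravel())
print("eig(A - LC) =", observer_eigs)
```

For this plant the gain can be checked by hand: the characteristic polynomial of A - LC is s^2 + l_1 s + l_2, and (s + 2)(s + 3) = s^2 + 5s + 6.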
David Luenberger introduced the observer in 1964 as a graduate student, asking a disarmingly
simple question: if you cannot measure the whole state, can you build a dynamical system whose
output tracks it? The answer — yes, whenever the system is observable — turned state
feedback from a theorist's idealisation into something you could actually wire up, because you no
longer needed a sensor on every state. The Kalman filter, published in 1960 with its continuous-time form following in 1961,
is the statistically optimal cousin of the same idea: choose L not just
to be stable, but to be the best possible trade-off against noise.
Watching the estimate converge
Take the scalar system \dot{x} = ax with a = -0.2
and measurement y = x (so c = 1). The true state
starts at x(0) = 2; the observer starts deliberately
wrong at \hat{x}(0) = -1, so e(0) = x(0) - \hat{x}(0) = 3. The error obeys
\dot{e} = (a - L)e, so e(t) = e(0)\,e^{(a-L)t}
and \hat{x}(t) = x(t) - e(t). Vary the observer gain
L: the closed-loop observer eigenvalue is
a - L, and the larger L is, the further left
that eigenvalue sits and the faster \hat{x} snaps onto the truth. With L = 0 the error still
decays, because a < 0, but only at the sluggish open-loop rate e^{-0.2t}.
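The closed-form expressions above are simple enough to tabulate directly. This sketch uses the values from the text (a = -0.2, x(0) = 2, \hat{x}(0) = -1); the particular gains tried and the time grid are assumptions.

```python
import numpy as np

a = -0.2                   # plant eigenvalue from the text
x0, xhat0 = 2.0, -1.0      # true and estimated initial states from the text
e0 = x0 - xhat0            # initial estimation error

t = np.linspace(0.0, 5.0, 6)
x = x0 * np.exp(a * t)     # true state x(t) = x(0) e^{a t}

def estimate(L):
    """Observer estimate xhat(t) = x(t) - e(0) e^{(a - L) t} for gain L."""
    e = e0 * np.exp((a - L) * t)
    return x - e

# Gains to compare (assumed values); L = 0 is the naive open-loop copy.
gains = (0.0, 1.0, 5.0)
final_errors = {L: abs(x[-1] - estimate(L)[-1]) for L in gains}

for L in gains:
    print(f"L = {L:3.1f}  eigenvalue a - L = {a - L:5.1f}  |e(5)| = {final_errors[L]:.2e}")
```

Each increase in L moves the eigenvalue a - L further left and shrinks the error remaining at t = 5.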