State Estimation

Every controller so far has been allowed to read the full state x and feed back u = -Kx. Real hardware is not so lucky: a sensor returns a few noisy numbers, not the whole state. A pendulum cart might measure the cart position but not the pole's angular velocity; a satellite's star tracker reports attitude but not angular rate. We see a measurement

y = Cx + (\text{noise}),

where C picks out (a combination of) the states we can actually sense. Before we can use the feedback law u = -Kx we must first reconstruct x from this partial, noisy stream. That is the job of a state estimator, and the cleanest one is the Luenberger observer.
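To make the measurement model concrete, here is a minimal sketch for the pendulum cart. The state ordering, the choice of two sensors (cart position and pole angle), the noise level and the function name measure are all assumptions made for illustration, not something fixed by the text.

```python
import numpy as np

# Assumed state ordering (illustrative only):
# x = [cart position, cart velocity, pole angle, pole angular velocity]
# Assumed sensors: cart position and pole angle; neither velocity is measured.
C = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])

def measure(x, noise_std=0.01, rng=None):
    """Return y = C x + noise, with zero-mean Gaussian noise (an assumed noise model)."""
    rng = np.random.default_rng() if rng is None else rng
    return C @ x + noise_std * rng.standard_normal(C.shape[0])

x_true = np.array([0.3, -0.1, 0.05, 0.2])
y = measure(x_true, rng=np.random.default_rng(0))
print("true state :", x_true)
print("measurement:", y)   # two noisy numbers, not the four-dimensional state
```

The estimator's task is to recover all four entries of x from the two-entry stream y.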

The naive idea, and why it fails

The first instinct is to run a copy of the model in software. We know the dynamics \dot{x} = Ax + Bu and we know the input u we are applying, so simulate

\dot{\hat{x}} = A\hat{x} + Bu

and call \hat{x} our estimate. Define the estimation error e = x - \hat{x}. Subtracting the two equations,

\dot{e} = \dot{x} - \dot{\hat{x}} = (Ax + Bu) - (A\hat{x} + Bu) = A e.

The error obeys the open-loop dynamics \dot{e} = Ae. If A has any unstable eigenvalue, which is exactly the case for the inverted pendulum, the error grows and a wrong initial guess \hat{x}(0) never recovers. Even when A is stable, the error decays only at the plant's own open-loop rate, which we have no way to speed up. The open-loop copy ignores the one thing that carries fresh information: the measurement.
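A quick numerical check of this failure, as a sketch under stated assumptions: the 2x2 matrix A below is a hypothetical pendulum-like plant with eigenvalues +1 and -1, and the integration uses a plain forward-Euler step. Neither choice comes from the text.

```python
import numpy as np

# Hypothetical unstable plant (assumed for illustration): eigenvalues +1 and -1.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])

def open_loop_error(e0, t_final, dt=1e-3):
    """Integrate the error dynamics e' = A e with forward Euler; return e(t_final)."""
    e = np.array(e0, dtype=float)
    for _ in range(int(round(t_final / dt))):
        e = e + dt * (A @ e)
    return e

e0 = np.array([0.1, 0.0])            # a small initial estimation error
e5 = open_loop_error(e0, t_final=5.0)

print("eigenvalues of A:", np.linalg.eigvals(A))
print("|e(0)| =", np.linalg.norm(e0))
print("|e(5)| =", np.linalg.norm(e5))  # much larger than |e(0)|
```

Even a 0.1 error in the initial guess is amplified by the unstable mode, because nothing in the model copy ever consults the sensor.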

The Luenberger observer

Fix it by correcting the copy with the measurement. Our model predicts the output \hat{y} = C\hat{x}; the sensor reports y = Cx. Their difference, the residual (or innovation)

r = y - C\hat{x},

is everything the measurement knows that the model does not. Feed it back through an observer gain L:

\dot{\hat{x}} = A\hat{x} + Bu + L\big(y - C\hat{x}\big).

When the estimate is perfect the residual is zero and the observer reduces to the plain model copy; when the estimate is wrong the residual nudges \hat{x} back toward the truth. Now redo the error subtraction.

Step 1 — subtract the observer from the true dynamics. With y = Cx,

\dot{e} = \dot{x} - \dot{\hat{x}} = \big(Ax + Bu\big) - \big(A\hat{x} + Bu + L(Cx - C\hat{x})\big).

Step 2 — cancel the common input and collect. The Bu terms cancel exactly (the observer is driven by the same u), and grouping the rest in x - \hat{x} = e,

\dot{e} = A(x - \hat{x}) - LC(x - \hat{x}) = (A - LC)\,e.

Step 3 — read off the error dynamics. The estimation error is governed by

\boxed{\;\dot{e} = (A - LC)\,e.\;}

We no longer suffer the bare A. The matrix A - LC is ours to shape: choose the observer gain L so that every eigenvalue of A - LC sits in the left half-plane, and e(t) \to 0 — the estimate converges to the true state, no matter how wrong the initial guess. The farther left we place the eigenvalues, the faster the error decays.
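The error dynamics can be checked in simulation. The sketch below assumes a hypothetical unstable plant with a position-only measurement, a hand-picked gain L that puts the eigenvalues of A - LC at -2 and -3, a sinusoidal test input and forward-Euler integration; all of these are illustrative choices.

```python
import numpy as np

# Hypothetical plant (assumed): unstable, only the first state is measured.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
B = np.array([0.0, 1.0])
C = np.array([[1.0, 0.0]])
# Hand-picked gain: det(sI - (A - LC)) = s^2 + 5s + 6, so the eigenvalues are -2 and -3.
L = np.array([[5.0],
              [7.0]])

def simulate(x0, xhat0, t_final=5.0, dt=1e-3):
    """Run plant and observer side by side; return the error norm at every step."""
    x = np.array(x0, dtype=float)
    xhat = np.array(xhat0, dtype=float)
    err = [np.linalg.norm(x - xhat)]
    for k in range(int(round(t_final / dt))):
        u = np.sin(k * dt)                 # any known input; it cancels in the error
        y = C @ x                          # noise-free measurement
        x_dot = A @ x + B * u
        xhat_dot = A @ xhat + B * u + L @ (y - C @ xhat)
        x = x + dt * x_dot
        xhat = xhat + dt * xhat_dot
        err.append(np.linalg.norm(x - xhat))
    return np.array(err)

observer_eigs = np.linalg.eigvals(A - L @ C)
err = simulate(x0=[1.0, 0.0], xhat0=[0.0, 0.0])

print("eig(A)      :", np.linalg.eigvals(A))
print("eig(A - LC) :", observer_eigs)
print("error norm at t=0 and t=5:", err[0], err[-1])
```

The plant itself is unstable and its state grows, yet the estimation error shrinks, because the error is governed by A - LC and not by A.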

Why observability is the condition

Being able to place the eigenvalues of A - LC anywhere we like is not automatic — it requires that the measurement y = Cx actually sees every mode of the system. That is precisely observability: the pair (A, C) is observable iff the observability matrix has full rank,

\mathcal{O} = \begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \\ CA^{n-1} \end{bmatrix}, \qquad \operatorname{rank}\mathcal{O} = n.

When this holds, an L exists for any desired set of observer eigenvalues (complex ones must come in conjugate pairs so that L stays real). A mode the output cannot see is a mode the residual cannot correct: its error term in \dot{e} = (A - LC)e is untouchable by L, and if that mode is unstable the estimation error grows without bound. Observability is to estimation exactly what controllability is to control.
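The rank test is easy to run numerically. The sketch below stacks C, CA, ..., CA^{n-1} and checks the rank. The two example systems and the helper names observability_matrix and is_observable are assumptions chosen for illustration.

```python
import numpy as np

def observability_matrix(A, C):
    """Stack C, CA, ..., CA^(n-1) into the observability matrix."""
    n = A.shape[0]
    blocks = [C]
    for _ in range(n - 1):
        blocks.append(blocks[-1] @ A)
    return np.vstack(blocks)

def is_observable(A, C):
    """True when the observability matrix has rank n."""
    return np.linalg.matrix_rank(observability_matrix(A, C)) == A.shape[0]

# Example 1 (assumed): the states are coupled, so measuring position reveals velocity.
A_coupled = np.array([[0.0, 1.0],
                      [1.0, 0.0]])
C_position = np.array([[1.0, 0.0]])

# Example 2 (assumed): two decoupled modes, and the sensor sees only the first one.
A_decoupled = np.diag([1.0, -1.0])
C_first = np.array([[1.0, 0.0]])

print(is_observable(A_coupled, C_position))   # True
print(is_observable(A_decoupled, C_first))    # False
```

In the second system the output never depends on the second state, so no gain L can move that mode's error eigenvalue.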

The duality with pole placement

Look at the two design problems side by side. For the controller we shape A - BK by choosing K; for the observer we shape A - LC by choosing L. They are the same problem transposed. Taking the transpose of the observer matrix,

(A - LC)^{\mathsf{T}} = A^{\mathsf{T}} - C^{\mathsf{T}} L^{\mathsf{T}},

which has the same form as A - BK under the dictionary

A \;\leftrightarrow\; A^{\mathsf{T}}, \qquad B \;\leftrightarrow\; C^{\mathsf{T}}, \qquad K \;\leftrightarrow\; L^{\mathsf{T}}.

So an observer is just a controller designed for the dual system (A^{\mathsf{T}}, C^{\mathsf{T}}), and observability of (A, C) is exactly controllability of (A^{\mathsf{T}}, C^{\mathsf{T}}). Eigenvalues are unchanged by transpose, so placing the eigenvalues of A - LC is placing the eigenvalues of the dual A^{\mathsf{T}} - C^{\mathsf{T}}L^{\mathsf{T}}. This duality is the seed of the whole stage: when we make the design optimal rather than just stable, the controller Riccati equation will reappear, transposed, as the estimator Riccati equation of the Kalman filter.
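The duality gives a practical recipe: run any pole-placement routine on (A^{\mathsf{T}}, C^{\mathsf{T}}) and transpose the result. The sketch below uses Ackermann's formula, which is a choice made here and not a method from the text, and which only covers the single-output case. The plant and the target eigenvalues -2 and -3 are also assumptions.

```python
import numpy as np

def acker(A, B, poles):
    """Single-input gain K (1 x n) with eig(A - B K) = poles, via Ackermann's formula."""
    n = A.shape[0]
    # Controllability matrix [B, AB, ..., A^(n-1) B]
    ctrb = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])
    # Desired characteristic polynomial, highest power first, evaluated at A
    coeffs = np.real(np.poly(poles))
    p_of_A = sum(c * np.linalg.matrix_power(A, n - i) for i, c in enumerate(coeffs))
    last_row = np.zeros((1, n))
    last_row[0, -1] = 1.0
    return last_row @ np.linalg.solve(ctrb, p_of_A)

def observer_gain(A, C, poles):
    """Design L by placing the poles of the dual pair (A^T, C^T) and transposing."""
    return acker(A.T, C.T, poles).T

# Hypothetical plant (assumed): unstable, single position measurement.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
C = np.array([[1.0, 0.0]])

L = observer_gain(A, C, poles=[-2.0, -3.0])
print("L =", L.ravel())
print("eig(A - LC) =", np.linalg.eigvals(A - L @ C))
```

For this small example the gain can be confirmed by hand: the characteristic polynomial of A - LC is s^2 + l_1 s + (l_2 - 1), and matching it to (s + 2)(s + 3) gives l_1 = 5, l_2 = 7. Ackermann's formula becomes numerically fragile for larger systems, where a dedicated pole-placement routine is the safer choice.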

David Luenberger introduced the observer in 1964, in work that grew out of his graduate studies, asking a disarmingly simple question: if you cannot measure the whole state, can you build a dynamical system whose output tracks it? The answer (yes, whenever the system is observable) turned state feedback from a theorist's idealisation into something you could actually wire up, because you no longer needed a sensor on every state. The Kalman filter, published four years earlier in 1960 (its continuous-time form, the Kalman-Bucy filter, followed in 1961), is the statistically optimal cousin of the same idea: choose L not just to be stable, but to be the best possible trade-off against noise.

Watching the estimate converge

Take the scalar system \dot{x} = ax with a = -0.2 and measurement y = x (so c = 1). The true state starts at x(0) = 2; the observer starts deliberately wrong at \hat{x}(0) = -1, so the initial error is e(0) = 3. The error obeys \dot{e} = (a - L)e, so e(t) = e(0)\,e^{(a-L)t} and \hat{x}(t) = x(t) - e(t). Now vary the observer gain L: the observer eigenvalue is a - L, and the larger L is, the further left that eigenvalue sits and the faster \hat{x} snaps onto the truth. With L = 0 the error decays only at the plant's own slow rate e^{-0.2t}.
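The scalar example can be tabulated directly from the closed-form solution. In the sketch below the particular gains and the sample time t = 1 are arbitrary choices made for illustration, and the function name estimate is hypothetical.

```python
import math

a = -0.2                  # plant eigenvalue
x0, xhat0 = 2.0, -1.0     # true initial state and the deliberately wrong guess
e0 = x0 - xhat0           # initial estimation error

def estimate(t, L):
    """Closed-form true state x(t) and estimate xhat(t) = x(t) - e(t) for gain L."""
    x = x0 * math.exp(a * t)
    e = e0 * math.exp((a - L) * t)
    return x, x - e

gains = (0.0, 1.0, 5.0)   # arbitrary illustrative gains
errors_at_1s = [abs(e0 * math.exp((a - L) * 1.0)) for L in gains]

for L, err in zip(gains, errors_at_1s):
    x, xhat = estimate(1.0, L)
    print(f"L = {L:3.1f}  eigenvalue a - L = {a - L:5.1f}  "
          f"x(1) = {x:.4f}  xhat(1) = {xhat:.4f}  |e(1)| = {err:.4f}")
```

Each increase in L moves the eigenvalue a - L further left and shrinks the error remaining after one second.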