Model-Predictive Control
The LQR feedback law u = -Kx is a thing of beauty: a closed form, computed
once, that reacts instantly to any state. But it has a blind spot: it knows nothing about
limits. A real actuator saturates, a tank must not overflow, a robot arm cannot pass
through a wall. Model-predictive control (MPC) earns its place as the most-used
advanced controller in industry by handling exactly these constraints — and it does so with a single,
almost obvious idea: at every instant, look ahead, plan, but commit to only the next step.
The receding-horizon loop
MPC is also called receding-horizon control, and the name tells the whole story.
Rather than solving the problem once for all time, MPC re-solves a short
finite-horizon optimisation problem over and over, each time starting from the state it actually measures. Here is the
loop, run once per sampling instant k.
Step 1 — measure the current state. Read x_k from the
plant right now. This fresh measurement is what makes MPC a feedback law and not a one-shot plan.
Step 2 — solve a finite-horizon optimal-control problem. Over the next
N steps, choose a whole sequence of moves
u_{k}, u_{k+1}, \dots, u_{k+N-1} to minimise the predicted cost, subject to
the model and every constraint:
\min_{u_{k},\,\dots,\,u_{k+N-1}}\ \sum_{j=0}^{N-1} g\big(x_{k+j},\, u_{k+j}\big) + \varphi\big(x_{k+N}\big) \quad \text{s.t.}\quad x_{k+j+1} = f(x_{k+j}, u_{k+j}),\ \ u \in \mathcal{U},\ \ x \in \mathcal{X}.
Step 3 — apply only the first move. The solver returns a full plan, but MPC throws
almost all of it away. It applies just the very first control,
u_k = u_{k}^{*},
and discards u_{k+1}^{*}, \dots, u_{k+N-1}^{*}.
Step 4 — advance and re-solve. The plant moves to
x_{k+1}, the clock ticks to k+1, and we return
to Step 1. The horizon has slid one step forward — it recedes ahead of us like a horizon you
can never reach — and the freshly measured state folds any disturbance straight back into the next
plan. That re-measurement is the feedback that the optimisation alone could never provide.
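The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a production controller: the scalar plant x_{k+1} = a x_k + b u_k, the weights, the input bound |u| <= 1 and the horizon N = 5 are all made-up values, and the finite-horizon problem is solved with a plain projected-gradient iteration standing in for a real QP solver.

```python
# Receding-horizon loop for a scalar plant x+ = A*x + B*u with |u| <= U_MAX.
# All numbers below are illustrative assumptions, not values from the text.
A, B = 1.0, 1.0
Q, R, P_TERM = 1.0, 1.0, 1.0   # stage weights and terminal weight (quadratic g and phi)
U_MAX = 1.0
N = 5                          # prediction horizon


def rollout(x0, us):
    """Predict x_k .. x_{k+N} from the model for a candidate input sequence."""
    xs = [x0]
    for u in us:
        xs.append(A * xs[-1] + B * u)
    return xs


def gradient(x0, us):
    """Gradient of the predicted cost w.r.t. each input (backward adjoint pass)."""
    xs = rollout(x0, us)
    lam = 2.0 * P_TERM * xs[-1]            # derivative of the terminal cost
    grad = [0.0] * len(us)
    for j in reversed(range(len(us))):
        grad[j] = 2.0 * R * us[j] + B * lam
        lam = 2.0 * Q * xs[j] + A * lam
    return grad


def step_size():
    """1/L, where L is an upper bound on the largest eigenvalue of the cost Hessian."""
    bound = R
    for j in range(1, N + 1):
        weight = P_TERM if j == N else Q
        bound += weight * sum((A ** i * B) ** 2 for i in range(j))
    return 1.0 / (2.0 * bound)


def solve_horizon(x0, iters=500):
    """Step 2: minimise the predicted cost subject to the input bound."""
    us = [0.0] * N
    alpha = step_size()
    for _ in range(iters):
        g = gradient(x0, us)
        # Gradient step, then project back onto the box [-U_MAX, U_MAX].
        us = [min(U_MAX, max(-U_MAX, u - alpha * gi)) for u, gi in zip(us, g)]
    return us


def run_mpc(x0, steps):
    xs, applied = [x0], []
    x = x0                                  # Step 1: the "measured" state
    for _ in range(steps):
        plan = solve_horizon(x)             # Step 2: fresh plan from the current state
        u = plan[0]                         # Step 3: commit to the first move only
        x = A * x + B * u                   # Step 4: plant advances (no disturbance here)
        applied.append(u)
        xs.append(x)
    return xs, applied


xs, applied = run_mpc(5.0, 12)
for k, (x, u) in enumerate(zip(xs, applied)):
    print(f"k={k:2d}  x={x:7.4f}  u={u:7.4f}")
```

In this sketch the plant is simulated with the same model the controller uses, so the re-measurement in Step 1 brings no surprises; adding a disturbance to the line that advances x would show the next plan absorbing it.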
Why industry runs on it: constraints
The reason MPC dominates chemical plants, refineries, power converters and self-driving cars is the
clause hiding in Step 2: u \in \mathcal{U},\ x \in \mathcal{X}. LQR's
closed form u = -Kx has no way to express “the valve cannot open past
100%” or “the temperature must stay below the limit”; if the ideal gain demands more
than the actuator can give, the hardware clips the command and the optimality the gain was designed for is lost. MPC instead solves a genuine
constrained optimisation online (when g and \varphi are quadratic and
the dynamics and constraints are linear, exactly a quadratic program, or QP), so the
limits are honoured by construction. It is the same constrained optimisation
machinery we met for equality constraints, now carrying inequalities too, and solved afresh every
sampling instant.
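A small numerical illustration of the gap between what the LQR gain demands and what a limited actuator delivers. The plant and weights (a = b = q = r = 1), the limit u_max = 1 and the state x = 5 are assumptions chosen for the example; the gain comes from iterating the scalar discrete-time Riccati recursion to its fixed point.

```python
# Assumed scalar plant x+ = a*x + b*u and quadratic weights (illustrative values).
a, b = 1.0, 1.0
q, r = 1.0, 1.0
u_max = 1.0                     # assumed actuator limit |u| <= u_max

# Iterate the scalar Riccati recursion until it settles on the LQR cost weight.
p = q
for _ in range(200):
    p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
k_gain = a * b * p / (r + b * b * p)

x = 5.0
u_demanded = -k_gain * x                              # what u = -Kx asks for
u_delivered = max(-u_max, min(u_max, u_demanded))     # what the actuator can give

print(f"K = {k_gain:.4f}")
print(f"demanded  u = {u_demanded:.4f}")
print(f"delivered u = {u_delivered:.4f}")
```

The gain itself carries no knowledge of u_max; the clipping happens outside the control law, which is why the limit has to enter the optimisation problem if the plan is to account for it.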
Nothing is free. That online solve is the trade-off: where LQR multiplies a precomputed matrix by the
state in microseconds, MPC must solve a fresh QP within each sampling period. Faster computers and
slick QP solvers are precisely what turned MPC from a 1980s refinery curiosity into the default
advanced controller everywhere.
- At each instant, MPC solves a finite-horizon optimal-control problem from the current measured state, applies only the first control move, then re-solves at the next instant (the receding horizon).
- Its decisive advantage over LQR is handling constraints on inputs and states directly, by solving a constrained optimisation (a QP in the linear-quadratic case) online.
- The price is online computation: a fresh optimisation every sampling period instead of one precomputed gain.
MPC contains LQR
The cleanest way to see what MPC is doing is to switch every constraint off and push the horizon to
infinity. With no limits binding and N \to \infty, the linear-quadratic MPC
problem is exactly the infinite-horizon LQ problem, and its receding-horizon solution is the constant
gain u = -Kx from the Algebraic Riccati Equation. In a slogan:
\text{unconstrained, infinite-horizon MPC} \;=\; \text{LQR}.
So MPC is not a rival to LQR but a generalisation of it: keep the same quadratic cost and linear model,
add the ability to obey hard limits and to re-plan with a fresh measurement, and pay for it in online
compute. When the constraints never bite, MPC quietly reproduces the optimal feedback law we already
trust.
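The slogan can be checked numerically for a scalar plant. The sketch below assumes x_{k+1} = a x_k + b u_k with stage cost q x^2 + r u^2 (the values a = 1.2, b = q = r = 1 are illustrative only) and uses the standard fact that, without constraints, the first move of the finite-horizon problem is u_k = -K_0 x_k, with K_0 obtained from a backward Riccati recursion. The function names are hypothetical helpers for this example.

```python
import math


def lqr_gain(a, b, q, r):
    """Gain and cost weight from the scalar discrete-time algebraic Riccati equation.

    The ARE reduces to b^2 P^2 + (r - q b^2 - a^2 r) P - q r = 0; take the positive root.
    """
    c = r - q * b * b - a * a * r
    p = (-c + math.sqrt(c * c + 4.0 * b * b * q * r)) / (2.0 * b * b)
    return a * b * p / (r + b * b * p), p


def first_move_gain(a, b, q, r, p_terminal, horizon):
    """Gain K_0 of the first move of an unconstrained N-step problem."""
    p = p_terminal                      # cost-to-go weight at the end of the horizon
    gain = 0.0
    for _ in range(horizon):            # sweep backwards from step N-1 to step 0
        gain = a * b * p / (r + b * b * p)
        p = q + a * a * p - a * b * p * gain
    return gain


a, b, q, r = 1.2, 1.0, 1.0, 1.0         # assumed values for illustration
k_lqr, p_are = lqr_gain(a, b, q, r)

for n in (1, 2, 5, 20):
    k0 = first_move_gain(a, b, q, r, p_terminal=q, horizon=n)
    print(f"N={n:2d}  first-move gain={k0:.6f}  gap to LQR={abs(k0 - k_lqr):.2e}")

# With the ARE solution as terminal weight, even a short horizon matches LQR.
k_short = first_move_gain(a, b, q, r, p_terminal=p_are, horizon=3)
print(f"N= 3 with terminal weight P: gain={k_short:.6f}  (LQR gain={k_lqr:.6f})")
```

The second check hints at a common design choice: taking the terminal cost \varphi to be the LQR cost-to-go lets a short horizon behave like an infinite one whenever no constraint is active.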
Watching the horizon recede
Below, a scalar plant is driven toward the target x = 0. Slide
current time k forward and watch the loop run.
The solid curve is what MPC has actually executed up to now; the shaded band is the
prediction window of N steps it is planning over right now;
the dashed curve inside it is the freshly re-optimised plan; and the highlighted segment is the
single first move MPC will actually commit to before re-measuring and re-solving. As
k advances, the window slides forward — the horizon receding — and only one
executed step is added each time.
MPC was born in the oil refineries of the late 1970s, where a unit might have dozens of coupled
inputs and outputs all pressing against safety and quality limits — exactly the constrained,
multivariable setting where a single LQR gain is helpless. For decades it lived mostly in slow
chemical processes, because the online QP was too heavy to solve in real time for anything fast.
Then solvers and silicon caught up. Today MPC steers the trajectory of self-driving cars and
autonomous-racing platforms at the limits of tyre grip, schedules power in grid-scale batteries,
and regulates the plasma in fusion reactors — anywhere the future matters and the limits are hard.