Model-Predictive Control

The LQR feedback law u = -Kx is a thing of beauty — a closed form, computed once, that reacts instantly to any state. But it has a blind spot: it knows nothing about limits. A real actuator saturates, a tank must not overflow, a robot arm cannot pass through a wall. Model-predictive control (MPC) earns its place as the most-used advanced controller in industry by handling exactly these constraints — and it does so with a single, almost obvious idea: at every instant, look ahead, plan, but commit to only the next step.

The receding-horizon loop

MPC is also called receding-horizon control, and the name tells the whole story. Rather than solving the problem once for all time, MPC re-solves a short finite-horizon optimisation problem over and over, each time starting from the state it actually measures. Here is the loop, run once per sampling instant k.

Step 1 — measure the current state. Read x_k from the plant right now. This fresh measurement is what makes MPC a feedback law and not a one-shot plan.

Step 2 — solve a finite-horizon optimal-control problem. Over the next N steps, choose a whole sequence of moves u_{k}, u_{k+1}, \dots, u_{k+N-1} to minimise the predicted cost, subject to the model and every constraint:

\min_{u_{k},\,\dots,\,u_{k+N-1}}\ \sum_{j=0}^{N-1} g\big(x_{k+j},\, u_{k+j}\big) + \varphi\big(x_{k+N}\big) \quad \text{s.t.}\quad x_{k+j+1} = f(x_{k+j}, u_{k+j}),\ \ u \in \mathcal{U},\ \ x \in \mathcal{X}.

Step 3 — apply only the first move. The solver returns a full plan, but MPC throws almost all of it away. It applies just the very first control,

u_k = u_{k}^{*},

and discards u_{k+1}^{*}, \dots, u_{k+N-1}^{*}.

Step 4 — advance and re-solve. The plant moves to x_{k+1}, the clock ticks to k+1, and we return to Step 1. The horizon has slid one step forward — it recedes ahead of us like a horizon you can never reach — and the freshly measured state folds any disturbance straight back into the next plan. That re-measurement is the feedback that the optimisation alone could never provide.
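The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production controller: the scalar plant x_{k+1} = A x_k + B u_k, the weights, the input limit |u| \le 1 and the function names (gradient, solve_horizon, run_mpc) are all hypothetical choices made for this example, and the horizon problem is solved only approximately, with a plain projected-gradient iteration standing in for a real QP solver.

```python
# Hypothetical scalar plant x_{k+1} = A*x_k + B*u_k with input limit |u| <= U_MAX.
A, B = 1.1, 1.0                 # open-loop unstable (assumed values)
Q, R, P_TERM = 1.0, 0.1, 1.0    # stage and terminal weights (assumed values)
U_MAX = 1.0                     # actuator limit (assumed value)
N = 8                           # prediction horizon

# Safe step size 1/L, where L is an upper bound on the largest Hessian eigenvalue.
GAMMA_SQ = sum((N - d) * (A ** d * B) ** 2 for d in range(N))
STEP = 1.0 / (2.0 * (max(Q, P_TERM) * GAMMA_SQ + R))


def clip(u):
    """Project a single input onto the admissible interval [-U_MAX, U_MAX]."""
    return max(-U_MAX, min(U_MAX, u))


def gradient(x0, u_seq):
    """Gradient of sum_j (Q x_j^2 + R u_j^2) + P_TERM x_N^2 with respect to u_seq."""
    xs = [x0]
    for u in u_seq:                        # forward pass: predict with the model
        xs.append(A * xs[-1] + B * u)
    lam = 2.0 * P_TERM * xs[-1]            # sensitivity of the cost to x_N
    grad = [0.0] * len(u_seq)
    for j in reversed(range(len(u_seq))):  # backward (adjoint) pass
        grad[j] = 2.0 * R * u_seq[j] + B * lam
        lam = 2.0 * Q * xs[j] + A * lam
    return grad


def solve_horizon(x0, u_init, iters=2000):
    """Step 2: approximately solve the box-constrained problem by projected gradient."""
    u_seq = [clip(u) for u in u_init]
    for _ in range(iters):
        g = gradient(x0, u_seq)
        u_seq = [clip(u - STEP * gj) for u, gj in zip(u_seq, g)]
    return u_seq


def run_mpc(x0, steps=20):
    """Receding-horizon loop; returns the executed states and applied inputs."""
    x, plan = x0, [0.0] * N
    xs, us = [x0], []
    for _ in range(steps):
        plan = solve_horizon(x, plan)      # Steps 1-2: take the current x, re-plan
        u = plan[0]                        # Step 3: apply only the first move
        x = A * x + B * u                  # Step 4: the plant advances one step
        plan = plan[1:] + [0.0]            # shifted plan warm-starts the next solve
        xs.append(x)
        us.append(u)
    return xs, us


xs, us = run_mpc(5.0)
print("first inputs:", [round(u, 3) for u in us[:4]])
print("final state:", xs[-1])
```

In this sketch the simulated plant and the prediction model are the same equation, so there is no disturbance to reject; in a real loop the line that advances x would be replaced by a fresh measurement. Every applied input stays inside the limit because the projection enforces it at each iteration, which is the property the constraint discussion in the next section relies on.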

Why industry runs on it: constraints

The reason MPC is so widely used in chemical plants, refineries, power converters and self-driving cars is the clause hiding in Step 2: u \in \mathcal{U},\ x \in \mathcal{X}. LQR's closed form u = -Kx has no way to express “the valve cannot open past 100%” or “the temperature must stay below the limit”; if the ideal gain demands more than the actuator can give, the actuator saturates, the plant receives a different input from the one the formula computed, and the optimality guarantee behind the gain no longer applies. MPC instead solves a genuine constrained optimisation online (when g and \varphi are quadratic and the dynamics and constraints are linear, exactly a quadratic program, or QP), so every plan it returns honours the limits by construction. It is the same constrained optimisation machinery we met for equality constraints, now carrying inequalities too, and solved afresh every sampling instant.
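To make the QP concrete, here is one common way to write it, sketched under assumed notation that does not appear above: linear dynamics f(x, u) = A x + B u, stage cost g(x, u) = x^\top Q x + u^\top R u and terminal cost \varphi(x) = x^\top P x. Stacking the predicted states into X and the planned moves into U, the model gives every prediction as a linear function of the measured state and the plan:

X = \Phi x_k + \Gamma U, \quad X = \begin{bmatrix} x_{k+1} \\ \vdots \\ x_{k+N} \end{bmatrix},\ \ U = \begin{bmatrix} u_{k} \\ \vdots \\ u_{k+N-1} \end{bmatrix},\ \ \Phi = \begin{bmatrix} A \\ A^2 \\ \vdots \\ A^N \end{bmatrix},\ \ \Gamma = \begin{bmatrix} B & 0 & \cdots & 0 \\ AB & B & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ A^{N-1}B & A^{N-2}B & \cdots & B \end{bmatrix}.

Substituting into the cost eliminates the states and leaves a quadratic in U alone:

J(U) = U^\top \big(\Gamma^\top \bar{Q}\, \Gamma + \bar{R}\big)\, U + 2\, x_k^\top \Phi^\top \bar{Q}\, \Gamma\, U + \text{const}, \quad \bar{Q} = \mathrm{diag}(Q, \dots, Q, P),\ \ \bar{R} = \mathrm{diag}(R, \dots, R).

Here the constant collects the terms that depend only on x_k. Linear limits on inputs and states become linear inequalities in U (for example u_{\min} \le u_{k+j} \le u_{\max} for every j, and bounds on X rewritten through X = \Phi x_k + \Gamma U), which is the form a QP solver expects. The measured state x_k enters only through the linear term and the right-hand sides of the state inequalities, so the quadratic term can be built once and only those pieces refreshed at each sampling instant.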

Nothing is free. That online solve is the trade-off: where LQR multiplies a precomputed matrix by the state in microseconds, MPC must solve a fresh QP within each sampling period. Faster computers and efficient QP solvers are precisely what turned MPC from a 1980s refinery curiosity into the default advanced controller across industry.

At each instant, MPC solves a finite-horizon optimal-control problem from the current measured state, applies only the first control move, then re-solves at the next instant — the receding horizon.
Its decisive advantage over LQR is handling constraints on inputs and states directly, by solving a constrained optimisation (a QP in the linear-quadratic case) online.
The price is online computation: a fresh optimisation every sampling period instead of one precomputed gain.

MPC contains LQR

The cleanest way to see what MPC is doing is to switch every constraint off and push the horizon to infinity. With no limits binding and N \to \infty, the linear-quadratic MPC problem is exactly the infinite-horizon LQ problem, and its receding-horizon solution is the constant gain u = -Kx obtained from the (discrete-time) algebraic Riccati equation. In a slogan:

\text{unconstrained, infinite-horizon MPC} \;=\; \text{LQR}.

So MPC is not a rival to LQR but a generalisation of it: keep the same quadratic cost and linear model, add the ability to obey hard limits and to re-plan with a fresh measurement, and pay for it in online compute. When the constraints never bite and the horizon is long enough (or the terminal cost \varphi is chosen to match the LQR cost-to-go), MPC quietly reproduces the optimal feedback law we already trust.
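The slogan can be checked numerically on a scalar example. The sketch below assumes a hypothetical plant x_{k+1} = A x_k + B u_k with stage cost Q x^2 + R u^2 (the numbers and the function names first_move_gain and lqr_gain are illustrative, not taken from the text). For an unconstrained linear-quadratic problem the first move of the N-step plan is linear in the state, u_k = -K_N x_k, and K_N comes from N steps of the backward Riccati recursion; the LQR gain is the fixed point of that same recursion.

```python
# Hypothetical scalar LQ problem: x_{k+1} = A*x_k + B*u_k, stage cost Q*x^2 + R*u^2.
A, B = 1.1, 1.0
Q, R = 1.0, 0.1


def riccati_step(p):
    """One backward step of the scalar discrete-time Riccati recursion."""
    return Q + A * A * p - (A * B * p) ** 2 / (R + B * B * p)


def gain_from(p):
    """Feedback gain associated with the next-step cost-to-go weight p."""
    return A * B * p / (R + B * B * p)


def first_move_gain(horizon, p_terminal=0.0):
    """Gain K_N such that unconstrained N-step MPC applies u_k = -K_N * x_k."""
    p = p_terminal                 # weight of the terminal cost phi(x) = p * x^2
    k_gain = 0.0
    for _ in range(horizon):       # sweep backward from the end of the horizon
        k_gain = gain_from(p)
        p = riccati_step(p)
    return k_gain


def lqr_gain(iters=1000):
    """Infinite-horizon LQR gain: iterate the recursion to its fixed point."""
    p = Q
    for _ in range(iters):
        p = riccati_step(p)
    return gain_from(p)


K_INF = lqr_gain()
for n in (1, 2, 5, 10, 30):
    print(n, first_move_gain(n), abs(first_move_gain(n) - K_INF))
```

With a zero terminal cost the one-step gain is exactly zero, since nothing in that problem penalises the next state, and the gap to the LQR gain shrinks as the horizon grows. Passing the converged Riccati weight as p_terminal instead makes the terminal cost equal to the LQR cost-to-go, in which case even a short horizon returns the LQR gain.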

Watching the horizon recede

Below, a scalar plant is driven toward the target x = 0. Slide current time k forward and watch the loop run. The solid curve is what MPC has actually executed up to now; the shaded band is the prediction window of N steps it is planning over right now; the dashed curve inside it is the freshly re-optimised plan; and the highlighted segment is the single first move MPC will actually commit to before re-measuring and re-solving. As k advances, the window slides forward — the horizon receding — and only one executed step is added each time.

MPC was born in the oil refineries of the late 1970s, where a unit might have dozens of coupled inputs and outputs all pressing against safety and quality limits — exactly the constrained, multivariable setting where a single LQR gain is helpless. For decades it lived mostly in slow chemical processes, because the online QP was too heavy to solve in real time for anything fast. Then solvers and silicon caught up. Today MPC steers the trajectory of self-driving cars and autonomous-racing platforms at the limits of tyre grip, schedules power in grid-scale batteries, and regulates the plasma in fusion reactors — anywhere the future matters and the limits are hard.