What Is Optimization?
Every time you pick the shortest queue, pack a suitcase to fit, or leave home at the last safe
moment, you are optimizing — searching a space of choices for the one that scores
best. Airlines route planes to burn the least fuel; a phone battery charges as fast as heat allows;
a portfolio chases the most return for a fixed risk. Optimization is the branch of
mathematics that makes "the best choice" precise and then finds it.
Three ingredients turn a vague wish into a solvable problem:
-
Decision variables \mathbf{x} = (x_1, \dots, x_n) —
the knobs you get to turn.
-
an objective function f(\mathbf{x}) — the single
number you want as small (or as large) as possible.
-
constraints — the rules the choice must obey, carving out the
feasible region of allowed \mathbf{x}.
Written out, the whole subject fits in one template — the standard form:
\min_{\mathbf{x}} \; f(\mathbf{x}) \quad\text{subject to}\quad
g_i(\mathbf{x}) \le 0,\quad h_j(\mathbf{x}) = 0.
The g_i are inequality constraints (budgets, capacities,
"at least this much"), the h_j are equality constraints
(something must balance exactly). The set of \mathbf{x} satisfying every
constraint is the feasible region \mathcal{F}; a point inside it is
feasible, a point outside is infeasible. Optimization asks for the
feasible point with the best objective value.
Minimize or maximize? Same problem
By convention we write everything as a minimization. That costs nothing, because maximizing
f is exactly minimizing -f:
\max_{\mathbf{x}\in\mathcal{F}} f(\mathbf{x}) = -\min_{\mathbf{x}\in\mathcal{F}} \bigl(-f(\mathbf{x})\bigr).
Flip the sign of the objective, solve the minimization, and flip the sign of the answer back. The
location \mathbf{x}^\star of the optimum is identical either way —
only the reported value changes sign. So a solver only ever needs to know how to go
downhill.
Likewise, a "\ge" constraint becomes a "\le"
one by negating it: g(\mathbf{x}) \ge 0 is the same rule as
-g(\mathbf{x}) \le 0. Every optimization problem can therefore be squeezed
into the one standard form above — which is why a single theory covers such a sprawling zoo of
applications.
Local versus global — the central difficulty
A point \mathbf{x}^\star is a local minimum if it beats
every feasible neighbour close by; it is a global minimum if it beats every feasible
point everywhere. A landscape can have many local minima but only the deepest is global.
Standing in a valley, you cannot tell by looking at your feet whether a deeper valley lies over the
next ridge — and that, in one sentence, is why optimization is hard.
The picture below is an objective f(x) = \sin x + 0.15\,x restricted to a
feasible interval. It dips into several local minima, each a little higher than the last because of
the upward drift. Reveal the steps: the feasible interval, then the local minima, then the single
global minimum among them. A greedy searcher that stopped at the first dip it found
would still be wrong about the whole picture.
A worked set-up: the fence problem
A farmer has 100 m of fencing and wants to enclose the largest possible
rectangular paddock against a straight river (so one side needs no fence). Turn the wish into
standard form.
Variables. Let the two sides perpendicular to the river each be
x and the side parallel be y.
Objective. Maximize the area A = xy — equivalently,
minimize -xy.
Constraints. The fence used is 2x + y = 100 (an equality),
and lengths are non-negative: x \ge 0,
y \ge 0 (inequalities).
Solve. The equality lets us eliminate y = 100 - 2x, so
A(x) = x(100 - 2x) = 100x - 2x^2. Then
A'(x) = 100 - 4x = 0 \Rightarrow x = 25, giving
y = 50 and area A = 1250 m². The feasible region
for x is the interval [0, 50]; the optimum sits
in its interior, and the boundary points x = 0 or
x = 50 give zero area. This is the whole optimization workflow in
miniature: name the variables, write the objective, list the constraints, then hunt the feasible
region for the best point.
In linear programming, integer programming, dynamic programming, the word
"programming" predates the electronic computer. It comes from military logistics of the 1940s, where
a program meant a schedule or plan of activities — how many tons of supplies
to move where and when. George Dantzig, who invented the simplex method in 1947, was literally
optimizing programs in that older sense: timetables and deployment plans for the U.S. Air Force. The
name stuck long after the meaning drifted, so "mathematical programming" is just the historical name
for optimization.
-
A local minimum is not automatically global. Finding a point where the objective
can't be improved locally tells you nothing about far-off valleys. Only special structure
— most importantly convexity
— promotes "local" to "global".
-
The optimum can hide on the boundary. Setting \nabla f = \mathbf{0}
only finds interior stationary points. On a constrained region the best feasible point may
sit on an edge or a corner where the gradient does not vanish at all — you must check the boundary
too.
-
Mind the sign when you flip max to min. Minimizing -f
finds the same location as maximizing f, but its reported value
is negated. Forget to flip it back and your "maximum profit" comes out negative.