Root Finding: Bisection and Newton
Solve x^2 - 2 = 0 and you may write down
x = \sqrt{2} with a flourish — but that symbol is a promise, not a
number. Ask what \sqrt2 actually is, to a dozen decimal places, and
the flourish evaporates: 1.41421356\ldots comes from somewhere, and that
somewhere is an algorithm. Now try x - \cos x = 0, or
x\,e^{x} = 1, or a fifth-degree polynomial. Here there is not even a symbol
to hide behind: Abel and Galois proved in the 1820s that no formula in radicals can
solve the general quintic, and transcendental mixes of polynomials with sines and exponentials are
hopeless from the start. The equations of real mathematics are, overwhelmingly, equations you
cannot solve in closed form.
So we change the question. Instead of "what is x?", we ask "for which
x does some function f hit zero?", and we build a
machine that hunts the crossing to as many digits as we please. This is root finding —
solving f(x) = 0 by iteration — and it is the beating heart of numerical
analysis. This page teaches the two methods every mathematician should know cold: bisection,
the tortoise that is slow but never fails, and Newton–Raphson, the hare that is
blisteringly fast but occasionally hurls itself off a cliff. You will meet both, watch them race to
\sqrt2, and learn exactly when to trust which.
What "solve f(x)=0" really means
A root (or zero) of a function f is a value
x^\* with f(x^\*) = 0 — geometrically, a point
where the graph of f pierces the horizontal axis. Almost every "solve for
x" problem can be pushed into this shape by moving everything to one side:
g(x) = h(x) becomes f(x) = g(x) - h(x) = 0. Want a
fixed point x = \cos x? Root-find on f(x) = x - \cos x.
Want to invert a function, y = \phi(x) for a known
y? Root-find on f(x) = \phi(x) - y. The single
template "drive a function to zero" swallows an astonishing fraction of computational mathematics.
Here is a concrete target for the rest of the page: the function
f(x) = x^2 - 2, whose positive root is exactly
\sqrt 2. Watch where the curve slices through the axis — that single
crossing between x=1 and x=2 is the number we are
hunting.
Bisection: the Intermediate Value Theorem, made mechanical
The safest root-finder rests on a theorem you have believed since before you could name it. If a
continuous function is negative somewhere and positive somewhere else, then between
those two places it must cross zero — it cannot leap the axis without touching it. That is the
Intermediate Value Theorem, and it hands us a way to trap a root. Find two
points a and b whose function values have
opposite signs — a bracket:
f(a)\cdot f(b) < 0.
The root is somewhere inside [a,b]. Now squeeze. Evaluate the
midpoint m = (a+b)/2 and look at the sign of
f(m). The sign change must survive in exactly one of the two halves
[a,m] or [m,b] — so keep that half and discard
the other. Repeat. Each step the interval containing the root gets half as wide, and
the root can never slip out.
-
Requirement. A continuous f and a bracket
[a,b] with f(a)\,f(b) < 0 (a genuine sign
change).
-
Step. Set m = (a+b)/2; replace the endpoint whose sign
matches f(m), so the surviving half still straddles zero.
-
Linear convergence. After n steps the bracket has width
b_n - a_n = \frac{b_0 - a_0}{2^{\,n}},
so the error falls like 2^{-n} — you win one binary digit
(one bit) per step, about one decimal digit every
\log_2 10 \approx 3.32 steps. Unspectacular, but
guaranteed.
Let us find \sqrt 2 by bisecting f(x) = x^2 - 2
on [0,2], where f(0) = -2 < 0 and
f(2) = 2 > 0. Press Run and watch the interval collapse,
the error column shedding one bit each line:
function f(x: number): number {
return x * x - 2; // the positive root is x = √2
}
let a = 0, b = 2; // f(a) < 0 and f(b) > 0: a root is trapped inside
console.log("step | a b | width | error");
for (let n = 1; n <= 12; n++) {
const m = (a + b) / 2; // midpoint of the current bracket
if (f(a) * f(m) < 0) {
b = m; // sign change lies in [a, m]: keep the left half
} else {
a = m; // otherwise the root is in [m, b]: keep the right half
}
const mid = (a + b) / 2;
const err = Math.abs(mid - Math.SQRT2);
console.log(`${String(n).padStart(3)} | ${a.toFixed(6)} ${b.toFixed(6)} | ${(b - a).toFixed(6)} | ${err.toExponential(2)}`);
}
console.log(`\nroot ≈ ${((a + b) / 2).toFixed(8)} (√2 = ${Math.SQRT2.toFixed(8)})`);
The width marches down 2,\,1,\,0.5,\,0.25,\ldots — a perfect geometric
halving — and the error trails it, gaining a single bit each row and never once faltering. That
reliability has a cost: to squeeze the starting width of 2 down to a
tolerance of 10^{-12} you need
n \ge \log_2(2/10^{-12}) \approx 41 steps. Forty-one evaluations for twelve
digits is dull work. Surely we can use more of what the function is telling us than just its sign?
Newton–Raphson: ride the tangent to the axis
Bisection throws away almost everything about f — it looks only at the
sign, never the steepness. But the slope of the curve points, quite literally, toward the
root. Newton's method uses it. Stand at your current guess
x_n, draw the tangent line to the graph there, and follow
that line down to where it meets the axis. Because a smooth curve hugs its tangent nearby, that
landing point is usually a far better guess than where you started.
Turning the picture into algebra needs one ingredient: the
derivative
f'(x_n), which is the slope of that tangent. The tangent line at
x_n passes through \bigl(x_n, f(x_n)\bigr) with
slope f'(x_n), so its equation is
y = f(x_n) + f'(x_n)\,(x - x_n).
Set y = 0 and solve for the crossing x = x_{n+1}.
Rearranging 0 = f(x_n) + f'(x_n)(x_{n+1} - x_n) gives the whole method in a
single line:
-
The update.
x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.
-
Quadratic convergence. Near a simple root the error obeys
|e_{n+1}| \approx C\,|e_n|^2, so the number of correct digits
roughly doubles every step — 2 \to 4 \to 8 \to 16 — and
a handful of iterations reaches machine precision.
-
The price. You must supply the derivative f', and you
must start close enough. Where f'(x_n) \approx 0 the step
f/f' blows up, and from a poor start the iteration can overshoot,
diverge, or cycle.
Take the same problem, f(x) = x^2 - 2 with
f'(x) = 2x. The update becomes
x_{n+1} = x_n - \frac{x_n^2 - 2}{2x_n} = \frac{1}{2}\!\left(x_n + \frac{2}{x_n}\right),
which you may recognise as the ancient Babylonian square-root rule: "average your guess with
2/x_n." Start from x_0 = 2 and watch the error
nose-dive — each row roughly squares the last:
function f(x: number): number { return x * x - 2; }
function df(x: number): number { return 2 * x; } // the derivative f'(x) = 2x
let x = 2; // starting guess x0
console.log("step | x | error");
for (let n = 1; n <= 6; n++) {
x = x - f(x) / df(x); // slide down the tangent to the axis
const err = Math.abs(x - Math.SQRT2);
console.log(`${String(n).padStart(3)} | ${x.toFixed(15)} | ${err.toExponential(2)}`);
}
console.log(`\n√2 = ${Math.SQRT2.toFixed(15)}`);
Read the error column as a story of doubling precision: it falls roughly
10^{-1} \to 10^{-2} \to 10^{-5} \to 10^{-10} \to essentially the machine's
floor. Where bisection needed about 41 steps to reach
10^{-12}, Newton is already past it by step
5. That gulf — linear versus quadratic
convergence — is the whole reason Newton dominates whenever a derivative is available and the start is
decent.
A one-line Taylor argument. Let e_n = x_n - x^\* be the error. Expand
f about the root x^\* (where
f(x^\*)=0):
f(x_n) = f'(x^\*)e_n + \tfrac12 f''(x^\*)e_n^2 + \cdots. Feed this and
f'(x_n) \approx f'(x^\*) into the update and the first-order terms cancel
exactly — that is the point of aiming at the tangent's root — leaving
e_{n+1} \approx \frac{f''(x^\*)}{2\,f'(x^\*)}\;e_n^{\,2}.
The new error is proportional to the square of the old one, which is precisely "the correct
digits double." Notice the catch hiding in the denominator: if
f'(x^\*) = 0 — a repeated root, where the curve only
touches the axis — the cancellation is spoiled and Newton drops back to merely linear speed.
A second worked example: a cubic with no nice root
Square roots are almost too friendly. Try f(x) = x^3 - x - 2 = 0, a cubic
whose one real root is an ugly irrational near 1.52 (the closed-form
Cardano expression exists but is a horror of nested radicals — nobody sane evaluates it by hand). Its
derivative is f'(x) = 3x^2 - 1, so Newton reads
x_{n+1} = x_n - \frac{x_n^3 - x_n - 2}{3x_n^2 - 1}.
We add a proper stopping rule this time — quit when the residual
|f(x)| is below a tolerance, and keep a
maximum-iteration guard so a misbehaving run can never loop forever:
function f(x: number): number { return x * x * x - x - 2; }
function df(x: number): number { return 3 * x * x - 1; }
const tol = 1e-14; // stop when |f(x)| is this small
const maxIter = 50; // guard against a run that never settles
let x = 1.5; // a reasonable starting guess
let n = 0;
while (Math.abs(f(x)) > tol && n < maxIter) {
n++;
x = x - f(x) / df(x);
console.log(`step ${String(n).padStart(2)}: x = ${x.toFixed(15)} |f(x)| = ${Math.abs(f(x)).toExponential(2)}`);
}
console.log(`\nconverged to x ≈ ${x.toFixed(12)} in ${n} steps`);
console.log(`check: x^3 − x − 2 = ${(x * x * x - x - 2).toExponential(2)}`);
Four or five steps, twelve correct digits, and a residual indistinguishable from zero — for a number
no formula will politely hand you. This little loop, with its tolerance test and iteration cap, is the
skeleton of essentially every production root-finder you will ever use.
Newton's Achilles heel is that it demands f'. When the derivative is
expensive, unavailable, or you are simply too lazy, replace the true tangent with the
secant through your two most recent points — approximating the slope by a finite
difference:
x_{n+1} = x_n - f(x_n)\,\frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})}.
This secant method needs no derivative and converges at rate
\varphi \approx 1.618 (yes, the golden ratio) — slower than Newton's
2 but far faster than bisection's 1, and each
step costs only one new function evaluation. It is the pragmatic middle child of the family.
Knowing when to stop
An iteration produces an endless stream of guesses; you must decide when one is good enough.
There is no single right answer, so a robust solver combines several tests:
-
Small residual: stop when |f(x_n)| < \tau_f. The
catch — a nearly-flat function can have a small residual far from its root, so this alone can lie.
-
Small step / interval: stop when successive guesses barely move,
|x_{n+1} - x_n| < \tau_x, or (for bisection) when the bracket width
b - a drops below \tau_x. This directly bounds
how far you can be from the root.
-
Iteration guard: always cap the loop at some N_{\max}.
It costs nothing when things go well and is your only defence when they do not — a divergent or
cycling Newton run must never become an infinite loop.
Push the tolerances too tight and you smash into the wall from the previous lesson:
|f(x)| can never get meaningfully below the floating-point noise floor near
the root, so asking for a tolerance smaller than a few times machine epsilon just spins the loop
forever. Sensible defaults are a tolerance a modest multiple of
\varepsilon_{\text{mach}} and a generous but finite iteration cap.
Quadratic speed comes with sharp teeth. Newton offers no guarantee of convergence, and it
fails in more ways than beginners expect:
-
Flat-tangent overshoot. If a guess lands where
f'(x_n) \approx 0, the step f/f' divides by
almost nothing and flings the next guess to the far side of the number line.
-
Divergence from a bad start. Too far from the root, the tangent can point the
wrong way entirely — the classic f(x) = \sqrt[3]{x} from any
x_0 \ne 0 marches steadily away from its root, the error
growing by a factor of two each step.
-
Cycling. Newton can fall into a closed loop and orbit forever. For
f(x) = x^3 - 2x + 2 started at x_0 = 0 the
iterates bounce 0 \to 1 \to 0 \to 1 \to \cdots, never nearing the true
root. Run it and watch:
function f(x: number): number { return x * x * x - 2 * x + 2; }
function df(x: number): number { return 3 * x * x - 2; }
let x = 0; // the notorious cycling start
for (let n = 1; n <= 8; n++) {
x = x - f(x) / df(x);
console.log(`step ${n}: x = ${x.toFixed(4)}`); // 1, 0, 1, 0, … forever
}
console.log("trapped in a 0 → 1 → 0 → 1 cycle; it never converges");
And bisection has its own blind spot, the mirror image of these: it cannot find a
double root that only touches the axis without crossing it — like
f(x) = x^2 at the origin — because there is no sign change to bracket. The
professional's answer is a hybrid (Brent's method is the famous one): take Newton's
fast steps while they behave, and fall back to a safe bisection step the instant they misbehave —
the speed of the hare with the safety net of the tortoise.