Iterative Methods and Conditioning
LU decomposition
solves A\mathbf{x}=\mathbf{b} exactly (bar rounding) in about
\tfrac{2}{3}n^3 operations. For a 3\times3 system
that is nothing. But the matrices that scientific computing actually cares about are not
3\times3. Discretise the heat equation on a modest
1000\times1000 grid and you get a linear system with
n = 10^6 unknowns. Then \tfrac{2}{3}n^3 \approx 6\times10^{17}
floating-point operations — hours on a supercomputer for a single solve — and the
factorisation would need n^2 = 10^{12} numbers of storage, terabytes of it.
Direct elimination is simply hopeless at this scale.
Yet that giant matrix is almost entirely zeros. Each grid point talks only to its
handful of neighbours, so every row of A has maybe five non-zero entries out
of a million. A matrix that is mostly zeros is called sparse, and multiplying a vector
by it costs only O(n), not O(n^2). The tragedy of
elimination is that it fills in those zeros as it works, destroying the sparsity that made the
problem cheap in the first place. This page teaches the alternative: don't solve the system —
guess, then improve the guess, using nothing but cheap matrix–vector products. And it
teaches the number that decides whether the answer you get can be trusted at all: the
condition number.
The idea: split the matrix and iterate
Every iterative method for A\mathbf{x}=\mathbf{b} starts from one algebraic
trick. Split A into two pieces,
A = M - N,
where M is chosen to be something we can invert trivially — a
diagonal matrix, or a triangular one. Substituting into
A\mathbf{x}=\mathbf{b} gives
M\mathbf{x} = N\mathbf{x} + \mathbf{b}, which is exact but circular:
\mathbf{x} sits on both sides. Break the circle by putting an
old guess on the right and reading off a new guess on the left:
\mathbf{x}_{k+1} = M^{-1}\bigl(N\mathbf{x}_k + \mathbf{b}\bigr).
Start from any \mathbf{x}_0 (all zeros will do), and hope the sequence
\mathbf{x}_0, \mathbf{x}_1, \mathbf{x}_2, \dots marches toward the true
solution \mathbf{x}. Because M is easy to invert
and the only other cost is one product with the sparse N, each step is
O(n). If we reach acceptable accuracy in a modest number of steps, we have
beaten the n^3 monster handily. Everything now hinges on the choice of
M — and on whether the iteration converges at all.
Jacobi: everybody updates at once
The simplest choice takes M = D, the diagonal of
A, and leaves N = D - A as everything off the
diagonal (negated). Inverting a diagonal matrix is just dividing by each diagonal entry, so the update
unpacks, row by row, into a formula you could run by hand:
x_i^{(k+1)} = \frac{1}{a_{ii}}\left(b_i - \sum_{j \ne i} a_{ij}\, x_j^{(k)}\right).
Read it aloud: to update the i-th unknown, take its equation, move
everyone else to the right using their old values, and divide by the diagonal
coefficient. The defining feature of the Jacobi method is that every
x_j^{(k)} on the right is from the previous sweep — all
n components update simultaneously from the same old vector. That
makes it beautifully parallel (each processor owns one unknown and needs only last round's answers), but
it also means it ignores information it already has. Let's watch it converge on a small
diagonally-dominant system, printing the residual
\lVert \mathbf{b} - A\mathbf{x}_k \rVert after each sweep:
// Jacobi iteration on a diagonally-dominant 3x3 system.
// True solution is x = [1, 2, -1].
const A: number[][] = [
[10, -1, 2],
[-1, 11, -1],
[ 2, -1, 10],
];
const b: number[] = [6, 22, -10];
const n = 3;
// residual norm ||b - A x||, our measure of "how wrong"
function residual(x: number[]): number {
let s = 0;
for (let i = 0; i < n; i++) {
let r = b[i];
for (let j = 0; j < n; j++) r -= A[i][j] * x[j];
s += r * r;
}
return Math.sqrt(s);
}
let x: number[] = [0, 0, 0];
console.log(`sweep 0: x = [${x.join(", ")}] residual = ${residual(x).toFixed(6)}`);
for (let sweep = 1; sweep <= 8; sweep++) {
const xNew: number[] = new Array(n).fill(0);
for (let i = 0; i < n; i++) {
let s = b[i];
for (let j = 0; j < n; j++) {
if (j !== i) s -= A[i][j] * x[j]; // uses OLD x[j] throughout
}
xNew[i] = s / A[i][i];
}
x = xNew; // commit all updates at once
console.log(`sweep ${sweep}: x = [${x.map((v) => v.toFixed(4)).join(", ")}] residual = ${residual(x).toExponential(3)}`);
}
The residual shrinks by a roughly constant factor every sweep — a straight line on a log scale, the
signature of linear convergence. After eight sweeps we are close to
[1, 2, -1]. But we left performance on the table: when we computed
x_2 we still leaned on the stale
x_1, even though a fresh, better x_1 was already
sitting right there.
Gauss–Seidel: use the news the instant it arrives
One tiny change and the method gets noticeably better. When you compute
x_i^{(k+1)}, the components x_1, \dots, x_{i-1} for
this sweep have already been updated. Why use their old values? The Gauss–Seidel
method uses each freshly-computed value the moment it is available:
x_i^{(k+1)} = \frac{1}{a_{ii}}\left(b_i - \sum_{j < i} a_{ij}\, x_j^{(k+1)} - \sum_{j > i} a_{ij}\, x_j^{(k)}\right).
The first sum uses new values, the second still uses old ones. In splitting language
this is M = D + L — the whole lower triangle of
A (diagonal plus the entries below it) — with
N = -U the strict upper triangle. Inverting a lower-triangular
M is just forward substitution, still cheap. The payoff: for the systems
where both work, Gauss–Seidel typically converges in about half as many sweeps as
Jacobi, because it folds better information into every single update. The price is that it is inherently
sequential — x_i must wait for x_{i-1} —
so it parallelises less naturally. Run the same system and compare:
// Gauss-Seidel on the same system: update x[i] IN PLACE so later
// components in the SAME sweep see the fresh values immediately.
const A: number[][] = [
[10, -1, 2],
[-1, 11, -1],
[ 2, -1, 10],
];
const b: number[] = [6, 22, -10];
const n = 3;
function residual(x: number[]): number {
let s = 0;
for (let i = 0; i < n; i++) {
let r = b[i];
for (let j = 0; j < n; j++) r -= A[i][j] * x[j];
s += r * r;
}
return Math.sqrt(s);
}
const x: number[] = [0, 0, 0];
console.log(`sweep 0: residual = ${residual(x).toFixed(6)}`);
for (let sweep = 1; sweep <= 8; sweep++) {
for (let i = 0; i < n; i++) {
let s = b[i];
for (let j = 0; j < n; j++) {
if (j !== i) s -= A[i][j] * x[j]; // x[j] for j<i is ALREADY this sweep's value
}
x[i] = s / A[i][i]; // overwrite immediately
}
console.log(`sweep ${sweep}: x = [${x.map((v) => v.toFixed(4)).join(", ")}] residual = ${residual(x).toExponential(3)}`);
}
Same matrix, same starting guess, same cost per sweep — but the residual falls off a cliff, reaching in
four or five sweeps what Jacobi needed eight to do. The only difference is a single word in the code:
xNew became x. Using information the instant you have it is, quite literally,
twice as good here.
-
Splitting. Write A = M - N and iterate
\mathbf{x}_{k+1} = M^{-1}(N\mathbf{x}_k + \mathbf{b}). Jacobi takes
M = D (the diagonal); Gauss–Seidel takes
M = D + L (the lower triangle), reusing updated components at once.
-
Convergence condition. The iteration converges to the true solution for
every starting guess if and only if the spectral radius of the
iteration matrix satisfies
\rho\!\left(M^{-1}N\right) < 1,
where \rho is the largest absolute
eigenvalue.
The error is multiplied by (roughly) \rho each sweep, so smaller
\rho means faster convergence — and
\rho \ge 1 means it diverges.
-
A practical sufficient test. If A is
strictly diagonally dominant —
|a_{ii}| > \sum_{j \ne i} |a_{ij}| for every row — then both Jacobi and
Gauss–Seidel are guaranteed to converge, from any start. (It is sufficient, not
necessary: some non-dominant systems still converge.)
The chart: Jacobi versus Gauss–Seidel
Put the two residual histories side by side on a log scale and the story is unmistakable. Both are
straight lines — linear convergence — but Gauss–Seidel's line is about twice as steep.
Each sweep costs the same, yet Gauss–Seidel buys roughly double the accuracy per step, so it reaches
any target in about half the sweeps.
Yes — Gauss–Seidel is just the beginning. Successive over-relaxation (SOR) takes the
Gauss–Seidel step and deliberately overshoots it by a factor
\omega \in (1, 2), and with the right \omega it
can turn hundreds of sweeps into dozens. For the symmetric positive-definite systems that dominate
physics, the conjugate gradient method does better still, and modern
Krylov subspace and multigrid solvers can drive a million-unknown
PDE system to machine accuracy in a handful of sweeps. All of them share the DNA on this page:
cheap sparse matrix–vector products, a good guess, and a splitting that shrinks the error each step.
A direct solver like LU always returns an answer. An iterative method makes no such promise:
if \rho(M^{-1}N) \ge 1, the iteration runs away to infinity
instead of converging, and no amount of extra sweeps will save it. Diagonal dominance is the easy
insurance policy — it forces \rho < 1 — but a system that is not
diagonally dominant can have \rho > 1 and blow up. Watch it happen: take the
same equations but list the rows in the wrong order, so the diagonal is tiny, and Jacobi's
residual grows without bound.
// Same three equations, but the rows are ordered so the DIAGONAL is small
// (not diagonally dominant). Jacobi now DIVERGES.
const A: number[][] = [
[ 2, -1, 10], // huge off-diagonal, tiny diagonal: dominance is broken
[-1, 11, -1],
[10, -1, 2],
];
const b: number[] = [-10, 22, 6];
const n = 3;
function residual(x: number[]): number {
let s = 0;
for (let i = 0; i < n; i++) {
let r = b[i];
for (let j = 0; j < n; j++) r -= A[i][j] * x[j];
s += r * r;
}
return Math.sqrt(s);
}
let x: number[] = [0, 0, 0];
for (let sweep = 1; sweep <= 8; sweep++) {
const xNew: number[] = new Array(n).fill(0);
for (let i = 0; i < n; i++) {
let s = b[i];
for (let j = 0; j < n; j++) if (j !== i) s -= A[i][j] * x[j];
xNew[i] = s / A[i][i];
}
x = xNew;
console.log(`sweep ${sweep}: residual = ${residual(x).toExponential(3)}`); // grows!
}
console.log("Broken dominance -> the residual explodes instead of shrinking.");
The condition number: how much can the answer be trusted?
Suppose you solve A\mathbf{x}=\mathbf{b} and get a residual that is tiny —
surely the answer is good? Not necessarily. Whether a small residual (or a small error in
\mathbf{b}) means a small error in \mathbf{x}
depends on a single property of the matrix: its condition number. This is the
promised sequel to the conditioning teaser from
floating-point and
error — made precise for linear systems.
-
Definition. For an invertible matrix,
\kappa(A) = \lVert A \rVert \cdot \lVert A^{-1} \rVert \;\ge\; 1.
In the 2-norm this equals the ratio of the largest to smallest singular value,
\kappa_2(A) = \sigma_{\max}/\sigma_{\min}; for a symmetric matrix it is
the ratio of extreme eigenvalues,
\kappa_2(A) = |\lambda_{\max}/\lambda_{\min}|.
-
Error amplification. A relative perturbation of the data is magnified in the answer
by up to \kappa(A):
\frac{\lVert \Delta\mathbf{x} \rVert}{\lVert \mathbf{x} \rVert} \;\le\; \kappa(A)\,\frac{\lVert \Delta\mathbf{b} \rVert}{\lVert \mathbf{b} \rVert}.
A small relative residual only guarantees a small error when
\kappa(A) is modest.
-
Rule of thumb. You lose about \log_{10}\kappa(A)
significant decimal digits to conditioning alone. With
\kappa = 10^{8} and double precision's ~16 digits, expect only ~8 correct
digits — before your algorithm adds any error of its own.
A matrix with \kappa near 1 is
well-conditioned — errors pass through undramatised. A matrix with huge
\kappa (a nearly singular matrix, whose rows are almost linearly
dependent) is ill-conditioned: it sits on a knife-edge where a whisker of change in the
input swings the solution wildly. Here is that lever in action — a barely-singular
2\times2 system, nudged by one part in a thousand on the right-hand side:
// A nearly-singular 2x2: its two rows are ALMOST parallel.
// [ 1.000 1.000 ] [x1] [ 2.000 ]
// [ 1.000 1.001 ] [x2] = [ 2.000 ]
const a11 = 1.0, a12 = 1.0;
const a21 = 1.0, a22 = 1.001;
// Solve a 2x2 by Cramer's rule: x = A^{-1} b.
function solve2(b1: number, b2: number): [number, number] {
const det = a11 * a22 - a12 * a21; // tiny: 0.001
const x1 = ( b1 * a22 - a12 * b2) / det;
const x2 = (a11 * b2 - b1 * a21) / det;
return [x1, x2];
}
const [x1, x2] = solve2(2.000, 2.000);
const [y1, y2] = solve2(2.000, 2.002); // b changed by 0.002 in one entry
console.log(`b = [2.000, 2.000] -> x = [${x1.toFixed(3)}, ${x2.toFixed(3)}]`);
console.log(`b = [2.000, 2.002] -> x = [${y1.toFixed(3)}, ${y2.toFixed(3)}]`);
// A 0.1% wiggle in b produced a gigantic swing in x.
const relB = Math.abs(0.002 / 2.000);
const relX = Math.abs(y2 - x2) / Math.max(1e-12, Math.abs(x2 === 0 ? 1 : x2 + 1));
console.log(`relative change in b ~ ${relB.toExponential(2)}`);
console.log(`change in x2 = ${(y2 - x2).toFixed(3)} (from a 0.1% input nudge!)`);
console.log(`amplification factor ~ ${Math.abs((y2 - x2) / 0.002).toFixed(0)}x -- this is kappa at work`);
A 0.1\% tap on \mathbf{b} throws
x_2 by a full 2 — an amplification of about a
thousand, which is precisely \kappa(A) \approx 2000 for this matrix. The
residual of the wrong answer would look reassuringly small; the error is enormous. That gap
between residual and error is the condition number.
The most seductive mistake in numerical linear algebra is to check
\lVert \mathbf{b} - A\hat{\mathbf{x}} \rVert, see that it is tiny, and
declare victory. For an ill-conditioned matrix that is a trap. The residual measures whether
\hat{\mathbf{x}} nearly satisfies the equation; the error measures
whether it is nearly the true solution. The two are linked by
\frac{\lVert \hat{\mathbf{x}} - \mathbf{x} \rVert}{\lVert \mathbf{x} \rVert} \;\le\; \kappa(A)\,\frac{\lVert \mathbf{b} - A\hat{\mathbf{x}} \rVert}{\lVert \mathbf{b} \rVert}.
When \kappa(A) = 10^{6}, a residual at the level of machine epsilon
(\sim 10^{-16}) still permits an error near
10^{-10} — and if the residual is merely "small" at
10^{-6}, your answer may have no correct digits at all. Always ask
for \kappa, not just the residual, before you trust a solve.
Putting it together
Two threads, one loom. Iterative methods answer how we solve the enormous
sparse systems that direct elimination cannot touch: split
A = M - N, iterate cheaply, and lean on
\rho(M^{-1}N) < 1 (guaranteed by diagonal dominance) for convergence. The
condition number answers whether the answer — from any method, iterative or
direct — can be believed: \kappa(A) bounds how badly errors in the data are
magnified, and \log_{10}\kappa counts the digits you forfeit before you even
begin. A great numerical solve needs both: a fast-converging iteration and a well-conditioned
matrix. Miss either, and the machine will hand you a confident, precise, thoroughly wrong number.