Iterative Methods and Conditioning

LU decomposition solves A\mathbf{x}=\mathbf{b} exactly (bar rounding) in about \tfrac{2}{3}n^3 operations. For a 3\times3 system that is nothing. But the matrices that scientific computing actually cares about are not 3\times3. Discretise the heat equation on a modest 1000\times1000 grid and you get a linear system with n = 10^6 unknowns. Then \tfrac{2}{3}n^3 \approx 6\times10^{17} floating-point operations — hours on a supercomputer for a single solve — and the factorisation would need n^2 = 10^{12} numbers of storage, terabytes of it. Direct elimination is simply hopeless at this scale.

Yet that giant matrix is almost entirely zeros. Each grid point talks only to its handful of neighbours, so every row of A has maybe five non-zero entries out of a million. A matrix that is mostly zeros is called sparse, and multiplying a vector by it costs only O(n), not O(n^2). The tragedy of elimination is that it fills in those zeros as it works, destroying the sparsity that made the problem cheap in the first place. This page teaches the alternative: don't solve the system — guess, then improve the guess, using nothing but cheap matrix–vector products. And it teaches the number that decides whether the answer you get can be trusted at all: the condition number.

The idea: split the matrix and iterate

Every iterative method for A\mathbf{x}=\mathbf{b} starts from one algebraic trick. Split A into two pieces,

A = M - N,

where M is chosen to be something we can invert trivially — a diagonal matrix, or a triangular one. Substituting into A\mathbf{x}=\mathbf{b} gives M\mathbf{x} = N\mathbf{x} + \mathbf{b}, which is exact but circular: \mathbf{x} sits on both sides. Break the circle by putting an old guess on the right and reading off a new guess on the left:

\mathbf{x}_{k+1} = M^{-1}\bigl(N\mathbf{x}_k + \mathbf{b}\bigr).

Start from any \mathbf{x}_0 (all zeros will do), and hope the sequence \mathbf{x}_0, \mathbf{x}_1, \mathbf{x}_2, \dots marches toward the true solution \mathbf{x}. Because M is easy to invert and the only other cost is one product with the sparse N, each step is O(n). If we reach acceptable accuracy in a modest number of steps, we have beaten the n^3 monster handily. Everything now hinges on the choice of M — and on whether the iteration converges at all.

Jacobi: everybody updates at once

The simplest choice takes M = D, the diagonal of A, and leaves N = D - A as everything off the diagonal (negated). Inverting a diagonal matrix is just dividing by each diagonal entry, so the update unpacks, row by row, into a formula you could run by hand:

x_i^{(k+1)} = \frac{1}{a_{ii}}\left(b_i - \sum_{j \ne i} a_{ij}\, x_j^{(k)}\right).

Read it aloud: to update the i-th unknown, take its equation, move everyone else to the right using their old values, and divide by the diagonal coefficient. The defining feature of the Jacobi method is that every x_j^{(k)} on the right is from the previous sweep — all n components update simultaneously from the same old vector. That makes it beautifully parallel (each processor owns one unknown and needs only last round's answers), but it also means it ignores information it already has. Let's watch it converge on a small diagonally-dominant system, printing the residual \lVert \mathbf{b} - A\mathbf{x}_k \rVert after each sweep:

// Jacobi iteration on a diagonally-dominant 3x3 system. // True solution is x = [1, 2, -1]. const A: number[][] = [ [10, -1, 2], [-1, 11, -1], [ 2, -1, 10], ]; const b: number[] = [6, 22, -10]; const n = 3; // residual norm ||b - A x||, our measure of "how wrong" function residual(x: number[]): number { let s = 0; for (let i = 0; i < n; i++) { let r = b[i]; for (let j = 0; j < n; j++) r -= A[i][j] * x[j]; s += r * r; } return Math.sqrt(s); } let x: number[] = [0, 0, 0]; console.log(`sweep 0: x = [${x.join(", ")}] residual = ${residual(x).toFixed(6)}`); for (let sweep = 1; sweep <= 8; sweep++) { const xNew: number[] = new Array(n).fill(0); for (let i = 0; i < n; i++) { let s = b[i]; for (let j = 0; j < n; j++) { if (j !== i) s -= A[i][j] * x[j]; // uses OLD x[j] throughout } xNew[i] = s / A[i][i]; } x = xNew; // commit all updates at once console.log(`sweep ${sweep}: x = [${x.map((v) => v.toFixed(4)).join(", ")}] residual = ${residual(x).toExponential(3)}`); }

The residual shrinks by a roughly constant factor every sweep — a straight line on a log scale, the signature of linear convergence. After eight sweeps we are close to [1, 2, -1]. But we left performance on the table: when we computed x_2 we still leaned on the stale x_1, even though a fresh, better x_1 was already sitting right there.

Gauss–Seidel: use the news the instant it arrives

One tiny change and the method gets noticeably better. When you compute x_i^{(k+1)}, the components x_1, \dots, x_{i-1} for this sweep have already been updated. Why use their old values? The Gauss–Seidel method uses each freshly-computed value the moment it is available:

x_i^{(k+1)} = \frac{1}{a_{ii}}\left(b_i - \sum_{j < i} a_{ij}\, x_j^{(k+1)} - \sum_{j > i} a_{ij}\, x_j^{(k)}\right).

The first sum uses new values, the second still uses old ones. In splitting language this is M = D + L — the whole lower triangle of A (diagonal plus the entries below it) — with N = -U the strict upper triangle. Inverting a lower-triangular M is just forward substitution, still cheap. The payoff: for the systems where both work, Gauss–Seidel typically converges in about half as many sweeps as Jacobi, because it folds better information into every single update. The price is that it is inherently sequential — x_i must wait for x_{i-1} — so it parallelises less naturally. Run the same system and compare:

// Gauss-Seidel on the same system: update x[i] IN PLACE so later // components in the SAME sweep see the fresh values immediately. const A: number[][] = [ [10, -1, 2], [-1, 11, -1], [ 2, -1, 10], ]; const b: number[] = [6, 22, -10]; const n = 3; function residual(x: number[]): number { let s = 0; for (let i = 0; i < n; i++) { let r = b[i]; for (let j = 0; j < n; j++) r -= A[i][j] * x[j]; s += r * r; } return Math.sqrt(s); } const x: number[] = [0, 0, 0]; console.log(`sweep 0: residual = ${residual(x).toFixed(6)}`); for (let sweep = 1; sweep <= 8; sweep++) { for (let i = 0; i < n; i++) { let s = b[i]; for (let j = 0; j < n; j++) { if (j !== i) s -= A[i][j] * x[j]; // x[j] for j<i is ALREADY this sweep's value } x[i] = s / A[i][i]; // overwrite immediately } console.log(`sweep ${sweep}: x = [${x.map((v) => v.toFixed(4)).join(", ")}] residual = ${residual(x).toExponential(3)}`); }

Same matrix, same starting guess, same cost per sweep — but the residual falls off a cliff, reaching in four or five sweeps what Jacobi needed eight to do. The only difference is a single word in the code: xNew became x. Using information the instant you have it is, quite literally, twice as good here.

Splitting. Write A = M - N and iterate \mathbf{x}_{k+1} = M^{-1}(N\mathbf{x}_k + \mathbf{b}). Jacobi takes M = D (the diagonal); Gauss–Seidel takes M = D + L (the lower triangle), reusing updated components at once.
Convergence condition. The iteration converges to the true solution for every starting guess if and only if the spectral radius of the iteration matrix satisfies \rho\!\left(M^{-1}N\right) < 1, where \rho is the largest absolute eigenvalue. The error is multiplied by (roughly) \rho each sweep, so smaller \rho means faster convergence — and \rho \ge 1 means it diverges.
A practical sufficient test. If A is strictly diagonally dominant — |a_{ii}| > \sum_{j \ne i} |a_{ij}| for every row — then both Jacobi and Gauss–Seidel are guaranteed to converge, from any start. (It is sufficient, not necessary: some non-dominant systems still converge.)

The chart: Jacobi versus Gauss–Seidel

Put the two residual histories side by side on a log scale and the story is unmistakable. Both are straight lines — linear convergence — but Gauss–Seidel's line is about twice as steep. Each sweep costs the same, yet Gauss–Seidel buys roughly double the accuracy per step, so it reaches any target in about half the sweeps.

Yes — Gauss–Seidel is just the beginning. Successive over-relaxation (SOR) takes the Gauss–Seidel step and deliberately overshoots it by a factor \omega \in (1, 2), and with the right \omega it can turn hundreds of sweeps into dozens. For the symmetric positive-definite systems that dominate physics, the conjugate gradient method does better still, and modern Krylov subspace and multigrid solvers can drive a million-unknown PDE system to machine accuracy in a handful of sweeps. All of them share the DNA on this page: cheap sparse matrix–vector products, a good guess, and a splitting that shrinks the error each step.

A direct solver like LU always returns an answer. An iterative method makes no such promise: if \rho(M^{-1}N) \ge 1, the iteration runs away to infinity instead of converging, and no amount of extra sweeps will save it. Diagonal dominance is the easy insurance policy — it forces \rho < 1 — but a system that is not diagonally dominant can have \rho > 1 and blow up. Watch it happen: take the same equations but list the rows in the wrong order, so the diagonal is tiny, and Jacobi's residual grows without bound.

// Same three equations, but the rows are ordered so the DIAGONAL is small // (not diagonally dominant). Jacobi now DIVERGES. const A: number[][] = [ [ 2, -1, 10], // huge off-diagonal, tiny diagonal: dominance is broken [-1, 11, -1], [10, -1, 2], ]; const b: number[] = [-10, 22, 6]; const n = 3; function residual(x: number[]): number { let s = 0; for (let i = 0; i < n; i++) { let r = b[i]; for (let j = 0; j < n; j++) r -= A[i][j] * x[j]; s += r * r; } return Math.sqrt(s); } let x: number[] = [0, 0, 0]; for (let sweep = 1; sweep <= 8; sweep++) { const xNew: number[] = new Array(n).fill(0); for (let i = 0; i < n; i++) { let s = b[i]; for (let j = 0; j < n; j++) if (j !== i) s -= A[i][j] * x[j]; xNew[i] = s / A[i][i]; } x = xNew; console.log(`sweep ${sweep}: residual = ${residual(x).toExponential(3)}`); // grows! } console.log("Broken dominance -> the residual explodes instead of shrinking.");

The condition number: how much can the answer be trusted?

Suppose you solve A\mathbf{x}=\mathbf{b} and get a residual that is tiny — surely the answer is good? Not necessarily. Whether a small residual (or a small error in \mathbf{b}) means a small error in \mathbf{x} depends on a single property of the matrix: its condition number. This is the promised sequel to the conditioning teaser from floating-point and error — made precise for linear systems.

Definition. For an invertible matrix, \kappa(A) = \lVert A \rVert \cdot \lVert A^{-1} \rVert \;\ge\; 1. In the 2-norm this equals the ratio of the largest to smallest singular value, \kappa_2(A) = \sigma_{\max}/\sigma_{\min}; for a symmetric matrix it is the ratio of extreme eigenvalues, \kappa_2(A) = |\lambda_{\max}/\lambda_{\min}|.
Error amplification. A relative perturbation of the data is magnified in the answer by up to \kappa(A): \frac{\lVert \Delta\mathbf{x} \rVert}{\lVert \mathbf{x} \rVert} \;\le\; \kappa(A)\,\frac{\lVert \Delta\mathbf{b} \rVert}{\lVert \mathbf{b} \rVert}. A small relative residual only guarantees a small error when \kappa(A) is modest.
Rule of thumb. You lose about \log_{10}\kappa(A) significant decimal digits to conditioning alone. With \kappa = 10^{8} and double precision's ~16 digits, expect only ~8 correct digits — before your algorithm adds any error of its own.

A matrix with \kappa near 1 is well-conditioned — errors pass through undramatised. A matrix with huge \kappa (a nearly singular matrix, whose rows are almost linearly dependent) is ill-conditioned: it sits on a knife-edge where a whisker of change in the input swings the solution wildly. Here is that lever in action — a barely-singular 2\times2 system, nudged by one part in a thousand on the right-hand side:

// A nearly-singular 2x2: its two rows are ALMOST parallel. // [ 1.000 1.000 ] [x1] [ 2.000 ] // [ 1.000 1.001 ] [x2] = [ 2.000 ] const a11 = 1.0, a12 = 1.0; const a21 = 1.0, a22 = 1.001; // Solve a 2x2 by Cramer's rule: x = A^{-1} b. function solve2(b1: number, b2: number): [number, number] { const det = a11 * a22 - a12 * a21; // tiny: 0.001 const x1 = ( b1 * a22 - a12 * b2) / det; const x2 = (a11 * b2 - b1 * a21) / det; return [x1, x2]; } const [x1, x2] = solve2(2.000, 2.000); const [y1, y2] = solve2(2.000, 2.002); // b changed by 0.002 in one entry console.log(`b = [2.000, 2.000] -> x = [${x1.toFixed(3)}, ${x2.toFixed(3)}]`); console.log(`b = [2.000, 2.002] -> x = [${y1.toFixed(3)}, ${y2.toFixed(3)}]`); // A 0.1% wiggle in b produced a gigantic swing in x. const relB = Math.abs(0.002 / 2.000); const relX = Math.abs(y2 - x2) / Math.max(1e-12, Math.abs(x2 === 0 ? 1 : x2 + 1)); console.log(`relative change in b ~ ${relB.toExponential(2)}`); console.log(`change in x2 = ${(y2 - x2).toFixed(3)} (from a 0.1% input nudge!)`); console.log(`amplification factor ~ ${Math.abs((y2 - x2) / 0.002).toFixed(0)}x -- this is kappa at work`);

A 0.1\% tap on \mathbf{b} throws x_2 by a full 2 — an amplification of about a thousand, which is precisely \kappa(A) \approx 2000 for this matrix. The residual of the wrong answer would look reassuringly small; the error is enormous. That gap between residual and error is the condition number.

The most seductive mistake in numerical linear algebra is to check \lVert \mathbf{b} - A\hat{\mathbf{x}} \rVert, see that it is tiny, and declare victory. For an ill-conditioned matrix that is a trap. The residual measures whether \hat{\mathbf{x}} nearly satisfies the equation; the error measures whether it is nearly the true solution. The two are linked by

\frac{\lVert \hat{\mathbf{x}} - \mathbf{x} \rVert}{\lVert \mathbf{x} \rVert} \;\le\; \kappa(A)\,\frac{\lVert \mathbf{b} - A\hat{\mathbf{x}} \rVert}{\lVert \mathbf{b} \rVert}.

When \kappa(A) = 10^{6}, a residual at the level of machine epsilon (\sim 10^{-16}) still permits an error near 10^{-10} — and if the residual is merely "small" at 10^{-6}, your answer may have no correct digits at all. Always ask for \kappa, not just the residual, before you trust a solve.

Putting it together

Two threads, one loom. Iterative methods answer how we solve the enormous sparse systems that direct elimination cannot touch: split A = M - N, iterate cheaply, and lean on \rho(M^{-1}N) < 1 (guaranteed by diagonal dominance) for convergence. The condition number answers whether the answer — from any method, iterative or direct — can be believed: \kappa(A) bounds how badly errors in the data are magnified, and \log_{10}\kappa counts the digits you forfeit before you even begin. A great numerical solve needs both: a fast-converging iteration and a well-conditioned matrix. Miss either, and the machine will hand you a confident, precise, thoroughly wrong number.