Tikhonov Regularization

Naïve inversion fails because it trusts the data completely, even where the operator is blind. Tikhonov regularization strikes a bargain: fit the data and keep the model small. It minimises a sum of two terms,

\hat m_\alpha = \arg\min_m \Big( \|Gm - d\|^2 + \alpha^2\,\|m\|^2 \Big).

The first term is the data misfit; the second penalises the size of the model, weighted by the regularization parameter \alpha. Setting the gradient to zero gives a modified normal equation that has a unique solution for every \alpha > 0:

(G^{\mathsf T}G + \alpha^2 I)\,\hat m_\alpha = G^{\mathsf T}d.
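For readers who want the gradient step spelled out, here is a short worked version. It assumes real-valued G, m and d and the Euclidean norm, as in the objective above.

```latex
\nabla_m \Big( \|Gm - d\|^2 + \alpha^2 \|m\|^2 \Big)
  = 2\,G^{\mathsf T}(Gm - d) + 2\,\alpha^2 m = 0
\quad\Longrightarrow\quad
(G^{\mathsf T}G + \alpha^2 I)\,m = G^{\mathsf T}d .
```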

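The eigenvalues of G^{\mathsf T}G are non-negative, so adding \alpha^2 I shifts each of them up by \alpha^2: every eigenvalue is then at least \alpha^2 > 0, and a singular matrix becomes invertible. This is exactly ridge regression from statistics and machine learning, here cast as the cure for ill-posedness.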
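A minimal NumPy sketch of the regularized normal equation, under stated assumptions: the 3-by-2 operator G, the data d and the value of alpha below are made up for illustration, and the helper name tikhonov_normal_eq is hypothetical. G has two identical columns, so G^T G is singular until \alpha^2 I is added.

```python
import numpy as np

def tikhonov_normal_eq(G, d, alpha):
    """Solve (G^T G + alpha^2 I) m = G^T d for m (hypothetical helper)."""
    n = G.shape[1]
    A = G.T @ G + alpha**2 * np.eye(n)
    return np.linalg.solve(A, G.T @ d)

# Illustrative rank-deficient operator: the two columns are identical,
# so the data only constrain the sum m_1 + m_2.
G = np.array([[1.0, 1.0],
              [1.0, 1.0],
              [1.0, 1.0]])
d = np.array([2.0, 2.0, 2.0])
alpha = 0.1

# Eigenvalues of G^T G: one of them is zero (up to rounding).
eigs_plain = np.linalg.eigvalsh(G.T @ G)

# After adding alpha^2 I every eigenvalue is at least alpha^2.
eigs_reg = np.linalg.eigvalsh(G.T @ G + alpha**2 * np.eye(2))

# The regularized system has a unique solution, close to the
# minimum-norm choice m = (1, 1).
m_hat = tikhonov_normal_eq(G, d, alpha)
print(eigs_plain, eigs_reg, m_hat)
```

Forming G^T G explicitly squares the condition number of G, so for badly conditioned problems an SVD-based or least-squares formulation is usually the safer numerical route; the normal-equation form is used here only because it mirrors the formula above.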

What it does to each singular value

Through the SVD, Tikhonov multiplies each reconstructed component by a filter factor:

\hat m_\alpha = \sum_i f_i\,\frac{u_i^{\mathsf T}d}{\sigma_i}\,v_i, \qquad f_i = \frac{\sigma_i^2}{\sigma_i^2 + \alpha^2}.
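The step from the normal equation to this filtered sum can be sketched as follows. It assumes the SVD is written as G = \sum_i \sigma_i u_i v_i^{\mathsf T} with orthonormal u_i and v_i, and the expansion coefficients c_i are introduced here purely for the derivation.

```latex
G^{\mathsf T}d = \sum_i \sigma_i\,(u_i^{\mathsf T}d)\,v_i,
\qquad
(G^{\mathsf T}G + \alpha^2 I)\,v_i = (\sigma_i^2 + \alpha^2)\,v_i .

\text{Writing } \hat m_\alpha = \sum_i c_i v_i \text{ and matching coefficients of } v_i:
\quad
(\sigma_i^2 + \alpha^2)\,c_i = \sigma_i\,u_i^{\mathsf T}d
\;\Longrightarrow\;
c_i = \frac{\sigma_i}{\sigma_i^2 + \alpha^2}\,u_i^{\mathsf T}d
    = \frac{\sigma_i^2}{\sigma_i^2 + \alpha^2}\cdot\frac{u_i^{\mathsf T}d}{\sigma_i}
\quad (\sigma_i > 0).
```

Directions with \sigma_i = 0 get c_i = 0 directly from the middle equation, so they contribute nothing to the sum.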

Look at f_i. Where \sigma_i \gg \alpha (strong, trustworthy directions) f_i \approx 1, so the component is kept intact. Where \sigma_i \ll \alpha (weak, noise-prone directions) f_i \approx \sigma_i^2/\alpha^2 \approx 0, so the component is smoothly switched off. The dangerous 1/\sigma_i blow-up is tamed: the effective coefficient is f_i/\sigma_i = \sigma_i/(\sigma_i^2 + \alpha^2), which never exceeds 1/(2\alpha) and goes to zero as \sigma_i \to 0, instead of growing without bound.
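The filtered sum can be checked numerically against the normal-equation solution. The sketch below is illustrative only: the operator is synthetic, built with assumed singular values 1, 1e-2 and 1e-6, and tikhonov_svd is a hypothetical helper name rather than a library function.

```python
import numpy as np

def tikhonov_svd(G, d, alpha):
    """Tikhonov solution via SVD filter factors (hypothetical helper)."""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    f = s**2 / (s**2 + alpha**2)          # filter factors f_i
    # f_i / s_i is evaluated as s_i / (s_i^2 + alpha^2), which avoids
    # dividing by a tiny singular value.
    coeffs = s / (s**2 + alpha**2) * (U.T @ d)
    return Vt.T @ coeffs, f

rng = np.random.default_rng(0)

# Synthetic 5x3 operator with prescribed singular values (an assumption
# made for this example, not data from the text).
U0, _ = np.linalg.qr(rng.standard_normal((5, 3)))
V0, _ = np.linalg.qr(rng.standard_normal((3, 3)))
sigma = np.array([1.0, 1e-2, 1e-6])
G = U0 @ np.diag(sigma) @ V0.T
d = rng.standard_normal(5)
alpha = 1e-3

m_svd, f = tikhonov_svd(G, d, alpha)

# Same solution from the regularized normal equation.
m_ne = np.linalg.solve(G.T @ G + alpha**2 * np.eye(3), G.T @ d)

print(f)                                   # near 1, near 1, near 0
print(np.max(np.abs(m_svd - m_ne)))        # small difference
```

With alpha = 1e-3 the two largest singular values sit well above alpha and pass almost unchanged, while the 1e-6 direction is suppressed by a factor of roughly \sigma^2/\alpha^2.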

The filter in action

Plot the filter factor f_i against the index of the singular direction (high index = small \sigma_i) and the curve falls from about 1 to about 0, passing through 1/2 where \sigma_i = \alpha. Turn up \alpha and that cutoff marches to lower indices: more aggressive smoothing, suppressing more directions. Turn it down toward zero and every factor approaches 1, recovering the unstable naïve inverse. Choosing \alpha is choosing where to draw that line.
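To see the cutoff move, here is a small sketch that counts how many directions survive for a few values of alpha. The spectrum is an assumed geometric decay from 1 to 1e-8, and "kept" is defined here, as a convention for this example only, as f_i >= 1/2, which is equivalent to \sigma_i >= \alpha.

```python
import numpy as np

# Assumed spectrum for illustration: 1, 1e-1, ..., 1e-8.
sigma = np.logspace(0, -8, 9)

def filter_factors(sigma, alpha):
    """Tikhonov filter factors f_i = sigma_i^2 / (sigma_i^2 + alpha^2)."""
    return sigma**2 / (sigma**2 + alpha**2)

def n_kept(sigma, alpha):
    """Number of directions with f_i >= 1/2, i.e. sigma_i >= alpha."""
    return int(np.sum(filter_factors(sigma, alpha) >= 0.5))

# Larger alpha -> fewer directions kept (cutoff moves to lower indices).
for alpha in (3e-7, 3e-4, 3e-1):
    print(f"alpha = {alpha:g}: {n_kept(sigma, alpha)} of {sigma.size} directions kept")

# As alpha -> 0 every factor tends to 1: the unfiltered naive inverse.
print(filter_factors(sigma, 0.0))
```

The threshold of 1/2 is only a convenient marker for where the filter switches over; the transition itself is smooth, not a hard truncation.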

Minimise \|Gm-d\|^2 + \alpha^2\|m\|^2; solve (G^{\mathsf T}G + \alpha^2 I)m = G^{\mathsf T}d.
Filter factors f_i = \sigma_i^2/(\sigma_i^2 + \alpha^2): ≈1 for strong directions, ≈0 for weak ones.
It is ridge regression, and it trades data fit against model size via \alpha.