Tikhonov Regularization
Naïve inversion fails because it trusts the data completely, even where the operator is blind.
Tikhonov regularization strikes a bargain: fit the data and keep the
model small. It minimises a sum of two terms,
\hat m_\alpha = \arg\min_m \Big( \|Gm - d\|^2 + \alpha^2\,\|m\|^2 \Big).
The first term is the data misfit; the second is a penalty on the size of the model, weighted by the
regularization parameter \alpha. Setting the gradient
to zero gives a modified normal equation that has a unique solution for every \alpha > 0:
(G^{\mathsf T}G + \alpha^2 I)\,\hat m_\alpha = G^{\mathsf T}d.
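The eigenvalues of G^{\mathsf T}G are the squared singular values \sigma_i^2 \ge 0, so adding \alpha^2 I
shifts them to \sigma_i^2 + \alpha^2 \ge \alpha^2 > 0: even a singular
matrix becomes invertible. This is exactly ridge regression from machine learning, here cast as the cure for ill-posedness.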
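As a minimal sketch of the modified normal equation, the NumPy snippet below solves it directly. The function name `tikhonov_normal_equations` and the toy matrix G (two nearly identical columns, so G^{\mathsf T}G is almost singular) are assumptions made up for illustration; they do not come from the text.

```python
import numpy as np

def tikhonov_normal_equations(G, d, alpha):
    """Solve (G^T G + alpha^2 I) m = G^T d. Sketch only; assumes alpha > 0."""
    n = G.shape[1]
    A = G.T @ G + alpha**2 * np.eye(n)   # eigenvalues are sigma_i^2 + alpha^2 > 0
    return np.linalg.solve(A, G.T @ d)

# Hypothetical toy problem: nearly collinear columns make G^T G almost singular.
G = np.array([[1.0, 1.0],
              [1.0, 1.0001],
              [1.0, 0.9999]])
m_true = np.array([1.0, 2.0])
d = G @ m_true                            # noise-free data, for simplicity

alpha = 0.1
m_hat = tikhonov_normal_equations(G, d, alpha)
```

Here `m_hat` is smaller in norm than `m_true`: that is the bargain, a little data fit given up for a smaller model. Forming G^{\mathsf T}G explicitly squares the condition number, so for larger problems an SVD or least-squares route is usually preferred over this direct solve.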
What it does to each singular value
Through the SVD G = \sum_i \sigma_i\,u_i v_i^{\mathsf T}, Tikhonov multiplies each component of the naïve solution by a filter factor:
\hat m_\alpha = \sum_i f_i\,\frac{u_i^{\mathsf T}d}{\sigma_i}\,v_i, \qquad f_i = \frac{\sigma_i^2}{\sigma_i^2 + \alpha^2}.
Look at f_i. Where \sigma_i \gg \alpha
(strong, trustworthy directions) f_i \approx 1: the component is kept intact. Where
\sigma_i \ll \alpha (weak, noise-prone directions)
f_i \approx \sigma_i^2/\alpha^2 \approx 0: the component is smoothly switched off. The
dangerous 1/\sigma_i blow-up is tamed: the effective multiplier is
f_i/\sigma_i = \sigma_i/(\sigma_i^2 + \alpha^2), which never exceeds 1/(2\alpha) and goes to zero with \sigma_i instead of growing without bound.
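The filtered sum can be checked numerically. The sketch below builds the Tikhonov solution from the SVD and compares it with a direct solve of (G^{\mathsf T}G + \alpha^2 I)m = G^{\mathsf T}d. The function name `tikhonov_svd` and the random test problem are illustrative assumptions, not part of the text.

```python
import numpy as np

def tikhonov_svd(G, d, alpha):
    """Tikhonov solution assembled from SVD filter factors (sketch; assumes alpha > 0)."""
    # Thin SVD: G = U @ diag(s) @ Vt, with s sorted from largest to smallest.
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    f = s**2 / (s**2 + alpha**2)          # filter factors f_i
    # f_i / sigma_i simplifies to sigma_i / (sigma_i^2 + alpha^2),
    # so no division by a tiny sigma_i ever happens.
    coeffs = (s / (s**2 + alpha**2)) * (U.T @ d)
    return Vt.T @ coeffs, f

# Hypothetical test problem: random G and d, chosen only for illustration.
rng = np.random.default_rng(0)
G = rng.normal(size=(8, 5))
d = rng.normal(size=8)
alpha = 0.5

m_svd, f = tikhonov_svd(G, d, alpha)
m_normal = np.linalg.solve(G.T @ G + alpha**2 * np.eye(5), G.T @ d)
```

The two routes agree to rounding error, and every entry of `f` lies strictly between 0 and 1, largest for the strongest singular direction.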
The filter in action
Plotted against the index i (high index = small \sigma_i), the filter factor f_i falls from near 1
to near 0 around \sigma_i \approx \alpha, where f_i = 1/2. Turn up \alpha and
the cutoff marches to lower indices: more aggressive smoothing, suppressing more directions. Turn
it down toward zero and every factor approaches 1, recovering the unstable naïve inverse.
Choosing \alpha is choosing where to draw that line.
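To see the line move, the sketch below evaluates the filter factors for a made-up spectrum of singular values spanning five decades and counts how many directions are mostly kept (f_i > 1/2, equivalently \sigma_i > \alpha). The spectrum, the three \alpha values, and the 1/2 threshold are assumptions chosen for illustration.

```python
import numpy as np

def filter_factors(sigma, alpha):
    """Tikhonov filter factors f_i = sigma_i^2 / (sigma_i^2 + alpha^2)."""
    return sigma**2 / (sigma**2 + alpha**2)

# Hypothetical singular values, one per decade (index 0 = strongest direction).
sigma = np.array([1.0, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5])

# Increasing alpha: the cutoff moves toward lower indices.
alphas = [3e-5, 3e-3, 3e-1]
kept = [int(np.sum(filter_factors(sigma, a) > 0.5)) for a in alphas]

# alpha = 0 switches the filter off: every factor equals 1 (all sigma_i > 0 here).
limit = filter_factors(sigma, 0.0)
```

With these values `kept` drops from 5 to 3 to 1 as \alpha grows, while `limit` is all ones, which is the unfiltered naïve inverse.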
- Minimise \|Gm-d\|^2 + \alpha^2\|m\|^2; solve (G^{\mathsf T}G + \alpha^2 I)m = G^{\mathsf T}d.
- Filter factors f_i = \sigma_i^2/(\sigma_i^2 + \alpha^2): ≈1 for strong directions, ≈0 for weak ones.
- It is ridge regression, and it trades data fit against model size via \alpha.