Abstract Inner-Product Spaces
The dot product is a
quiet workhorse. Feed it two arrows in the plane and it hands back a single number that secretly
encodes two geometric facts at once: how long the vectors are, and what
angle sits between them. Length is just the square root of a vector dotted with itself; a right angle is
a dot product of zero. Almost everything geometric — distance, projection, "is this the closest
point?", "are these two things independent?" — is really a statement about dot products in disguise.
Here is the liberating idea of this page. The dot product is not the only gadget that can do
this job. If we write down just the handful of rules that make the dot product
behave the way it does, then anything obeying those rules earns the same geometry —
lengths, angles, right angles, projections — for free. That abstract gadget is called an
inner product, written \langle u, v\rangle, and a
vector space
equipped with one is an inner-product space. The payoff is enormous: with the right
inner product you can measure the "angle" between two functions, ask whether two sound waves
are "orthogonal", and do geometry in spaces of infinitely many dimensions — which is exactly the
stage on which quantum mechanics is played.
From dot product to a list of rules
Recall the ordinary dot product on \mathbb{R}^n:
\vec u \cdot \vec v = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n. Stare at it and
you can read off the properties that actually make it useful. It is symmetric
(\vec u\cdot\vec v = \vec v\cdot\vec u); it is linear in
each slot (you can pull out scalars and split sums); and a vector dotted with itself,
\vec v\cdot\vec v = v_1^2+\cdots+v_n^2, is a sum of squares — never
negative, and zero only for the zero vector. Those three habits are all we need.
So we promote them from observations to axioms. An inner product on a real
vector space V is any rule \langle u,v\rangle
that assigns a real number to each pair of vectors and obeys:
- Symmetry. \langle u, v\rangle = \langle v, u\rangle for all u, v.
- Linearity in the first slot. \langle \alpha u + \beta w,\, v\rangle = \alpha\langle u,v\rangle + \beta\langle w,v\rangle (by symmetry this then holds in the second slot too, so the product is bilinear).
- Positive-definiteness. \langle v, v\rangle \ge 0 for every v, with equality \langle v,v\rangle = 0 only when v = \vec 0.
That is the whole definition. Notice what is not in the list: we never mention components,
coordinates, or "multiply matching entries." Those were an accident of how the dot product happens to
be built. The axioms are what matter, and any operation satisfying them deserves the angle brackets
and all the geometry that follows.
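The three axioms can be spot-checked numerically. The sketch below is only an illustration: the names `dot` and `check_axioms` are hypothetical helpers, not part of any library, and passing a random test is evidence rather than proof that a rule is an inner product.

```python
import random

def dot(u, v):
    """Ordinary dot product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def check_axioms(inner, dim=3, trials=200, tol=1e-9, seed=0):
    """Spot-check the three axioms on random vectors (evidence, not a proof)."""
    rng = random.Random(seed)

    def rand_vec():
        return [rng.uniform(-5, 5) for _ in range(dim)]

    for _ in range(trials):
        u, v, w = rand_vec(), rand_vec(), rand_vec()
        a, b = rng.uniform(-5, 5), rng.uniform(-5, 5)
        # Symmetry: <u, v> = <v, u>
        if abs(inner(u, v) - inner(v, u)) > tol:
            return False
        # Linearity in the first slot: <a u + b w, v> = a<u, v> + b<w, v>
        combo = [a * ui + b * wi for ui, wi in zip(u, w)]
        if abs(inner(combo, v) - (a * inner(u, v) + b * inner(w, v))) > tol:
            return False
        # Positive-definiteness: <v, v> > 0 for a non-zero v
        if inner(v, v) <= 0:
            return False
    return True

print(check_axioms(dot))
```

Any candidate rule with the same call shape, `inner(u, v)`, can be passed in place of `dot`.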
Length and distance for free
The instant you have an inner product, you have a notion of length. Define the
norm of a vector as the square root of its inner product with itself:
\lVert v\rVert = \sqrt{\langle v, v\rangle}.
Positive-definiteness is exactly what makes this legal — \langle v,v\rangle
is never negative, so the square root is real, and the only vector of length zero is
\vec 0. For the ordinary dot product this reproduces Pythagoras,
\lVert v\rVert = \sqrt{v_1^2 + \cdots + v_n^2}, but the definition works
in any inner-product space, even one made of functions. And once you can measure length, the
distance between two vectors is simply the length of their difference,
d(u,v) = \lVert u - v\rVert. Geometry, from three little rules.
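As a small sketch of these two definitions, the helpers below (hypothetical names `norm` and `distance`) take the inner product as a parameter, so the same code would work for any rule that satisfies the axioms. The ordinary dot product is used only as a default.

```python
import math

def dot(u, v):
    """Ordinary dot product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def norm(v, inner=dot):
    """Length of v: the square root of <v, v>."""
    return math.sqrt(inner(v, v))

def distance(u, v, inner=dot):
    """Distance between u and v: the length of their difference."""
    diff = [a - b for a, b in zip(u, v)]
    return norm(diff, inner)

print(norm([3.0, 4.0]))                  # 5.0
print(distance([1.0, 1.0], [4.0, 5.0]))  # 5.0
```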
Angle, and the meaning of "orthogonal"
In the plane the dot product satisfies
\vec u\cdot\vec v = \lVert\vec u\rVert\,\lVert\vec v\rVert\cos\theta. We
turn that around to define the angle between two non-zero vectors in any
inner-product space:
\cos\theta = \frac{\langle u, v\rangle}{\lVert u\rVert\,\lVert v\rVert}.
For this to make sense the right-hand side must land between -1 and
+1 — and it always does, guaranteed by the
Cauchy–Schwarz inequality
|\langle u,v\rangle| \le \lVert u\rVert\,\lVert v\rVert, which follows from
the three axioms alone. The single most useful special case drops out when
\theta = 90^\circ: since \cos 90^\circ = 0, two
vectors are orthogonal
exactly when their inner product vanishes,
u \perp v \quad\Longleftrightarrow\quad \langle u, v\rangle = 0.
This is the same clean test you already know for arrows — but now "perpendicular" means something
even for objects you cannot draw, like two functions, as long as we have said what
\langle\cdot,\cdot\rangle is.
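The angle formula and the orthogonality test translate directly into code. This is a minimal sketch with hypothetical helper names; the clamp to [-1, 1] only guards against floating-point rounding, since Cauchy–Schwarz already guarantees the exact ratio lies in that range.

```python
import math

def dot(u, v):
    """Ordinary dot product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def angle(u, v, inner=dot):
    """Angle in radians between two non-zero vectors."""
    nu = math.sqrt(inner(u, u))
    nv = math.sqrt(inner(v, v))
    if nu == 0 or nv == 0:
        raise ValueError("angle is undefined for the zero vector")
    c = inner(u, v) / (nu * nv)
    c = max(-1.0, min(1.0, c))  # absorb rounding error only
    return math.acos(c)

def is_orthogonal(u, v, inner=dot, tol=1e-12):
    """True when <u, v> vanishes, up to a small tolerance."""
    return abs(inner(u, v)) <= tol

print(math.degrees(angle([1.0, 0.0], [0.0, 1.0])))
print(is_orthogonal([1.0, 2.0], [2.0, -1.0]))
```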
Orthonormal bases: the best coordinates there are
A set of vectors e_1, e_2, \ldots is orthonormal if they
are mutually orthogonal and each has length one — compactly,
\langle e_i, e_j\rangle = \delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \neq j. \end{cases}
The symbol \delta_{ij} (the Kronecker delta) is just shorthand
for "1 on the diagonal, 0 off it." Orthonormal bases are the nicest coordinate systems in
mathematics, because reading off a vector's coordinates stops being a chore. In a general basis you
must solve a system of equations to find the components; in an orthonormal basis you just take an
inner product. If v = \sum_i c_i e_i, then
c_i = \langle v, e_i\rangle,
because every other term \langle e_j, e_i\rangle with
j\neq i is zero and the surviving one is
c_i\langle e_i,e_i\rangle = c_i. The coordinate is the projection
onto that axis. Lengths get easy too — \lVert v\rVert^2 = \sum_i c_i^2
(Parseval's identity), the abstract Pythagoras. This "project onto an orthonormal basis" move is,
almost literally, how a measurement works in quantum mechanics.
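To see the coordinate formula at work, the sketch below uses a hypothetical orthonormal basis of the plane (the standard axes rotated) and an arbitrary test vector. It reads off the coordinates by projection, rebuilds the vector, and compares both sides of Parseval's identity.

```python
def dot(u, v):
    """Ordinary dot product on R^n."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical orthonormal basis of R^2: the standard axes rotated.
e1 = (0.6, 0.8)
e2 = (-0.8, 0.6)
basis = [e1, e2]

v = (2.0, 1.0)

# Coordinates by projection: c_i = <v, e_i>, no linear system required.
coords = [dot(v, e) for e in basis]

# Rebuild v from its coordinates: v = sum_i c_i e_i.
rebuilt = [sum(c * e[k] for c, e in zip(coords, basis)) for k in range(len(v))]

# Parseval: ||v||^2 should equal the sum of the squared coordinates.
norm_sq = dot(v, v)
coord_sq = sum(c * c for c in coords)

print(coords, rebuilt, norm_sq, coord_sq)
```

In exact arithmetic the coordinates are 2 and -1, and both sides of Parseval's identity equal 5; floating-point output may differ in the last digits.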
The big leap: functions as vectors
Here is where the abstraction pays off spectacularly. Take the space of (nice, real-valued) functions
on an interval [a,b]. You can add functions and scale them, so they form a
vector space. Now equip it with the inner product built from an integral:
\langle f, g\rangle = \int_a^b f(x)\,g(x)\,dx.
Check the axioms and they all hold. It is symmetric (multiplication commutes inside the integral);
it is linear in each slot (integration is linear); and
\langle f,f\rangle = \int_a^b f(x)^2\,dx \ge 0, an integral of a square,
which for continuous f is zero only for the zero function. So this integral is a genuine inner product, and
every piece of geometry we built now applies to functions. The "length" of a function is
\lVert f\rVert = \sqrt{\int_a^b f^2\,dx} (its
root-mean-square size, up to a factor of \sqrt{b-a}), and two functions are orthogonal when the signed
area under their product is zero.
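The integral inner product can be approximated numerically. The sketch below uses a plain midpoint rule; the names `inner_fn` and `norm_fn` and the number of sample points are assumptions made for illustration, and the results are approximations rather than exact integrals.

```python
import math

def inner_fn(f, g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f(x) g(x) over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) * g(a + (k + 0.5) * h) for k in range(n))

def norm_fn(f, a, b, n=10_000):
    """Approximate 'length' of f on [a, b]: sqrt of <f, f>."""
    return math.sqrt(inner_fn(f, f, a, b, n))

identity = lambda x: x
constant_one = lambda x: 1.0

# On [0, 1]: <x, 1> is the integral of x, and ||x|| is sqrt of the integral of x^2.
print(inner_fn(identity, constant_one, 0.0, 1.0))  # close to 1/2
print(norm_fn(identity, 0.0, 1.0))                 # close to 1/sqrt(3)
```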
The star example: on [-\pi,\pi], the functions
\sin x and \cos x are orthogonal, because over a
full period the product \sin x\cos x spends as much time positive as
negative and the areas cancel:
\langle \sin, \cos\rangle = \int_{-\pi}^{\pi}\sin x\,\cos x\,dx = 0.
Plot the product of two such waves and shade the region between the curve and the axis. When the two
waves are orthogonal, the area above the axis exactly cancels the area below, and the total signed area
is zero. That is the function-space version of a right angle. This orthogonality of sines and cosines is
the entire secret behind Fourier series: once each is rescaled to unit length, they form an orthonormal basis for a space of functions.
Two waves and their product. The heavy curve is
f(x)\,g(x); its signed area (positive shading above the axis, negative
below) is the inner product \langle f,g\rangle. Choosing different
whole-number frequencies makes the areas cancel — the functions are orthogonal.
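The cancellation for whole-number frequencies can be checked numerically. The sketch below builds the table of inner products of sin(px) and sin(qx) on [-\pi, \pi] for a few frequencies. The helper names and the choice of frequencies 1, 2, 3 are assumptions made for illustration.

```python
import math

def inner_fn(f, g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f(x) g(x) over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) * g(a + (k + 0.5) * h) for k in range(n))

def sine(p):
    """The function x -> sin(p x) for a whole-number frequency p."""
    return lambda x: math.sin(p * x)

freqs = [1, 2, 3]

# gram[i][j] approximates <sin(freqs[i] x), sin(freqs[j] x)> on [-pi, pi].
gram = [[inner_fn(sine(p), sine(q), -math.pi, math.pi) for q in freqs]
        for p in freqs]

for row in gram:
    print(["%.6f" % entry for entry in row])
```

The off-diagonal entries should come out at zero up to rounding, and the diagonal entries near \pi, so dividing each sine by \sqrt{\pi} gives functions of unit length.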
Worked example — an inner product on polynomials
Take V = polynomials on [-1,1] with
\langle f,g\rangle = \int_{-1}^{1} f(x)g(x)\,dx. Are
f(x) = 1 and g(x) = x orthogonal? Compute:
\langle 1, x\rangle = \int_{-1}^{1} x\,dx = \left[\tfrac{x^2}{2}\right]_{-1}^{1} = \tfrac12 - \tfrac12 = 0.
Yes — the constant function and the identity are perpendicular in this space. What is the "length" of
the constant 1?
\lVert 1\rVert = \sqrt{\int_{-1}^{1} 1\,dx} = \sqrt{2}.
And x? \lVert x\rVert = \sqrt{\int_{-1}^{1} x^2\,dx} = \sqrt{2/3}.
Dividing each by its length gives the orthonormal pair 1/\sqrt{2} and \sqrt{3/2}\,x. Up to that
rescaling these are the first two Legendre polynomials, the natural "axes" for polynomial approximation. Same machinery, brand-new
universe.
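The worked example can be reproduced exactly with rational arithmetic. In this sketch a polynomial is stored as a list of coefficients, lowest degree first; that representation and the name `poly_inner` are assumptions made for illustration.

```python
from fractions import Fraction

def poly_inner(p, q):
    """Exact integral over [-1, 1] of p(x) q(x).

    p and q are coefficient lists, lowest degree first.
    """
    # Multiply the two polynomials coefficient by coefficient.
    prod = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            prod[i + j] += Fraction(a) * Fraction(b)
    # The integral of x^k over [-1, 1] is 0 for odd k and 2/(k+1) for even k.
    return sum(c * Fraction(2, k + 1) for k, c in enumerate(prod) if k % 2 == 0)

one = [1]      # the constant polynomial 1
x = [0, 1]     # the polynomial x

print(poly_inner(one, x))    # 0
print(poly_inner(one, one))  # 2
print(poly_inner(x, x))      # 2/3
```

The square roots of the last two values are the lengths \sqrt{2} and \sqrt{2/3} computed above.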
Quantum mechanics lives over the complex numbers, and there symmetry needs one tweak. If we
kept \langle v, v\rangle = \sum v_i^2 with complex
v_i, the "length-squared" could come out negative or even imaginary —
useless. The fix is to conjugate one slot:
\langle u, v\rangle = \sum_i \overline{u_i}\,v_i. Now
\langle v,v\rangle = \sum |v_i|^2 \ge 0 is real and non-negative again.
The cost is that symmetry becomes conjugate symmetry,
\langle u, v\rangle = \overline{\langle v, u\rangle}, so swapping the
arguments conjugates the answer. The product is then linear in one slot and
conjugate-linear in the other (physicists conjugate the first, mathematicians often
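the second; watch which convention a book uses). Norm, orthogonality, and orthonormal bases carry
over, with two small adjustments under the convention above: coordinates are read off as
c_i = \langle e_i, v\rangle, and Parseval's identity becomes \lVert v\rVert^2 = \sum_i |c_i|^2.
This complex version is the correct setting for state vectors, and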
it is the direct ancestor of the
bra-ket
notation physicists use.
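A short sketch makes the need for the conjugate concrete. It uses Python's built-in complex numbers and conjugates the first slot, matching the formula above; the name `cinner` and the two sample vectors are arbitrary choices made for illustration.

```python
def cinner(u, v):
    """Complex inner product, conjugating the first slot."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

u = [1 + 1j, 2j]
v = [3 - 1j, 1 + 0j]

# Without the conjugate, the "length-squared" is not even real.
naive = sum(a * a for a in u)
print(naive)         # (-4+2j)

# With the conjugate it is real and non-negative.
print(cinner(u, u))  # (6+0j)

# Swapping the arguments conjugates the answer.
print(cinner(u, v))  # (2-6j)
print(cinner(v, u))  # (2+6j)
```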
An inner product is not automatically "multiply the components." The dot product is
one inner product on \mathbb{R}^n, not the definition of the
concept. A different rule such as
\langle u, v\rangle = 2u_1v_1 + 5u_2v_2 (a weighted product) also
satisfies every axiom, and it assigns different lengths and different angles — vectors that are
perpendicular under the ordinary dot product need not be perpendicular under a weighted one. So
"orthogonal" is always relative to a chosen inner product; the question "are these
orthogonal?" is incomplete until you say under which \langle\cdot,\cdot\rangle.
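A quick check shows how the verdict depends on the chosen product. The sketch below compares the ordinary dot product with the weighted rule 2u_1v_1 + 5u_2v_2 on one pair of vectors; the pair (1, 1) and (1, -1) is an arbitrary choice made for illustration.

```python
def dot(u, v):
    """Ordinary dot product on R^2."""
    return u[0] * v[0] + u[1] * v[1]

def weighted(u, v):
    """Weighted inner product 2*u1*v1 + 5*u2*v2 on R^2."""
    return 2 * u[0] * v[0] + 5 * u[1] * v[1]

u = (1, 1)
v = (1, -1)

print(dot(u, v))       # 0  -> orthogonal under the dot product
print(weighted(u, v))  # -3 -> not orthogonal under the weighted product

# Lengths change too: the squared length of u differs between the two.
print(dot(u, u), weighted(u, u))  # 2 7
```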
The second classic slip: a rule that looks symmetric and bilinear can still fail to be an
inner product if it flunks positive-definiteness. For instance
\langle u,v\rangle = u_1v_1 - u_2v_2 gives some non-zero vectors a
"length-squared" of zero (or negative), so \lVert\cdot\rVert is not a real
length. (That indefinite form is not a bug everywhere: with one time and one space coordinate it is the Minkowski product of
special relativity. But it is not an inner product in the sense of this page.)
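The failure of positive-definiteness is easy to exhibit. This sketch evaluates the form u_1v_1 - u_2v_2 on a few hand-picked vectors; the name `indefinite` is a hypothetical helper.

```python
def indefinite(u, v):
    """The symmetric bilinear form u1*v1 - u2*v2, which is not an inner product."""
    return u[0] * v[0] - u[1] * v[1]

print(indefinite((1, 0), (1, 0)))  # 1  -> positive "length-squared"
print(indefinite((1, 1), (1, 1)))  # 0  -> non-zero vector with zero "length-squared"
print(indefinite((0, 1), (0, 1)))  # -1 -> negative "length-squared"
```

Symmetry and bilinearity still hold for this rule; only the third axiom breaks.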