Standard Deviation
The variance captures spread well, but in the wrong units (squared ones). The fix is delightfully
simple: take the square root and you are back in the data's own units. That square root is the
standard deviation.
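s = \sqrt{\sigma^2} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}.
Here \bar{x} is the mean of the n values x_1, \dots, x_n, and \sigma^2 is their variance in its 1/n form.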
If the data are heights in centimetres, the standard deviation is in centimetres too — a number
you can picture and lay right alongside the values themselves.
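A minimal Python sketch of the formula above may make the units point concrete. The function names and the three heights are illustrative assumptions, not data from this lesson. The same heights are expressed in centimetres and in metres to show that the standard deviation rescales like the data, while the variance rescales like the data squared.

```python
import math

def variance(values):
    """Mean squared deviation from the mean (the 1/n form used above)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

def standard_deviation(values):
    """Square root of the variance, so it is in the data's own units."""
    return math.sqrt(variance(values))

# Hypothetical heights, chosen only for illustration.
heights_cm = [160.0, 170.0, 180.0]
heights_m = [h / 100 for h in heights_cm]

print(variance(heights_cm))            # about 66.67, in cm^2
print(standard_deviation(heights_cm))  # about 8.16, in cm

# Changing units by a factor of 100 changes s by 100 and the variance by 100^2.
print(standard_deviation(heights_cm) / standard_deviation(heights_m))
print(variance(heights_cm) / variance(heights_m))
```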
What it means: the typical distance from the mean
Read the standard deviation as the typical distance of a data point from the
mean (strictly, the root-mean-square distance). A small s says the values huddle close to the
centre; a large s says they wander far from it. It is the single
most quoted measure of spread precisely because it is interpretable: "scores averaged
70, give or take about 8" is a standard
deviation talking.
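To see the "give or take" reading in action, here is a small sketch using Python's standard statistics module. The two score lists and the describe helper are made up for illustration. Both lists share the same mean, so only the standard deviation tells them apart.

```python
import statistics

def describe(scores):
    """Summarise scores as 'mean, give or take about s' (s in the 1/n form)."""
    mean = statistics.fmean(scores)
    s = statistics.pstdev(scores)  # population standard deviation
    return f"{mean:.0f}, give or take about {s:.0f}"

# Hypothetical score lists with the same mean but different spread.
tight = [68, 69, 70, 71, 72]
wide = [50, 60, 70, 80, 90]

print(describe(tight))  # 70, give or take about 1
print(describe(wide))   # 70, give or take about 14
```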
See it: one step either side of the mean
The dots are a fixed data set with mean 5 (the dashed line). The band
stretches from \bar{x} - s to \bar{x} + s
— one standard deviation in each direction. Slide s until the band
comfortably covers the bulk of the points: for this set the true standard deviation is
2, and a band of that width swallows most of the data, just as a
"typical distance" should.
Standard deviation versus variance
They are two views of the same spread. The variance is the natural quantity in the algebra
(variances of independent quantities add, and it is what later theory is written in); the standard
deviation is the natural quantity for a human (right units, "typical distance"). Always:
s = \sqrt{\sigma^2} and \sigma^2 = s^2.
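Both relations, and the sense in which the variance is algebraically convenient, can be checked with a short sketch. It reads "adds cleanly" as additivity for independent quantities, which is an interpretation on my part, and the two small lists are hypothetical. Pairing every value of a with every value of b makes the two exactly independent, so no random sampling is needed.

```python
import itertools
import math
import statistics

# Hypothetical data sets.
a = [1, 2, 3]
b = [10, 20]

# Round trip: squaring the standard deviation recovers the variance.
print(math.isclose(statistics.pstdev(a) ** 2, statistics.pvariance(a)))

# Every (x, y) pair occurs once, so the sums x + y combine independent parts.
sums = [x + y for x, y in itertools.product(a, b)]

# Variances add across the independent parts...
print(statistics.pvariance(sums), statistics.pvariance(a) + statistics.pvariance(b))

# ...but standard deviations do not: the spread of the sum is smaller.
print(statistics.pstdev(sums), statistics.pstdev(a) + statistics.pstdev(b))
```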
- The standard deviation is the square root of the variance: s = \sqrt{\sigma^2}.
- It is measured in the same units as the data, unlike the variance.
- It reads as the typical distance of a value from the mean.
- Squaring it returns the variance: \sigma^2 = s^2.