
Variance

Variance measures the spread of a dataset around its mean. It is the average of squared deviations. Standard deviation is the square root of variance.

For a population of $N$ values $x_1, \ldots, x_N$ with mean $\mu$, the population variance is

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$$
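As a quick worked example (with made-up values), take 2, 4, 6: the mean is $\mu = 4$, so

$$\sigma^2 = \frac{(2-4)^2 + (4-4)^2 + (6-4)^2}{3} = \frac{4 + 0 + 4}{3} = \frac{8}{3} \approx 2.67$$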

For a sample of $n$ values with sample mean $\bar{x}$, divide by $n - 1$ instead of $n$ (Bessel's correction), which makes the estimator unbiased.
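Written out, the sample variance is

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$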

A small variance means values cluster near the mean; a large variance means they are scattered. Variance is in squared units of the original data (kg² if the data is in kg) — that's why we usually report the standard deviation $\sigma = \sqrt{\sigma^2}$, which has the same units as the data.
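As a quick check of these definitions, here is a minimal Python sketch using the standard-library statistics module (the dataset is made up for illustration):

```python
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up values for illustration

mean = statistics.fmean(data)          # arithmetic mean: 5.0
pop_var = statistics.pvariance(data)   # population variance (divides by N): 4.0
samp_var = statistics.variance(data)   # sample variance (divides by n - 1): ~4.571
pop_std = statistics.pstdev(data)      # population standard deviation: 2.0

print(mean, pop_var, samp_var, pop_std)
```

Note that the sample variance comes out larger than the population variance on the same data, as expected from dividing by $n - 1$ rather than $n$.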

Variance underlies much of inferential statistics: confidence intervals, hypothesis tests, and regression all depend on estimating it. For example, the standard error of the sample mean, $s/\sqrt{n}$, is built directly from the sample variance. The bias-variance tradeoff in machine learning is also named for it.
