statistics

Median

The median is the middle value of a sorted dataset. For even-sized data, it is the average of the two middle values. Robust to outliers.

The median is the middle value of an ordered dataset. With nn data points sorted ascending:

  • If nn is odd, the median is the (n+12)\left(\frac{n+1}{2}\right)-th value.
  • If nn is even, the median is the average of the n2\frac{n}{2}-th and (n2+1)\left(\frac{n}{2}+1\right)-th values.

The median is the most robust of the standard centrality measures. While the mean shifts dramatically with a single extreme outlier, the median is unaffected. This is why economists report median household income rather than mean — Bezos walking onto a city block would push the mean income to millions, while leaving the median untouched.

Use the median for skewed distributions (income, response time, file size). Use the mean when data is roughly symmetric and outliers are rare. The median is also the value that minimises the sum of absolute deviations xic\sum |x_i - c|, paralleling the mean's minimisation of squared deviations.

Related resources