statistics

Median

The median is the middle value of a sorted dataset. For even-sized data, it is the average of the two middle values. Robust to outliers.

本术语中文版本即将上线。下方暂以英文原文展示。

The median is the middle value of an ordered dataset. With nn data points sorted ascending:

  • If nn is odd, the median is the (n+12)\left(\frac{n+1}{2}\right)-th value.
  • If nn is even, the median is the average of the n2\frac{n}{2}-th and (n2+1)\left(\frac{n}{2}+1\right)-th values.

The median is the most robust of the standard centrality measures. While the mean shifts dramatically with a single extreme outlier, the median is unaffected. This is why economists report median household income rather than mean — Bezos walking onto a city block would push the mean income to millions, while leaving the median untouched.

Use the median for skewed distributions (income, response time, file size). Use the mean when data is roughly symmetric and outliers are rare. The median is also the value that minimises the sum of absolute deviations xic\sum |x_i - c|, paralleling the mean's minimisation of squared deviations.

相关资源