[Math] Mean vs. Median: When to Use

Tags: means, median, soft-question, statistics, terminology

I know the difference between the mean and the median.

  • The mean of a set of numbers is the sum of all the numbers divided by the cardinality.
  • The median of a set of numbers is the middle number, when the set is organized in ascending or descending order (and, when the set has an even cardinality, the mean of the middle two numbers).
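
As a quick illustration of the two definitions, here is a minimal sketch using Python's standard `statistics` module (the numbers are made up):

```python
import statistics

values = [7, 1, 3, 9, 4]

# Mean: sum of the numbers divided by the cardinality -> 24 / 5 = 4.8
print(statistics.mean(values))          # 4.8

# Median: middle number of the sorted list [1, 3, 4, 7, 9] -> 4
print(statistics.median(values))        # 4

# Even cardinality: the mean of the middle two numbers -> (3 + 4) / 2 = 3.5
print(statistics.median([1, 3, 4, 7]))  # 3.5
```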

It seems to me that they're often used interchangeably, both to give a sense of what's going on in the same data.

Do they mean (pun intended) different things? When should one be used over the other?

Best Answer

Almost all analytic calculations on sets of data are more natural in terms of the mean than the median. For example, the $z$-test for the significance of a discrepancy relative to the null hypothesis works with the sample mean and the unbiased sample standard deviation.
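
For concreteness, the test statistic in that setting (written here in standard notation, under the usual one-sample setup with hypothesized mean $\mu_0$, sample mean $\bar{x}$, unbiased sample standard deviation $s$, and sample size $n$) is

$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}, \qquad \text{where } s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2.$$

Both ingredients are built from the mean; the median appears nowhere in the calculation.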

The median, and particularly the difference between the median and the mean, is useful for characterizing how "skewed" the data are (although the skewness, which depends on the third moment about the mean, is also useful for that).
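
For reference, these are the two standard quantities being contrasted (textbook definitions, not specific to this answer): the moment-based skewness and the so-called nonparametric skew,

$$\gamma_1 = \frac{\operatorname{E}\!\left[(X-\mu)^3\right]}{\sigma^3}, \qquad S = \frac{\mu - \operatorname{median}(X)}{\sigma}.$$

Both vanish for a symmetric distribution, and positive values typically indicate a longer right tail that pulls the mean above the median.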

The real use of the median comes when the data set may contain extreme outliers (perhaps due to errors in early processing of the sample numbers, or a serious bias in the sample gathering procedure). Then describing the distribution in terms of quartiles (with the median dividing the second from the third quartile) can be more informative than quoting $\mu$ and $\sigma$.
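
As a small sketch of that robustness (made-up numbers, using Python's standard `statistics` module):

```python
import statistics

# Hypothetical sample with one extreme outlier (e.g., a data-entry error).
data = [2, 3, 3, 4, 5, 5, 6, 7, 8, 120]

print(statistics.mean(data))    # 16.3 -- dragged far upward by the outlier
print(statistics.median(data))  # 5.0  -- essentially unaffected

# Quartile cut points Q1, Q2, Q3; the median is the middle cut point Q2.
q1, q2, q3 = statistics.quantiles(data, n=4)
print(q1, q2, q3)
```

Reporting the quartiles (and perhaps the interquartile range) conveys where the bulk of the data sits without letting a single bad value dominate the summary.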
