Median vs Mean – When is the Median More Affected by Sampling Error?

distributions, mean, median, robust, sampling

I'm writing a paper on making probability estimates, and it's been asserted to me that I should take the median of the estimates given by my participants, rather than the mean. I've been told I should do this because the mean is more affected by sampling error than the median.

Why is this? Is this something that is always true, or which only holds under certain circumstances?

Best Answer

Imagine that a variable takes the values 0 and 1, each with probability 0.5. Sample from that distribution and most of the sample medians will be 0 or 1, with a very few exactly 0.5. The sample means will vary far less. The mean is much more stable in this circumstance.

Here is a sample graph of results. The plots are quantile plots, i.e. ordered values versus plotting position, a modified cumulative probability. The results are for 10,000 bootstrap samples from 1000 values, 500 each of 0 and 1. The means range fortuitously but nicely from 0.436 to 0.564 with standard error 0.016. The medians behave as just described, with standard error 0.493. (Closed-form results are no doubt possible here too, but a graph makes the point vivid for all.)

[Figure: quantile plots of the bootstrap means and medians]
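For anyone who wants to reproduce the flavour of that experiment, here is a minimal sketch in Python with NumPy. It is not the code behind the graph. It relies on a shortcut that is my assumption rather than anything stated above: resampling with replacement from 500 zeros and 500 ones is equivalent to drawing the number of ones in each resample from a Binomial(1000, 0.5) distribution. The function name `bootstrap_binary` and the seed are arbitrary.

```python
import numpy as np

def bootstrap_binary(n=1000, n_boot=10_000, seed=0):
    """Bootstrap means and medians for n values, half 0 and half 1."""
    rng = np.random.default_rng(seed)  # arbitrary seed, for reproducibility only
    # Number of ones in each resample of size n
    ones = rng.binomial(n, 0.5, size=n_boot)
    means = ones / n
    # Median is 0 if zeros are in the majority, 1 if ones are,
    # and 0.5 when the resample splits exactly in half
    medians = np.where(2 * ones < n, 0.0, np.where(2 * ones > n, 1.0, 0.5))
    return means, medians

means, medians = bootstrap_binary()
print("SE of mean:  ", means.std(ddof=1))
print("SE of median:", medians.std(ddof=1))
print("fraction of medians equal to 0.5:", np.mean(medians == 0.5))
```

Exact values depend on the seed, but the standard error of the median should come out far larger than that of the mean.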

But that is exceptional. It illustrates the least favourable case for medians: a symmetric bimodal distribution in which the median is likely to flip between the two halves of the data. Symmetric bimodal distributions are not especially common, but watch out for so-called U-shaped distributions, in which the extremes are most common and intermediate values uncommon. Distributions that are unimodal, or in which the number of modes has only a small effect on the median or mean, are more common.

As treatments of robust statistics routinely point out, a very common situation is that your data come with tails heavier than Gaussian and/or with outliers, and in those circumstances the median will almost always be more robust. The point is that this is not a universal result.
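A quick simulation can illustrate the contrast between light and heavy tails. This is only a sketch under my own assumptions: I use the standard Cauchy distribution as a deliberately extreme heavy-tailed case, and the sample size, number of replications, and the helper name `sampling_sd` are arbitrary choices.

```python
import numpy as np

def sampling_sd(draw, n=100, reps=5000):
    """SD across repeated samples of the sample mean and the sample median."""
    samples = draw((reps, n))  # one row per simulated sample of size n
    sd_mean = samples.mean(axis=1).std(ddof=1)
    sd_median = np.median(samples, axis=1).std(ddof=1)
    return sd_mean, sd_median

rng = np.random.default_rng(0)  # arbitrary seed
normal_sd = sampling_sd(rng.standard_normal)  # Gaussian tails
cauchy_sd = sampling_sd(rng.standard_cauchy)  # very heavy tails

print("normal (mean, median):", normal_sd)
print("cauchy (mean, median):", cauchy_sd)
```

With Gaussian data the mean should vary somewhat less than the median; with Cauchy data the median should be far more stable, because the sample mean of Cauchy draws does not settle down as the sample grows.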

All that said, what relevance does a general result have? At a minimum, you can use bootstrapping to establish the relative variability of the mean and median for your own data. That's what you care about.
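One way to do that check, sketched in Python with NumPy. The `estimates` values are made-up numbers standing in for participants' probability estimates, and `bootstrap_se`, the number of resamples, and the seed are all hypothetical choices rather than anything prescribed.

```python
import numpy as np

def bootstrap_se(values, n_boot=2000, seed=0):
    """Bootstrap standard errors of the mean and the median of `values`."""
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)  # arbitrary seed
    means = np.empty(n_boot)
    medians = np.empty(n_boot)
    for b in range(n_boot):
        # Resample with replacement, same size as the original data
        resample = rng.choice(values, size=values.size, replace=True)
        means[b] = resample.mean()
        medians[b] = np.median(resample)
    return means.std(ddof=1), medians.std(ddof=1)

# Hypothetical data: replace with your participants' estimates
estimates = [0.05, 0.10, 0.10, 0.15, 0.20, 0.20, 0.25, 0.30, 0.90, 0.95]
se_mean, se_median = bootstrap_se(estimates)
print(f"bootstrap SE of mean:   {se_mean:.3f}")
print(f"bootstrap SE of median: {se_median:.3f}")
```

Whichever summary shows the smaller bootstrap standard error is the more stable one for data like yours. With very small samples the bootstrap itself is rough, so treat the comparison as indicative.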