Solved – Five-number summary and mean

descriptive statistics

Why was the mean not included in the five-number summary when it was first conceived? What was the motivation for choosing the sample minimum and maximum, the lower and upper quartiles, and the median?

Best Answer

The five-number summary was, I believe, introduced by John W. Tukey about 1970. The point was that once you have ordered the data (e.g. using a stem-and-leaf plot), all five summaries can be produced with nothing more than counting and averaging pairs of values. The context was pencil-and-paper methods for tens or (say) a few hundred values.
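To make "counting and averaging pairs of values" concrete, here is a minimal sketch in Python. It assumes the hinge convention usually attributed to Tukey (depth of median = (n + 1) / 2, depth of hinge = (floor(median depth) + 1) / 2); the function name `five_number_summary` and the helper `value_at_depth` are illustrative, not from any library, and other quartile conventions give slightly different answers.

```python
def five_number_summary(values):
    """Return (min, lower hinge, median, upper hinge, max).

    Assumed convention: hinges located by depth, as in pencil-and-paper
    work. Only sorting, counting, and averaging two values are used.
    """
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("need at least one value")

    def value_at_depth(depth, from_top=False):
        # Depth is a 1-based count inward from one end of the sorted data.
        # A half-integer depth such as 3.5 means "average the 3rd and 4th".
        lo = int(depth)
        hi = lo if depth == lo else lo + 1
        if from_top:
            return (data[n - lo] + data[n - hi]) / 2
        return (data[lo - 1] + data[hi - 1]) / 2

    median_depth = (n + 1) / 2
    hinge_depth = (int(median_depth) + 1) / 2

    return (
        data[0],
        value_at_depth(hinge_depth),
        value_at_depth(median_depth),
        value_at_depth(hinge_depth, from_top=True),
        data[-1],
    )


print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9]))
print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8]))
```

Note that no step needs more than locating a position by counting and, at worst, halving the sum of two neighbouring values, which is what made the summary practical by hand.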

Nowadays, as we all know, people are far more likely to have their data on a computer, and may even be unused to hand arithmetic such as adding two numbers and dividing by 2. But there is then usually no difficulty in calculating a mean. Whether a mean is a useful summary is open to discussion, but we can always have a look.
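As a rough illustration of "having a look", the sketch below reports the mean next to the five order-based summaries using Python's standard `statistics` module. The data are made up for the example, the function name `summary_with_mean` is hypothetical, and the quartiles use the module's `"inclusive"` method, which is an assumption and need not match hand-calculated hinges.

```python
import statistics


def summary_with_mean(values):
    """Five order-based summaries plus the mean, as a dict.

    Quartiles come from statistics.quantiles(..., method="inclusive"),
    one of several conventions (an assumption made for this sketch).
    """
    data = sorted(values)
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "min": data[0],
        "lower_quartile": q1,
        "median": q2,
        "upper_quartile": q3,
        "max": data[-1],
        "mean": statistics.mean(data),
    }


# Made-up, right-skewed sample: one large value pulls the mean upward.
sample = [1, 2, 2, 3, 4, 5, 48]
for name, value in summary_with_mean(sample).items():
    print(f"{name:>15}: {value}")
```

In a skewed sample like this one the mean lands well above the median and even above the upper quartile, which is exactly the kind of thing worth seeing before deciding whether the mean is a helpful summary.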

The five-number summary idea lives on in the form of box plots. Arguably, box plots have even been oversold, as when box plots without means or SDs are presented as cognate to analysis of variance. More on that in this thread