Can the mean lie outside the interquartile range (Q1 to Q3)? And if so, what does it do to the distribution?

interquartile, mean

Can the mean lie outside the interquartile range? I realize that extreme outliers can pull the mean, but can they pull it outside the interval from the first quartile to the third quartile?

Best Answer

If "mean" refers to a statistic for a batch of data, then consider the dataset $(1,2,3,4,10^6)$ whose quartiles must lie between $1$ and $4$ (depending on how you compute them) but whose mean is $200,002$.
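As a quick check of this example, here is a minimal Python sketch. It assumes the "inclusive" interpolation method of the standard library's `statistics.quantiles`; that choice of convention is mine, not the answer's, and other conventions can give different quartiles for a sample this small.

```python
from statistics import quantiles

# The five-point dataset from the answer.
data = [1, 2, 3, 4, 10**6]

mean = sum(data) / len(data)

# Assumption: "inclusive" interpolation, which treats the data as
# spanning the whole range from minimum to maximum.
q1, median, q3 = quantiles(data, n=4, method="inclusive")

mean_outside_iqr = not (q1 <= mean <= q3)
print(f"mean = {mean}, Q1 = {q1}, Q3 = {q3}, outside IQR: {mean_outside_iqr}")
```

Under this convention the quartiles come out as 2 and 4, so the mean of 200,002 lies far above the third quartile.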

If instead it refers to a property of a distribution, then assign a probability of $1/5$ to each of the five numbers in the previous batch to create a (discrete) distribution. The same calculations apply, leading to the same conclusions.
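The distribution version can be sketched the same way. The snippet below assumes the usual generalized-inverse definition of a quantile, the smallest $x$ with $P(X \le x) \ge p$; the helper name `quantile` is hypothetical and is only there for illustration.

```python
from fractions import Fraction

# Discrete distribution: probability 1/5 on each of the five values.
values = [1, 2, 3, 4, 10**6]
probs = [Fraction(1, 5)] * len(values)

def quantile(p):
    """Smallest x with P(X <= x) >= p (generalized inverse CDF)."""
    cumulative = Fraction(0)
    for x, w in zip(values, probs):
        cumulative += w
        if cumulative >= p:
            return x
    return values[-1]

# Exact expectation, using rational arithmetic to avoid rounding.
expected_value = sum(x * w for x, w in zip(values, probs))
q1 = quantile(Fraction(1, 4))
q3 = quantile(Fraction(3, 4))

print(f"E[X] = {expected_value}, Q1 = {q1}, Q3 = {q3}")
```

With this definition the quartiles are 2 and 4, while the expectation is 200,002, so the conclusion matches the batch-of-data case.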


The point is that quartiles are resistant to changes in the data, whereas the mean is sensitive to a change in even a single data value. When we add $\epsilon$ to any single value in a dataset of $n\gt 4$ numbers, the mean changes by $\epsilon/n$, which can be made arbitrarily large by choosing $\epsilon$ large enough. The quartiles, if they change at all, shift at most to neighboring values in the original dataset, so the amount by which they can change is bounded. The preceding example exploited this in an extreme way.
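The contrast between the two kinds of summary can be illustrated with a small perturbation experiment. This is only a sketch: the nine-point dataset, the value of `eps`, and the helper `summarize` are made up for illustration, and the quartiles again assume the "inclusive" method of `statistics.quantiles`.

```python
from statistics import quantiles

def summarize(data):
    """Return (mean, Q1, Q3) using the 'inclusive' quartile convention."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    return sum(data) / len(data), q1, q3

base = [1, 2, 3, 4, 5, 6, 7, 8, 9]
eps = 9000  # hypothetical perturbation added to a single value

# Add eps to the largest value only; every other value is unchanged.
shifted = base[:-1] + [base[-1] + eps]

base_mean, base_q1, base_q3 = summarize(base)
new_mean, new_q1, new_q3 = summarize(shifted)

mean_shift = new_mean - base_mean  # should equal eps / n
print(f"mean shift = {mean_shift}, eps / n = {eps / len(base)}")
print(f"quartiles before: ({base_q1}, {base_q3}), after: ({new_q1}, {new_q3})")
```

Here the mean moves by exactly `eps / n` and ends up outside the interquartile range, while both quartiles stay where they were.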

Influence functions study how such changes in data values create changes in statistical summaries of those values.
