Inequality involving interquartile range and standard deviation

descriptive statistics

Suppose I have a finite set of observations $x_{i}$, $i = 1, 2, \ldots, n$. Are there any inequalities relating the standard deviation and the interquartile range?

Best Answer

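The IQR and the standard deviation both scale with the data: multiplying every observation by a constant multiplies each of them by the absolute value of that constant. The natural way to compare the two is therefore through their ratio, which does not depend on the scale.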

Upper bound for SD:IQR

The Cauchy distribution with PDF

$$\frac{dx / \sigma}{\pi(1 + (x/\sigma)^2)}$$

has infinite SD and quartiles at $\pm\sigma$, so its IQR is $2\sigma$. Truncating it on the left and right yields a distribution with finite SD, and that SD can be made arbitrarily large by moving the truncation points outward, while the quartiles stay close to $\pm\sigma$. By adjusting $\sigma$ we can separately make the IQR arbitrarily small. Therefore, for any given IQR there is no upper bound on the SD and for any given SD there is no lower bound on the IQR.
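The truncation argument can be checked numerically. The sketch below assumes a symmetric truncation to $[-T, T]$ (the answer does not specify the truncation points) and uses the closed forms that follow from the PDF above: with $a = T/\sigma$, the variance is $\sigma^2 (a/\arctan a - 1)$ and the upper quartile is $\sigma\tan(\tfrac{1}{2}\arctan a)$. The function name `truncated_cauchy_sd_iqr` and the parameter `cutoff` are illustrative, not part of the original answer.

```python
import math


def truncated_cauchy_sd_iqr(sigma, cutoff):
    """SD and IQR of a Cauchy(0, sigma) truncated to [-cutoff, cutoff].

    Assumes symmetric truncation, so the mean is 0.
    """
    a = cutoff / sigma
    angle = math.atan(a)
    # Variance: sigma^2 * (a / arctan(a) - 1), from integrating x^2 * pdf.
    sd = sigma * math.sqrt(a / angle - 1.0)
    # Upper quartile solves arctan(x / sigma) = arctan(a) / 2.
    iqr = 2.0 * sigma * math.tan(angle / 2.0)
    return sd, iqr


# Widening the truncation inflates the SD while the IQR stays near 2 * sigma.
for cutoff in (10.0, 1e3, 1e5, 1e7):
    sd, iqr = truncated_cauchy_sd_iqr(1.0, cutoff)
    print(f"cutoff={cutoff:g}  SD={sd:.4g}  IQR={iqr:.6f}  SD/IQR={sd / iqr:.4g}")
```

Because both quantities are proportional to `sigma`, rescaling `sigma` changes the IQR without changing the ratio, so the ratio is controlled entirely by `cutoff / sigma`.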

Lower bound for SD:IQR

For any given IQR, we can reduce the SD in two ways: (1) by shifting the middle 50% of the values towards the mid-point of the quartiles and (2) by shifting the outer 50% of the values towards the quartiles. The lower limit of the SD for a fixed IQR is approached by the family of (discrete) distributions having $25 + \varepsilon$% probability at each of $-1$ and $1$ and $50 - 2\varepsilon$% probability at $0$ ($0 \lt \varepsilon \lt 25$). Members of this family have quartiles at $\pm 1$, whence an IQR of $2$. Their mean is $0$ and their variance is $(50 + 2\varepsilon)/100$, so their SDs are $\sqrt{(50 + 2\varepsilon)/100}$. The (lower) limiting ratio of SD to IQR therefore is $\sqrt{1/2}/2 = 1/(2\sqrt{2}) \approx 0.354$.

(Notice that every member of this family is consistent with Chebyshev's Inequality, which asserts that no more than $1/k^2$ of the probability can lie $k$ or more SDs from the mean. Here $(50 + 2\varepsilon)$% of the probability lies exactly $k = 1/\text{SD}$ SDs from the mean ($0$), and $1/k^2 = \text{SD}^2 = (50 + 2\varepsilon)/100$, so the inequality holds with equality. In the limit as $\varepsilon \to 0$, $50$% of the probability lies $\sqrt{2}$ SDs from the mean. Some care is needed for the limiting distribution with $\varepsilon = 0$, because the positions of its quartiles are ambiguous: the lower one could be anywhere between $-1$ and $0$ and the upper anywhere between $0$ and $1$. This is why the bound is stated as a limit over $\varepsilon \gt 0$, where there is no ambiguity concerning the positions of the quartiles.)

Summary

Because the empirical distribution of a sufficiently large finite sample can approach any given distribution arbitrarily closely, the conclusion, both for theoretical distributions and for empirical distributions of data, is that

$$\frac{1}{2\sqrt{2}} \le \frac{SD}{IQR} \le \infty$$

and these are the best bounds possible.
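The lower-bound family can also be checked numerically. The sketch below builds the three-point distribution with $25 + \varepsilon$% probability at each of $\pm 1$ and computes its SD and IQR directly from the definitions. Its variance is $(50 + 2\varepsilon)/100$, so the SD is the square root of that and the ratio SD/IQR tends to $1/(2\sqrt{2}) \approx 0.354$ as $\varepsilon \to 0$. Two assumptions are made here: a quartile is taken to be the smallest value whose cumulative probability reaches the target level (other conventions exist), and the function name `three_point_sd_iqr` is illustrative.

```python
import math


def three_point_sd_iqr(eps):
    """SD and IQR of the distribution with (25 + eps)% mass at -1 and at 1
    and (50 - 2 * eps)% mass at 0, where eps is in percentage points.
    """
    if not 0.0 < eps < 25.0:
        raise ValueError("eps must lie strictly between 0 and 25")
    p_tail = (25.0 + eps) / 100.0
    p_mid = (50.0 - 2.0 * eps) / 100.0
    values = (-1.0, 0.0, 1.0)
    probs = (p_tail, p_mid, p_tail)

    mean = sum(v * p for v, p in zip(values, probs))
    variance = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))

    def quantile(q):
        # Assumed convention: smallest value whose CDF is at least q.
        cumulative = 0.0
        for v, p in zip(values, probs):
            cumulative += p
            if cumulative >= q:
                return v
        return values[-1]

    return math.sqrt(variance), quantile(0.75) - quantile(0.25)


limit = 1.0 / (2.0 * math.sqrt(2.0))
for eps in (10.0, 1.0, 0.01, 1e-6):
    sd, iqr = three_point_sd_iqr(eps)
    print(f"eps={eps:g}  SD={sd:.6f}  IQR={iqr:g}  SD/IQR={sd / iqr:.6f}  limit={limit:.6f}")
```

The ratio stays strictly above the limit for every admissible `eps` and approaches it as `eps` shrinks.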