Statistical Measures – What Does the IQR/Median Mean?

interquartile, median

What does IQR/Med tell you about the data? What is the purpose of dividing the IQR by the Median?

Examples:

IQR = 3.0  
Med = 8.1  
IQR/Med = .37

IQR = 2.1  
Med = 7.1  
IQR/Med = .29

I got the numbers from this site. I understand IQR and Median, and I was told that IQR/Med is an indicator of data quality (<0.3 means good data). But I don't understand why or how. I need to explain IQR/Med to a general audience, but I need to understand its utility first.

Best Answer

I can't reach the site you linked, and without more context it's hard to know exactly what this number was meant to convey. I can make an educated guess, though.

For many kinds of data, the amount of variability depends on the level of the center. Starting from stats 101, we often think in terms of the normal distribution and summarize data by its mean and standard deviation; in such data the standard deviation frequently grows as the mean grows. In these cases it is often true that, although there is heteroscedasticity, the coefficient of variation ($SD/\bar X$) is roughly constant. However, the mean and standard deviation are closely tied to the normal distribution and are sensitive to outliers, so people may prefer to work with other summaries. The interquartile range can be considered a more robust measure of spread, and the median can likewise be considered a more robust measure of central tendency. Thus, IQR/median can fill a role similar to the coefficient of variation while remaining more resistant to outliers. (For what it's worth, for this purpose I would personally prefer the median absolute deviation from the median, divided by the median: MADM/median.)
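To make the robustness point concrete, here is a minimal sketch in Python that computes the three ratios on a small made-up sample, with and without one extreme value. The data, the function names, and the choice of the "inclusive" quartile method are all assumptions for illustration; different quartile conventions give slightly different IQRs.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (SD / mean)."""
    return statistics.stdev(values) / statistics.mean(values)

def iqr_over_median(values):
    """Interquartile range divided by the median."""
    # quantiles(n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (q3 - q1) / med

def madm_over_median(values):
    """Median absolute deviation from the median, divided by the median."""
    med = statistics.median(values)
    madm = statistics.median([abs(v - med) for v in values])
    return madm / med

# Hypothetical data: nine evenly spaced values, then the same sample
# with the largest value replaced by an extreme outlier.
clean = [6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0]
with_outlier = clean[:-1] + [80.0]

for name, data in [("clean", clean), ("with outlier", with_outlier)]:
    print(
        f"{name:>12}: "
        f"SD/mean = {coefficient_of_variation(data):.3f}, "
        f"IQR/median = {iqr_over_median(data):.3f}, "
        f"MADM/median = {madm_over_median(data):.3f}"
    )
```

In this sample the single outlier inflates SD/mean many times over, while IQR/median and MADM/median do not move at all, because the quartiles and the median ignore how far out the extreme value lies.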

I have never heard of the rule of thumb that a coefficient of variation (or, in this case, IQR/median) below 0.30 means "good data". In line with the interpretation above, I would guess the idea is that the variability is small relative to the location of the middle of the distribution, so the data will tend to be more stable. That is, the values do not swing widely relative to the median.
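As a rough sketch, the rule of thumb from the question can be applied to its two examples like this. The 0.3 cutoff is only the value quoted in the question, not an established standard, and the function and variable names are hypothetical. The ratio is only meaningful for positive data measured on a scale with a true zero, hence the guard on the median.

```python
# Cutoff quoted in the question; treated here as an assumption, not a standard.
THRESHOLD = 0.3

def relative_spread(iqr, median):
    """Return IQR / median, refusing medians where the ratio is not meaningful."""
    if median <= 0:
        raise ValueError("IQR/median is only meaningful for a positive median")
    return iqr / median

# (IQR, median) pairs taken from the question.
examples = [(3.0, 8.1), (2.1, 7.1)]

for iqr, med in examples:
    ratio = relative_spread(iqr, med)
    verdict = "below" if ratio < THRESHOLD else "at or above"
    print(f"IQR={iqr}, Med={med}: IQR/Med={ratio:.3f} ({verdict} {THRESHOLD})")
```

The first example lands above the cutoff and the second just under it, which shows how sensitive a hard threshold is when the ratio sits near 0.3.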