Solved – Understanding MAD Results

mad, r

I'm using R to calculate the median absolute deviation (MAD) for a few distributions, but some of the values I get do not seem realistic at all. Here is one of the samples:

x <-   [1]     NA     NA     NA -0.003 -0.009  0.004 -0.001 -0.001 -0.003  0.001 -0.002  0.000 -0.003  0.000  0.006 -0.011 -0.003
 [18]  0.002 -0.007 -0.002  0.006 -0.005  0.000  0.008  0.001  0.009 -0.002  0.001  0.001  0.002  0.003     NA     NA  0.001
 [35]     NA  0.005 -0.002  0.003  0.016  0.007 -0.003 -0.017  0.000 -0.013  0.000  0.002  0.002  0.000     NA  0.000  0.000
 [52]  0.000  0.000  0.004 -0.001  0.000 -0.002 -0.003 -0.007 -0.001 -0.001  0.000 -0.002  0.001  0.003  0.000 -0.011 -0.002
 [69] -0.003  0.004 -0.007     NA -0.009  0.005 -0.001  0.001 -0.001  0.001 -0.001  0.006  0.002 -0.006  0.002 -0.002  0.004
 [86]  0.006  0.001  0.000  0.002 -0.002  0.007  0.004  0.003  0.004  0.005 -0.005  0.003 -0.003  0.002  0.004  0.003 -0.002
[103] -0.002  0.001  0.002  0.000  0.000  0.003 -0.001  0.004  0.001  0.001  0.005 -0.001     NA -0.005  0.000 -0.002 -0.004
[120]  0.004     NA  0.007  0.000  0.002  0.003 -0.006 -0.002  0.000 -0.002 -0.001 -0.001 -0.001 -0.006 -0.001 -0.001 -0.008
[137]  0.000  0.003  0.001  0.001 -0.001  0.000  0.011 -0.017     NA     NA     NA

Then I used the following code to generate my MAD value:

MADx <- mad(x,
            center = median(x, na.rm = TRUE),
            constant = 1 / quantile(x, probs = 0.75, na.rm = TRUE, names = FALSE, type = 1),
            na.rm = TRUE, low = FALSE, high = FALSE)

I get a value of 1 when doing this, which seems unrealistic because the values I have are much less than 1.

I used the quantile function to get the 75th percentile of the data and passed its reciprocal as the constant.

Best Answer

By setting constant to the reciprocal of the 75th percentile, you have divided the MAD by what is, in some cases, just another measure of spread of the same data, so the result is a unitless ratio rather than a value on the scale of x. Observe that if the distribution is symmetric and centered at 0, the 75th percentile equals half the interquartile range (the 75th percentile minus the 25th percentile). (Actually, all that is needed is that the 25th and 75th percentiles are symmetric around 0.) If the distribution is symmetric, its MAD also equals half the interquartile range. In that situation the MAD and the 75th percentile coincide, and their ratio is 1.

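In your case, the data are rounded to three decimal places, so the sample MAD and the sample 75th percentile can easily land on exactly the same value. That is why a ratio of exactly 1 turns up in finite samples without much difficulty.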
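A small sketch of that argument, using a made-up vector rather than your data (the values are hypothetical, chosen to be symmetric about 0 and rounded to three decimals like yours). It only relies on base R's median, quantile and mad:

```r
# Hypothetical toy data (not the vector from the question):
# symmetric about 0 and rounded to three decimals.
x <- c(-0.003, -0.002, -0.001, 0, 0.001, 0.002, 0.003)

# Raw MAD: median of the absolute deviations from the median, unscaled.
raw_mad <- median(abs(x - median(x)))

# 75th percentile, with the same quantile type as in the question.
q75 <- quantile(x, probs = 0.75, names = FALSE, type = 1)

# The call from the question computes (1 / q75) * raw_mad.
scaled <- mad(x, center = median(x), constant = 1 / q75)

print(c(raw_mad = raw_mad, q75 = q75, scaled = scaled))
```

Here both the raw MAD and the 75th percentile are 0.002, so the scaled result is 1 even though every value in x is far smaller than 1.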

Use constant = 1 if you want the pure, unadulterated MAD.
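For comparison, here is a minimal sketch of the unscaled call next to R's default, again on a hypothetical toy vector (with a couple of NAs added to mimic the missing values in your data). The default constant = 1.4826 in mad rescales the MAD so that it estimates the standard deviation for normally distributed data:

```r
# Hypothetical toy data with missing values (not the vector from the question).
x <- c(NA, -0.003, -0.002, -0.001, 0, 0.001, 0.002, 0.003, NA)

# Unscaled MAD: stays on the same scale as the data.
mad_raw <- mad(x, constant = 1, na.rm = TRUE)

# Default scaling (constant = 1.4826), comparable to a standard deviation
# when the data are roughly normal.
mad_default <- mad(x, na.rm = TRUE)

print(c(mad_raw = mad_raw, mad_default = mad_default))
```

Either value is on the order of the data itself (0.002 and about 0.00297 here), which is the kind of result the question was expecting.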