Solved – MAD in relation to 95% confidence

MAD (Median Absolute Deviation) is:

$\text{MAD} = M_i(|x_i-M_j(x_j)|)$

where $M()$ is the median operator ($M_i(x_i) = \text{median}(x_1,\ldots,x_n)$).

I'd like to scale the MAD so that an interval around the median includes (say) 95% of a distribution, the way that 95% of a normal distribution lies within $1.96\sigma$ of the mean.

That is, if $m = M_i(x_i)$ and $d = \text{MAD}_i(x_i)$, make an interval like $m \pm b\cdot d$ (where $b$ depends on the distribution you are dealing with) that includes 95% of the distribution.

Can this be done?

Best Answer

I know the original post is over a year old, but I would like some more information on this topic. I currently run a proficiency program for manure testing and soil testing laboratories. A colleague, who knows much more about statistics than I do, suggested the following to get a 95% confidence interval using the MAD and median.

  1. Calculate the median and MAD values.
  2. Remove, as outliers, any results more than 4.0 MAD units above or below the median.
  3. Recalculate the median and MAD values on the reduced data set.
  4. Results more than 2.9 MAD units above or below the second median are outside the 95% confidence interval.
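
The four steps above can be sketched in code. This is only an illustration in Python, not the R script I actually run: `raw_mad` and `two_step_flags` are hypothetical names, the data are made up, and I am assuming that step 4 is applied to all of the original results (including the ones dropped in step 2).

```python
from statistics import median


def raw_mad(values):
    """Unscaled median absolute deviation, as in R's mad(x, constant = 1)."""
    m = median(values)
    return median([abs(v - m) for v in values])


def two_step_flags(values, outlier_k=4.0, limit_k=2.9):
    """Hypothetical helper: returns (second median, second MAD, flagged results)."""
    # Step 1: median and MAD of the full data set.
    m1, d1 = median(values), raw_mad(values)
    # Step 2: keep only results within outlier_k MAD units of the median.
    kept = [v for v in values if abs(v - m1) <= outlier_k * d1]
    # Step 3: recompute both statistics on the reduced data set.
    m2, d2 = median(kept), raw_mad(kept)
    # Step 4 (assumption): flag every original result that lies more than
    # limit_k MAD units from the second median.
    flagged = [v for v in values if abs(v - m2) > limit_k * d2]
    return m2, d2, flagged


# Made-up results for illustration only, not real proficiency data.
results = [10, 11, 12, 13, 14, 15, 16, 17, 18, 22, 60]
m2, d2, flagged = two_step_flags(results)
print(m2, d2, flagged)
```

On this made-up sample, 60 is removed in step 2, and both 22 and 60 end up outside the final limits.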

There is one other kicker. I use the statistical program R. When calculating MAD I use the following:

mad(x, constant = 1)

The default in R is: constant = 1.4826.
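
For what it's worth, the default constant and the 2.9 multiplier seem to be connected, at least if the data are assumed to be normally distributed. Writing $\Phi^{-1}$ for the standard normal quantile function (my notation, not from the question), the unscaled MAD of a normal distribution is

$\text{MAD} = \Phi^{-1}(0.75)\,\sigma \approx 0.6745\,\sigma \quad\Rightarrow\quad \sigma \approx 1.4826\,\text{MAD}$

so the multiplier $b$ that gives 95% coverage with the unscaled MAD is

$b = \dfrac{\Phi^{-1}(0.975)}{\Phi^{-1}(0.75)} \approx \dfrac{1.960}{0.6745} \approx 2.906$

That is, 2.9 units of the unscaled MAD correspond to about 1.96 units of R's default scaled MAD. For a non-normal distribution the value of $b$ would be different.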

Typically, we have 140 to 200 data points for each analysis. The results are often right-skewed, occasionally left-skewed, and rarely normally distributed. After removing the 4.0 MAD outliers, the histogram looks much closer to normal. I suspect that at that point we might be able to use the mean and SD to calculate the confidence interval.

For a number of years, we just ran the data one time. Labs were flagged for accuracy if their results deviated by more than 2.5 MAD units from the median. I have compared both methods, and usually 2.5 MAD units from the median (just one calculation) is quite close to the two-step method using 2.9 MAD units from the median after removing the 4.0 outliers.
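
One rough way to check how close a given multiplier comes to 95% on an actual data set is to count the share of results inside median ± k·MAD, or to read off the multiplier that would be needed. The Python sketch below does both on a single pass with the unscaled MAD; `fraction_within` and `empirical_multiplier` are hypothetical names, and the sample is made up.

```python
import math
from statistics import median


def raw_mad(values):
    """Unscaled median absolute deviation, as in R's mad(x, constant = 1)."""
    m = median(values)
    return median([abs(v - m) for v in values])


def fraction_within(values, k):
    """Share of results inside median +/- k * MAD (no outlier removal)."""
    m, d = median(values), raw_mad(values)
    return sum(abs(v - m) <= k * d for v in values) / len(values)


def empirical_multiplier(values, coverage=0.95):
    """Smallest b for which median +/- b * MAD holds at least `coverage` of the results."""
    m, d = median(values), raw_mad(values)
    if d == 0:
        raise ValueError("MAD is zero; the multiplier is undefined")
    ratios = sorted(abs(v - m) / d for v in values)
    # Index of the first sorted ratio at which the required share is reached.
    index = math.ceil(coverage * len(ratios)) - 1
    return ratios[index]


# Made-up, evenly spaced sample for illustration only.
sample = list(range(1, 22))
print(fraction_within(sample, 2.5))
print(empirical_multiplier(sample, 0.95))
```

Comparing `fraction_within(x, 2.5)` with the coverage of the two-step limits on the same data would show how far apart the two rules really are for a particular analysis.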

I hope this method gives us a 95% confidence interval. But, if anyone has a better suggestion, I'd like to hear it.