I recently came across a post on the Median Absolute Deviation (MAD). According to the Wikipedia article, when the MAD is used as an estimator, the standard deviation of the distribution is $k$ times the MAD, where the value of $k$ depends on the distribution. For a normal distribution, $k$ is approximately 1.4826. My question: suppose the MAD and the standard deviation are both calculated for a data set, and the standard deviation is divided by the MAD. If the result is not approximately 1.4826 (say it comes out as 6.0), can it be stated that the data are unlikely to be normally distributed?
Solved – Can Median Absolute Deviation (MAD)/SD be used to determine if a distribution is normal or not
distributions, mad, standard deviation
Related Solutions
It sounds like you're talking about what's sometimes called a regressogram, with a log-scaled x-variable.
There are a number of issues here, not necessarily in logical order:
the quantity you're plotting is a mean, so if you want to plot median absolute deviation, it's the MAD of the means you want.
your suggestion $\text{MAD}/\sqrt n$ leads to the question "when is the MAD of the mean equal to the MAD of the data divided by $\sqrt n$?"
when you say "it seems that median absolute deviation is a better estimator than mean absolute deviation", that depends on what we're talking about: a better estimator of what, and under what circumstances?
So, "when is the MAD of the mean equal to the MAD of the data divided by $\sqrt n$?"
The answer is that, unlike the situation with the standard deviation, this is not generally the case. The reason standard deviations of averages scale as they do is that variances of independent random variables add (more precisely, the variance of the sum is the sum of the variances when the variables are independent), irrespective of the distributions of the components (as long as the variances all exist). It is this particular property that largely accounts for the popularity of variances and standard deviations.
Neither the median deviation nor the mean deviation has that property in general.
However, when the data are normal, they in effect inherit that property: at the normal, the ratio of the population mean deviation (or median deviation) to the standard deviation is a constant, normals are closed under convolution, and standard deviations scale that way.
So if the data were reasonably close to normal, dividing the MAD by $\sqrt n$ could perhaps be adequate.
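The scaling claim can be checked by simulation. The following is a minimal sketch using only the Python standard library; the function names, the sample size of 25, and the choice of an exponential distribution as the non-normal case are assumptions made for illustration, not part of the original answer. It compares the MAD of many sample means with the MAD of the pooled data divided by $\sqrt n$.

```python
import math
import random
import statistics


def mad(xs):
    """Median absolute deviation from the median (unscaled)."""
    med = statistics.median(xs)
    return statistics.median([abs(x - med) for x in xs])


def mad_scaling_ratio(draw, n=25, reps=2000, seed=1):
    """Return (MAD of the sample means) / (MAD of the data / sqrt(n)).

    `draw` takes a random.Random and returns one observation.
    A value near 1 means the sqrt(n) rule holds for that distribution.
    """
    rng = random.Random(seed)
    samples = [[draw(rng) for _ in range(n)] for _ in range(reps)]
    means = [statistics.fmean(s) for s in samples]
    pooled = [x for s in samples for x in s]  # stands in for the population
    return mad(means) / (mad(pooled) / math.sqrt(n))


def draw_normal(rng):
    return rng.gauss(0.0, 1.0)


def draw_exponential(rng):
    return rng.expovariate(1.0)


if __name__ == "__main__":
    print("normal:     ", round(mad_scaling_ratio(draw_normal), 3))
    print("exponential:", round(mad_scaling_ratio(draw_exponential), 3))
```

For normal draws the ratio should sit close to 1, while for a skewed distribution such as the exponential it should not, which is the sense in which the $\sqrt n$ rule is inherited only at (or near) the normal.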
What else might be done? One way to estimate the standard error of a statistic is via the bootstrap; for the mean deviation - being a mean - this should do well in large samples. Unfortunately, medians don't do so well under the bootstrap, and this issue will carry over to median absolute deviations.
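As a rough illustration of the bootstrap idea for the mean deviation, here is a minimal sketch. It assumes the mean deviation is taken about the sample mean and uses a plain nonparametric bootstrap (resampling with replacement); the helper names and the number of replicates are hypothetical choices.

```python
import random
import statistics


def mean_abs_dev(xs):
    """Mean absolute deviation about the sample mean."""
    centre = statistics.fmean(xs)
    return statistics.fmean([abs(x - centre) for x in xs])


def bootstrap_se(xs, stat, reps=1000, seed=0):
    """Nonparametric bootstrap standard error of stat(xs)."""
    rng = random.Random(seed)
    n = len(xs)
    # Each replicate recomputes the statistic on a resample drawn with replacement.
    replicates = [stat(rng.choices(xs, k=n)) for _ in range(reps)]
    return statistics.stdev(replicates)


if __name__ == "__main__":
    rng = random.Random(42)
    data = [rng.gauss(0.0, 1.0) for _ in range(200)]
    print("mean deviation:", mean_abs_dev(data))
    print("bootstrap SE:  ", bootstrap_se(data, mean_abs_dev))
```

The same `bootstrap_se` function accepts any statistic, but as noted above, passing a median-based statistic is where the bootstrap tends to behave less well.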
If you have some probability model for your data, there's also simulation as a way of approaching the problem.
To address the question in comments:
I would like to know if there is a possible range of values of the constant
(I assume the question is intended to be about the median deviation from median.)
The ratio of SD to MAD can be made arbitrarily large.
Take some distribution with a given ratio of SD to MAD. Hold the middle $50\%+\epsilon$ of the distribution fixed (which means MAD is unchanged). Move the tails out further. SD increases. Keep moving it beyond any given finite bound.
The ratio of SD to MAD can easily be made as near to $\sqrt{\frac{1}{2}}$ as desired by (for example) putting $25\%+\epsilon$ at $\pm 1$ and $50\%-2\epsilon$ at 0.
I think that would be as small as it goes.
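Both constructions can be checked numerically. The sketch below is illustrative only: the seven-point sample and the function names are made up for the example, and the three-point case is evaluated from the stated probabilities rather than from data.

```python
import math
import statistics


def sd_over_mad(xs):
    """SD (divisor n) divided by the median absolute deviation from the median."""
    med = statistics.median(xs)
    mad = statistics.median([abs(x - med) for x in xs])
    return statistics.pstdev(xs) / mad


def stretched_sample(t):
    """Seven points whose middle is fixed; only the two extremes move with t (t >= 2)."""
    return [-t, -2, -1, 0, 1, 2, t]


def three_point_ratio(eps):
    """SD/MAD for mass 0.25+eps at each of -1 and +1 and 0.5-2*eps at 0 (0 < eps <= 0.25)."""
    p_tail = 0.25 + eps
    # Median is 0 by symmetry; |X| = 1 with probability 2*p_tail > 0.5, so MAD = 1.
    mad = 1.0
    sd = math.sqrt(2 * p_tail)  # mean is 0, so variance = E[X^2] = 2*p_tail
    return sd / mad


if __name__ == "__main__":
    for t in (2, 10, 1000):
        print("t =", t, "SD/MAD =", sd_over_mad(stretched_sample(t)))
    for eps in (0.1, 0.01, 0.0001):
        print("eps =", eps, "SD/MAD =", three_point_ratio(eps))
```

In `stretched_sample` the MAD stays at 2 for every `t >= 2` while the SD grows without bound, and `three_point_ratio` approaches $\sqrt{1/2}$ from above as `eps` shrinks.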
Best Answer
1. Considered as a formal test of normality: if $M$ is the (sample) median absolute deviation from the median and $s$ is the standard deviation, then you could indeed use a measure like $R = M/s$ (or its reciprocal) as a test statistic for a test of normality.
Note however, that such tests cannot tell you something is normal, only - sometimes - that it isn't.
To make it a test, all you'd need is the distribution of the ratio under the null (i.e. at normality), or a table of its percentiles, for various sample sizes. This can be obtained by simulation, for example, though it might also be possible to obtain it analytically.
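A minimal sketch of the simulation approach follows, using only the Python standard library. The function names, the two-sided rejection rule, and the default of 2000 replicates are assumptions for illustration. Because $R$ is unchanged by shifting or rescaling the data, simulating from a standard normal is enough.

```python
import random
import statistics


def mad_over_sd(xs):
    """R = M / s: median absolute deviation from the median over the sample SD."""
    med = statistics.median(xs)
    m = statistics.median([abs(x - med) for x in xs])
    return m / statistics.stdev(xs)


def null_critical_values(n, alpha=0.05, reps=2000, seed=0):
    """Simulated lower and upper alpha/2 critical values of R for normal samples of size n."""
    rng = random.Random(seed)
    sims = sorted(
        mad_over_sd([rng.gauss(0.0, 1.0) for _ in range(n)]) for _ in range(reps)
    )
    lo = sims[int((alpha / 2) * reps)]
    hi = sims[min(reps - 1, int((1 - alpha / 2) * reps))]
    return lo, hi


def rejects_normality(xs, alpha=0.05, reps=2000, seed=0):
    """Two-sided Monte Carlo test: reject when R falls outside the simulated null range."""
    lo, hi = null_critical_values(len(xs), alpha, reps, seed)
    r = mad_over_sd(xs)
    return r < lo or r > hi


if __name__ == "__main__":
    for n in (20, 50, 200):
        print(n, null_critical_values(n))
```

More replicates give smoother critical values; a one-sided rule (rejecting only for small $R$) might be preferred when only heavy tails are of concern.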
It's actually close kin to an old test statistic proposed by Geary [1], the ratio of mean deviation to standard deviation, sometimes referred to as Geary's $a$ test. (Because he proposed a number of test statistics it's necessary to distinguish them; he used $a$, and later $a_1$, to denote this ratio of mean deviation to standard deviation.)
Geary's $a$ test has quite good power compared to the Shapiro-Wilk test in small to moderate sized samples for a wide range of symmetric alternatives, beating it in a number of situations. To my recollection it has quite good power against heavier tailed cases like the logistic and Laplace. Your proposal should have somewhat similar properties.
Indeed I think that the likelihood ratio test for normality against a Laplace alternative would correspond to looking at the ratio of mean deviation from the median to standard deviation (which would be a third statistic a bit more like Geary's than yours).
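The three ratios mentioned so far differ only in which deviation goes in the numerator. The sketch below computes each one; it assumes a standard deviation with divisor $n$ throughout, and the function names and the Laplace generator are illustrative choices. At the population level, the mean-deviation ratio is $\sqrt{2/\pi} \approx 0.798$ for a normal and $1/\sqrt 2 \approx 0.707$ for a Laplace, while the MAD ratio is about $0.6745$ for a normal and $\ln 2/\sqrt 2 \approx 0.490$ for a Laplace.

```python
import random
import statistics


def _sd_n(xs):
    """Standard deviation with divisor n (an assumption made for all three ratios)."""
    return statistics.pstdev(xs)


def geary_a(xs):
    """Mean absolute deviation from the mean, divided by the SD."""
    mean = statistics.fmean(xs)
    return statistics.fmean([abs(x - mean) for x in xs]) / _sd_n(xs)


def mean_dev_from_median_ratio(xs):
    """Mean absolute deviation from the median, divided by the SD."""
    med = statistics.median(xs)
    return statistics.fmean([abs(x - med) for x in xs]) / _sd_n(xs)


def mad_ratio(xs):
    """Median absolute deviation from the median, divided by the SD."""
    med = statistics.median(xs)
    return statistics.median([abs(x - med) for x in xs]) / _sd_n(xs)


def sample(kind, n, seed=0):
    rng = random.Random(seed)
    if kind == "normal":
        return [rng.gauss(0.0, 1.0) for _ in range(n)]
    if kind == "laplace":
        # The difference of two independent Exp(1) draws is standard Laplace.
        return [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n)]
    raise ValueError(kind)


if __name__ == "__main__":
    for kind in ("normal", "laplace"):
        xs = sample(kind, 20000)
        print(kind, geary_a(xs), mean_dev_from_median_ratio(xs), mad_ratio(xs))
```

All three ratios drop when moving from the normal to the Laplace; how well each separates the two in small samples is a question of power that this sketch does not address.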
[My guess is that Geary's $a$ test statistic would have better power against something like a logistic alternative than yours, but yours might be more competitive with even peakier-and-heavier-tailed alternatives than the Laplace -- an example of an alternative that I'd expect it to do especially well against would be the location-scale family based off the distribution of the product of two independent standard normals. It might also do fairly well against something like a t-distribution with low d.f. It would be interesting to see if such guesses hold up, and whether it does well in other situations.]
Against general alternatives, the power may sometimes be poor, however - for example we should anticipate relatively low power against lightish-tailed, skew alternatives (at least ones that have similar population ratios of median absolute deviation to standard deviation), compared to widely used omnibus tests. However, many skew alternatives of interest are also heavy-tailed, so it may still do fairly well against some of those.
It wouldn't be suitable in every situation but might work very well if you anticipate the kind of alternatives against which it should have reasonably good power.
There are a number of papers that have investigated Geary's test but off the top of my head I don't recall any for your proposed statistic. I'd bet that it has been looked at but I didn't find any papers on it with a quick search.
The closest I came was Gel et al. [2], who discuss a test based on the ratio of standard deviation to mean deviation from the median (which they call MAAD); this would be a version of the test I suggested for a Laplace alternative above. They say that compared to the test based on the MAAD, the MAD has lower power against heavy tailed alternatives (which they say is due to the higher variance of MAD at the normal) but they don't give further details (however, they do say that MAD is better for diagnostic displays, which relates to my point 2. below). Aside from that brief passing mention I haven't found anything else on power comparisons.
One big advantage of these kinds of tests is their simplicity; they don't require specialized routines to compute the statistic and are amenable to hand computation in small samples, even for beginning students. In the case of Geary's test there's a normal approximation (D'Agostino, 1970 [3]) for $n>40$; there's likely to be a suitable normal approximation in medium-to-large samples here as well. That they can also have good power in situations we might actually care to identify may make them worth considering. It could certainly be worth a bit of time examining the power properties more closely, and some searching for any previous investigations of the test.
2. As a diagnostic tool: rather than a formal test (which may answer a question we already know the answer to instead of one we'd be better to answer), we could use the ratio as a diagnostic, a measure of how far from normality we might be (in effect a kind of raw "effect size" for a particular kind of non-normality).
For example, if we're particularly concerned about how heavy-tailed our distribution might be this sort of ratio might be worth considering as a diagnostic measure for that situation, rather than computing something like kurtosis, say.
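To show what such a diagnostic might look like next to kurtosis, here is a minimal sketch. The contaminated-normal data generator is a hypothetical example (95% standard normal, 5% with ten times the scale), and the function names are made up for illustration. For normal data SD/MAD should be near 1.4826 and excess kurtosis near 0.

```python
import random
import statistics


def sd_over_mad(xs):
    """Sample SD divided by the median absolute deviation from the median."""
    med = statistics.median(xs)
    mad = statistics.median([abs(x - med) for x in xs])
    return statistics.stdev(xs) / mad


def excess_kurtosis(xs):
    """Moment-based excess kurtosis m4 / m2^2 - 3 (no small-sample correction)."""
    mean = statistics.fmean(xs)
    m2 = statistics.fmean([(x - mean) ** 2 for x in xs])
    m4 = statistics.fmean([(x - mean) ** 4 for x in xs])
    return m4 / m2 ** 2 - 3.0


def contaminated_normal(n, frac=0.05, scale=10.0, seed=0):
    """Hypothetical data: N(0,1) with a fraction `frac` drawn from N(0, scale^2)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, scale if rng.random() < frac else 1.0) for _ in range(n)]


if __name__ == "__main__":
    clean = contaminated_normal(5000, frac=0.0)
    dirty = contaminated_normal(5000)
    print("clean:", sd_over_mad(clean), excess_kurtosis(clean))
    print("dirty:", sd_over_mad(dirty), excess_kurtosis(dirty))
```

One practical difference is that SD/MAD is on an interpretable scale relative to 1.4826, whereas sample kurtosis depends on fourth powers and so can be dominated by a few extreme observations.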
* (i.e. has relatively high power in that situation)
[1] Geary, R. C. 1935. "The ratio of mean deviation to the standard deviation as a test of normality." Biometrika 27: 310-332
[2] Gel, Y. R., Miao, W., and Gastwirth, J. L. (2007) Robust Directed Tests of Normality Against Heavy Tailed Alternatives. Computational Statistics and Data Analysis 51, 2734-2746.
[3] D'Agostino, R. B. (1970) Simple compact portable test of normality: Geary's test revisited. Psychological Bulletin 74(2), 138-140.