Minimizing the median absolute deviation or median absolute error

median, regression, robust, terminology

The median of a vector $\vec x$ is a scalar $a$ minimizing the mean of $|\vec x - a|$. Analogously, when quantile regression is used to estimate medians, it tries to minimize the mean of the absolute residuals.

But suppose we consider the median of the absolute deviations rather than their mean (or sum). The median of $\vec x$ need not minimize the median of $|\vec x - a|$. Building a regression model that tries to minimize the median of the absolute residuals has a certain intuitive appeal, because the median absolute error has a natural interpretation: it is the distance from the true value within which a prediction is as likely as not to fall.
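
For a concrete illustration (the numbers are made up purely for this example), take $\vec x = (0, 4, 8, 9, 10)$, whose median is $8$:

$$
\begin{aligned}
a = 8:&\quad |\vec x - 8| = (8, 4, 0, 1, 2), &\operatorname{median} &= 2,\\
a = 9:&\quad |\vec x - 9| = (9, 5, 1, 0, 1), &\operatorname{median} &= 1.
\end{aligned}
$$

So here $a = 9$ gives a smaller median absolute deviation than the median itself does.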

This leads me to wonder:

  1. Is there a name for the value $a$ that minimizes the median of $|\vec x - a|$? What about a regression model minimizing the median absolute residual?
  2. Are there any better algorithms for calculating this value than using a generic function-minimizing routine like R's optim? How about algorithms for fitting this sort of regression model?

Best Answer

The shortest half is the shortest interval containing half of the distribution (for a population) or half of the data (for a sample). [Some authors call this interval the shorth. However, the term seems to have been coined by Andrews et al. (1972), who used it for the mean of the observations in the shortest half, so it more properly refers to that. It is probably best to say "shortest half" and "mean of the shortest half" explicitly to avoid confusion.]

The midpoint of the shortest half should minimize the median of the absolute deviations; you sometimes see it called "the midpoint of the shortest half", but it has another name (see below).
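
As a rough sketch of how this could be computed directly (rather than with a generic optimizer), the snippet below sorts the data and slides a window of $h$ consecutive order statistics over it. The function names `shortest_half_midpoint` and `median_abs_dev_about`, the choice $h = \lfloor n/2 \rfloor + 1$, and the example data are my own assumptions for illustration. With an even number of observations and a median that averages the two middle values, the exact minimizer can differ slightly from what this returns.

```python
import numpy as np


def shortest_half_midpoint(x):
    """Return (midpoint, lower, upper) of the shortest interval covering h sorted points.

    Assumption: h = floor(n/2) + 1, one common convention for "half" of the sample.
    """
    xs = np.sort(np.asarray(x, dtype=float))
    n = xs.size
    h = n // 2 + 1
    # Width of every window of h consecutive order statistics.
    widths = xs[h - 1:] - xs[:n - h + 1]
    i = int(np.argmin(widths))  # first shortest window if there are ties
    lo, hi = float(xs[i]), float(xs[i + h - 1])
    return (lo + hi) / 2.0, lo, hi


def median_abs_dev_about(x, a):
    """Median of |x - a| for a candidate location a."""
    return float(np.median(np.abs(np.asarray(x, dtype=float) - a)))


# Made-up data: a loose group on the left, a tight group on the right.
x = [0, 1, 2, 10, 11, 11.5, 12]
mid, lo, hi = shortest_half_midpoint(x)
print(mid, (lo, hi))                                    # 11.0 (10.0, 12.0)
print(median_abs_dev_about(x, mid))                     # 1.0
print(median_abs_dev_about(x, float(np.median(x))))     # 2.0
```

After the sort, the search is a single linear pass, so the whole thing costs $O(n \log n)$.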

This is a one-dimensional version of a minimum volume estimator.

Because quantiles are equivariant under monotonic increasing transformations, in one dimension minimizing the median of the absolute deviations is equivalent to minimizing the median of the squared deviations [or of any other monotonic increasing function of them. This holds at least if we keep the definition of the median as interval-valued when it doesn't fall exactly at an observation; otherwise the two minimizers can differ slightly, but they will always lie between the same observations].
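
To spell that step out, here is a short sketch in notation of my own: $r_i(a) = |x_i - a|$, $g$ is any strictly increasing function on $[0, \infty)$ such as $g(r) = r^2$, and I assume the median is a single order statistic $r_{(m)}$ (for example, $n$ odd). Since $g$ preserves the ordering of the $r_i(a)$,

$$
\operatorname*{med}_i \, g\bigl(r_i(a)\bigr) = g\bigl(r_{(m)}(a)\bigr) = g\Bigl(\operatorname*{med}_i \, r_i(a)\Bigr),
$$

and because $g$ is increasing, the same $a$ minimizes both sides:

$$
\operatorname*{arg\,min}_a \, \operatorname*{med}_i \, (x_i - a)^2 = \operatorname*{arg\,min}_a \, \operatorname*{med}_i \, |x_i - a| .
$$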

So the literature on least median of squares (LMS) estimation will probably be of some use to you here; see, for example, Rousseeuw & Leroy (1987) [1].

There's often explicit code for LMS estimators (especially for regression, but if you fit only an intercept you should get the original thing you asked about), and sometimes code for estimators based on the shortest half (Nick Cox seems to have written one for Stata, for example).
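
To give a feel for what such code does in the regression case, here is a deliberately crude sketch for a straight-line fit. It is not the exact LMS fit and not the algorithm of any particular package. It tries the slope through every pair of points and, for each slope, picks the intercept as the midpoint of the shortest half of the residuals. The function name `lms_line_approx`, the choice $h = \lfloor n/2 \rfloor + 1$, and the data are assumptions made up for this example.

```python
import itertools

import numpy as np


def lms_line_approx(x, y):
    """Approximate least-median-of-squares line by searching slopes through point pairs.

    Returns (slope, intercept, crit), where crit is the h-th smallest absolute
    residual for the chosen line, with h = floor(n/2) + 1 (an assumed convention).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    h = n // 2 + 1
    best = (np.inf, 0.0, 0.0)  # (crit, slope, intercept)
    for i, j in itertools.combinations(range(n), 2):
        if x[i] == x[j]:
            continue  # vertical pair: no finite slope
        slope = (y[j] - y[i]) / (x[j] - x[i])
        # For a fixed slope, the best intercept is a one-dimensional location
        # problem: the midpoint of the shortest half of y - slope * x.
        r = np.sort(y - slope * x)
        widths = r[h - 1:] - r[:n - h + 1]
        k = int(np.argmin(widths))
        crit = float(widths[k]) / 2.0
        if crit < best[0]:
            best = (crit, float(slope), float(r[k] + r[k + h - 1]) / 2.0)
    return best[1], best[2], best[0]


# Made-up data: seven points exactly on y = 2x + 1 and three gross outliers.
x = np.arange(10)
y = 2.0 * x + 1.0
y[7], y[8], y[9] = 40.0, -5.0, 30.0
slope, intercept, crit = lms_line_approx(x, y)
print(slope, intercept, crit)
```

This costs on the order of $n^3 \log n$ operations, so it is only sensible for small samples; the LMS literature describes much better search strategies.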

So the alternative name I referred to earlier would be the "least median of squares estimate of location".

Sorry that both terms are such a mouthful; off the top of my head I don't know of any reasonably unambiguous names that are shorter.

[1] Rousseeuw, P.J. and Leroy, A.M. (1987), Robust Regression and Outlier Detection, Wiley, New York.
