Statistics – Is Sample Mean Absolute Deviation Unbiased for Normal Distribution?

descriptive-statistics, statistical-inference, statistics

Let $X_1,\ldots,X_n$ be a random sample from a normally distributed population.
Is the sample mean absolute deviation
$$\frac{\sum_{i=1}^n|X_i-\bar{X}|}{n}$$
an unbiased estimator of the population mean absolute deviation?

Best Answer

Clearly not for $n=1$, where you always get $0$, and (less dramatically) not for larger $n$ either.

$\frac{\sum_{i=1}^n\left|X_i-\bar{X}\right|}{n}$ faces the same issue as $\frac{\sum_{i=1}^n(X_i-\bar{X})^2}{n}$ in that $\bar X$ tends to be closer to the $X_i$ than $\mu$ is.

For a normal distribution (but not others) $\mathbb E\left[\frac{\sum_{i=1}^n|X_i-\mu|}{n}\right] =\sqrt{\frac{2}{\pi}} \sigma$, while it seems empirically $\mathbb E\left[\frac{\sum_{i=1}^n\left|X_i-\bar{X}\right|}{n}\right] = \sqrt{\frac{n-1}{n}} \sqrt{\frac{2}{\pi}} \sigma$ or close to that.
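
The normal-case factor need not rest on simulation alone. A short derivation, using only the independence and normality of the $X_i$, suggests it is exact for every $n \ge 1$. Each deviation is a linear combination of independent normals:

$$X_i-\bar X=\left(1-\frac1n\right)X_i-\frac1n\sum_{j\ne i}X_j \sim N\!\left(0,\ \frac{n-1}{n}\sigma^2\right),$$

since $\operatorname{Var}(X_i-\bar X)=\left(1-\frac1n\right)^2\sigma^2+\frac{n-1}{n^2}\sigma^2=\frac{n-1}{n}\sigma^2$. For $Z\sim N(0,s^2)$ we have $\mathbb E|Z|=s\sqrt{2/\pi}$, so

$$\mathbb E\left[\frac{\sum_{i=1}^n|X_i-\bar X|}{n}\right]=\mathbb E|X_1-\bar X|=\sqrt{\frac{n-1}{n}}\sqrt{\frac2\pi}\,\sigma.$$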

As an illustration, with a standard normal and sample size $n=4$, the expected absolute distance to the sample average seems to be closer to $\sqrt{\frac3{2\pi}} \approx 0.691$ than the expected absolute distance to the mean of $\sqrt{\frac2{\pi}} \approx 0.798$:

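# Simulate one normal sample of size n and return two average absolute deviations:
# around the sample mean, and around the true mean mu
avabsdevnorm <- function(n, mu=0, sigma=1){
  X <- rnorm(n, mu, sigma)
  meanX <- mean(X)
  return(c(mean(abs(X-meanX)), mean(abs(X-mu))))
}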
set.seed(2023)
n <- 4
cases <- 10^5
sims <- replicate(cases, avabsdevnorm(n))
c(mean(sims[1,]), mean(sims[2,]))
# 0.6917616 0.7990887
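
If the factor $\sqrt{\frac{n-1}{n}}$ holds in the normal case, then rescaling the sample mean absolute deviation by $\sqrt{\frac{n}{n-1}}$ should remove the bias. The sketch below checks this by simulation; `corrected_mad` is a hypothetical helper name introduced only for illustration, and the correction is specific to normal data.

```r
# Hypothetical helper (not from the original answer): sample mean absolute
# deviation rescaled by sqrt(n / (n - 1)), assuming normally distributed data
corrected_mad <- function(x) {
  n <- length(x)
  stopifnot(n >= 2)  # the rescaling is undefined for n = 1
  sqrt(n / (n - 1)) * mean(abs(x - mean(x)))
}

set.seed(2023)
n <- 4
cases <- 10^5
# Apply the rescaled estimator to many standard normal samples of size n
est <- replicate(cases, corrected_mad(rnorm(n)))
# Compare the simulated mean with the population value sqrt(2 / pi)
c(mean(est), sqrt(2 / pi))
```

The two printed numbers should agree up to simulation noise. For other distributions the same rescaling would not be expected to give an unbiased estimator.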

For comparison, for a uniform distribution on $[a,b]$, you have $\mathbb E\left[\frac{\sum_{i=1}^n|X_i-\mu|}{n}\right] =\frac{b-a}{4}$, while it seems empirically $\mathbb E\left[\frac{\sum_{i=1}^n\left|X_i-\bar{X}\right|}{n}\right] = \left(1-\frac{2}{3n}\right)\frac{b-a}{4}$ or close to that, at least with $n\ge 2$. Another simulation of $U(0,1)$, again with $n=4$, shows the expected absolute distance to the sample average seems to be closer to $\frac5{24} \approx 0.208$ than the expected absolute distance to the mean of $\frac14=0.25$:

# Same comparison for a uniform sample on [low, high]
avabsdevunif <- function(n, low=0, high=1){
  X <- runif(n, low, high)
  meanX <- mean(X)
  mu <- (low+high)/2  # population mean of U(low, high)
  return(c(mean(abs(X-meanX)), mean(abs(X-mu))))
}
set.seed(2023)
n <- 4
cases <- 10^5
simunif <- replicate(cases, avabsdevunif(n))
c(mean(simunif[1,]), mean(simunif[2,]))
# 0.2081946 0.2498248
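
The conjectured uniform factor $1-\frac{2}{3n}$ can at least be checked by hand for small $n$; this is a sanity check, not a proof for general $n$. For $n=2$, both deviations equal $|X_1-X_2|/2$, and $\mathbb E|X_1-X_2|=\frac{b-a}{3}$ for independent $U(a,b)$ variables, so

$$\mathbb E\left[\frac{|X_1-\bar X|+|X_2-\bar X|}{2}\right]=\frac{\mathbb E|X_1-X_2|}{2}=\frac{b-a}{6}=\left(1-\frac{2}{3\cdot 2}\right)\frac{b-a}{4}.$$

For $n=3$ on $U(0,1)$, write $X_1-\bar X=(U-Y)/3$ with $U=2X_1$ uniform on $[0,2]$ and $Y=X_2+X_3$, for which $\mathbb E[Y^2]=\mathbb E[(2-Y)^2]=\frac76$. Conditioning on $Y$ gives $\mathbb E|U-Y|=\frac{\mathbb E[Y^2]+\mathbb E[(2-Y)^2]}{4}=\frac{7}{12}$, so

$$\mathbb E|X_1-\bar X|=\frac{7}{36}=\left(1-\frac{2}{3\cdot 3}\right)\frac14.$$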