Solved – Does a median-unbiased estimator minimize mean absolute deviance

lognormal distributionmadmedianrunbiased-estimator

This is a follow-up but also a different question of my previous one.

I read on Wikipedia that "A median-unbiased estimator minimizes the risk with respect to the absolute-deviation loss function, as observed by Laplace." However, my Monte Carlo simulation results does not support this argument.

I assume a sample from a log-normal population, $X_1,X_2,…,X_N \sim \mbox{LN}(\mu,\sigma^2)$,where, $\mu$ and $\sigma$ are the log-mean and log-sd,$\beta = \exp(\mu)=50$

The geometric-mean estimator is a median-unbiased estimator for the population median $\exp(\mu)$,

$\hat{\beta}_{\mbox{GM}}= \exp(\hat{\mu})= \exp{(\sum\frac{\log(X_i)}{N})} \sim \mbox{LN}(\mu,\sigma^2/N)$
where, $\mu$ and $\sigma$ are the log-mean and log-sd, $\hat\mu$ and $\hat\sigma$ are the MLEs for $\mu$ and $\sigma$.

While a corrected geometric-mean estimator is a mean-unbiased estimator for the population median.

$\hat{\beta}_{\mbox{CG}}= \exp(\hat{\mu}-\hat\sigma^2/2N)$

I generate samples of size 5 repeatedly from the LN$(\log(50),\sqrt{\log(1+2^2)})$. The replication number is 10,000. The average absolute deviations I got are 25.14 for the geometric-mean estimator and 22.92 for corrected geometric-mean. Why?

BTW, the estimated median absolute deviations are 18.18 for geometric-mean and 18.58 for corrected geometric-mean estimator.

The R script I used are here:

#```{r stackexchange}
#' Calculate the geomean to estimate the lognormal median.
#'
#' This function Calculate the geomean to estimate the lognormal
#' median.
#'
#' @param x a vector.
require(plyr)
GM <- function(x){
    exp(mean(log(x)))
}
#' Calculate the bias corrected geomean to estimate the lognormal
#' median.
#'
#' This function Calculate the bias corrected geomean using the
#' variance of the log of the samples, i.e., $\hat\sigma^2=1/(n-1)
# \Sigma_i(\Log(X_i)-\hat\mu)^2$
#'
#' @param x a vector.
BCGM <- function(x){
y <- log(x)
exp(mean(y)-var(y)/(2*length(y)))
}
#' Calculate the bias corrected geomean to estimate the lognormal
#' median.
#'
#' This function Calculate the bias corrected geomean using
#' $\hat\sigma^2=1/(n)\Sigma_i(\Log(X_i)-\hat\mu)^2$
#'
#' @param x a vector.
CG <- function(x){
y <- log(x)
exp(mean(y)-var(y)/(2*length(y))*(length(y)-1)/length(y))
}

############################

simln <- function(n,mu,sigma,CI=FALSE)
{
    X <- rlnorm(n,mu,sigma)
    Y <- 1/X
    gm <- GM(X)
    cg <- CG(X)
    ##gmk <- log(2)/GM(log(2)*Y) #the same as GM(X)
    ##cgk <- log(2)/CG(log(2)*Y)
    cgk <- 1/CG(Y)
    sm <- median(X)
    if(CI==TRUE) ci <- calCI(X)
    ##bcgm <- BCGM(X)
    ##return(c(gm,cg,bcgm))
    if(CI==FALSE) return(c(GM=gm,CG=cg,CGK=cgk,SM=sm)) else return(c(GM=gm,CG=cg,CGK=cgk,CI=ci[3],SM=sm))
}
cv <-2
mcN <-10000
res <- sapply(1:mcN,function(i){simln(n=5,mu=log(50),sigma=sqrt(log(1+cv^2)), CI=FALSE)})
sumres.mad <- apply(res,1,function(x) mean(abs(x-50)))
sumres.medad <- apply(res,1,function(x) median(abs(x-50)))
sumres.mse <- apply(res,1,function(x) mean((x-50)^2))
#```

#```{r eval=FALSE}
#> sumres.mad
      GM       CG      CGK       SM 
#25.14202 22.91564 29.65724 31.49275 
#> sumres.mse
      GM       CG      CGK       SM 
#1368.209 1031.478 2051.540 2407.218 
#```

Best Answer

If we choose an estimator $\alpha^+$ by the criterion that it minimizes the expected absolute error from the true value $\alpha$

$E=<|\alpha^+-\alpha|> = \int_{-\infty}^{\alpha^+} (\alpha^+-\alpha)f(\alpha) \mathrm{d}\alpha + \int^{\infty}_{\alpha^+} (\alpha-\alpha^+)f(\alpha)\mathrm{d}\alpha $

we require

$\frac{dE}{d\alpha^+} = \int_{-\infty}^{\alpha^+} f(\alpha) \mathrm{d}\alpha - \int^{\infty}_{\alpha^+} f(\alpha) \mathrm{d}\alpha = 0$

which is equivalent to $P(\alpha > \alpha^+) = 1/2$. So $\alpha^+$ is the shown to be the median as following Laplace in 1774.

If you are having trouble with R please ask it in another question on Stack Overflow

Related Question