Estimation – Bias of Moment Estimator in Lognormal Distribution

Tags: bias, estimation, lognormal distribution, moments

I am doing a numerical experiment that consists of sampling a lognormal distribution $X\sim\mathcal{LN}(\mu, \sigma)$ and trying to estimate the moments $\mathbb{E}[X^n]$ by two methods:

  1. Taking the sample mean of the $X^n$ values.
  2. Estimating $\mu$ and $\sigma^2$ from the sample means of $\log(X)$ and $\log^2(X)$, and then using the fact that for a lognormal distribution we have $\mathbb{E}[X^n]=\exp(n \mu + (n \sigma)^2/2)$.

The question is:

I find experimentally that the second method performs much better than the first one when I keep the number of samples fixed and increase $\mu$ and $\sigma^2$ by some factor $T$. Is there some simple explanation for this fact?

I'm attaching a figure in which the x-axis is $T$ and the y-axis shows the values of $\mathbb{E}[X^2]$, comparing the true value $\mathbb{E}[X^2] = \exp(2 \mu + 2 \sigma^2)$ (orange line) to the estimated values: method 1 as blue dots and method 2 as green dots. The y-axis is in log scale.

[Figure: True and estimated values of $\mathbb{E}[X^2]$. Blue dots: sample means of $X^2$ (method 1); green dots: estimates from method 2; orange line: calculated from the known $\mu$ and $\sigma$ by the same formula as in method 2. The y-axis is in log scale.]

EDIT:

Below is minimal Mathematica code that produces the results for a single value of $T$, together with its output:

ClearAll[n, numIterations, sigma, mu, totalTime, data, rmomentTheory, rmomentSample, rmomentFromMuSigma, muNumerical, sigmaSqrNumerical]
(* Define variables *)
n=2; numIterations = 10^4; sigma = 0.5; mu=0.1; totalTime = 200;
(* Create log normal data*)
data=RandomVariate[LogNormalDistribution[mu*totalTime,sigma*Sqrt[totalTime]],numIterations];

(* the moment by theory:*)
rmomentTheory = Exp[(n*mu+(n*sigma)^2/2)*totalTime];

(*Calculate directly: *)
rmomentSample = Mean[data^n];

(*Calculate through estimated mu and sigma *)
muNumerical = Mean[Log[data]]; (*numerical \[Mu] (gaussian mean) *)
sigmaSqrNumerical = Mean[Log[data]^2]-(muNumerical)^2; (* numerical gaussian variance *)
rmomentFromMuSigma = Exp[muNumerical*n + (n^2*sigmaSqrNumerical)/2];

(*output*)
Log@{rmomentTheory, rmomentSample,rmomentFromMuSigma}

Output:

(* Log of {analytic, sample mean of X^2, using estimated mu and sigma} *)
{140., 91.8953, 137.519}

Above, the second value is the log of the sample mean of $X^2$ (method 1), which falls well below the other two.

Best Answer

There is something puzzling in those results since

  1. the first method provides an unbiased estimator of $\mathbb{E}[X^2]$, namely $$\frac{1}{N}\sum_{i=1}^N X_i^2,$$ which has $\mathbb{E}[X^2]$ as its mean. Hence the blue dots should lie around the expected value (orange curve);
  2. the second method provides a biased estimator of $\mathbb{E}[X^2]$: by Jensen's inequality, $$\mathbb{E}[\exp(n \hat\mu + n^2 \hat{\sigma}^2/2)]>\exp(n \mu + (n \sigma)^2/2)$$ when $\hat\mu$ and $\hat\sigma^2$ are unbiased estimators of $\mu$ and $\sigma^2$ respectively, and it is thus strange that the green dots are aligned with the orange curve.
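A minimal R sketch of this Jensen-type bias (illustrative parameter values only, with deliberately small samples of $\log X$ so that the bias is visible):

# Bias of the plug-in estimator exp(n*muhat + n^2*sigmahat^2/2) for E[X^n], n = 2
set.seed(42)
mu <- 0.1; sigma <- 0.5; n <- 2; N <- 20        # small sample size makes the bias visible
truth <- exp(n * mu + (n * sigma)^2 / 2)        # true E[X^2]
plugin <- replicate(1e5, {
  y <- rnorm(N, mu, sigma)                      # y plays the role of log X
  exp(n * mean(y) + n^2 * var(y) / 2)           # method 2 (plug-in) estimate
})
c(truth = truth, mean.of.plugin = mean(plugin)) # the average estimate exceeds the truth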

However, these results are due to the problem itself and not to the numerical computations: I repeated the experiment in R and obtained the following picture, with the same colour code and the same sequence of $\mu_T$'s and $\sigma_T$'s, where each estimator is divided by the true expectation:

[Figure: Two empirical second moments, based on 10⁶ log-normal simulations.]

Here is the corresponding R code:

moy1=moy2=rep(0,200)              # method 1 and method 2 estimates for each T
mus=0.14*(1:200)                  # mu_T grows linearly with T
sigs=sqrt(0.13*(1:200))           # sigma_T^2 grows linearly with T
tru=exp(2*mus+2*sigs^2)           # true E[X^2]
for (t in 1:200){
x=rnorm(1e5)                      # standard normal draws: log X = mu_T + sigma_T * x
moy1[t]=mean(exp(2*sigs[t]*x+2*mus[t]))                        # method 1: sample mean of X^2
moy2[t]=exp(2*mean(sigs[t]*x+mus[t])+2*var(sigs[t]*x+mus[t]))} # method 2: plug-in estimate

plot(moy1/tru,col="blue",ylab="relative mean",xlab="T",cex=.4,pch=19)
abline(h=1,col="orange")          # reference line where the estimator equals the truth
lines((moy2/tru),col="green",cex=.4,pch=19)

Hence there is indeed a collapse of the empirical second moment as $\mu$ and $\sigma$ increase, which I would attribute to the enormous increase in the variance of that empirical second moment.
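This can be made quantitative with a quick back-of-the-envelope computation, using the standard lognormal moment formula $\mathbb{E}[X^k]=\exp(k\mu+k^2\sigma^2/2)$: writing $\hat m_2=\frac{1}{N}\sum_{i=1}^N X_i^2$ for the method-1 estimator,
$$\operatorname{Var}(\hat m_2)=\frac{\mathbb{E}[X^4]-\mathbb{E}[X^2]^2}{N}=\frac{e^{4\mu+8\sigma^2}-e^{4\mu+4\sigma^2}}{N},$$
so its relative standard error is
$$\frac{\sqrt{\operatorname{Var}(\hat m_2)}}{\mathbb{E}[X^2]}=\sqrt{\frac{e^{4\sigma^2}-1}{N}},$$
which grows roughly like $e^{2\sigma^2}/\sqrt{N}$ and explodes as $\sigma$ increases, no matter how large $N$ is.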

My explanation of this curious phenomenon is that, while $\mathbb{E}[X^2]$ obviously is the mean of $X^2$, it is not a central value: in fact the median of $X^2$ is equal to $e^{2\mu}$. When representing the random variable $X^2$ as $\exp\{2\mu+2\sigma\epsilon\}$, where $\epsilon\sim\mathcal{N}(0,1)$, it is clear that, when $\sigma$ is large enough, the random variable $\sigma\epsilon$ almost never reaches the magnitude of $\sigma^2$. In other words, if $X\sim\mathcal{LN}(\mu,\sigma)$, then $$\begin{align*}\mathbb{P}(X^2>\mathbb{E}[X^2])&=\mathbb{P}(\log\{X^2\}>2\mu+2\sigma^2)\\&=\mathbb{P}(2\mu+2\sigma\epsilon>2\mu+2\sigma^2)\\&=\mathbb{P}(\epsilon>\sigma)\\&=1-\Phi(\sigma),\end{align*}$$ which can be made arbitrarily small by increasing $\sigma$.
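As a quick numerical check of this last probability (a minimal R sketch; the event does not depend on $\mu$, so $\mu$ is set to 0):

# P(X^2 > E[X^2]) for X ~ LN(mu, sigma); theory: 1 - Phi(sigma), independent of mu
set.seed(1)
sigmas <- c(0.5, 1, 2, 3)
theory <- 1 - pnorm(sigmas)
empirical <- sapply(sigmas, function(s) {
  x <- rlnorm(1e6, meanlog = 0, sdlog = s)   # mu = 0 without loss of generality
  mean(x^2 > exp(2 * s^2))                   # E[X^2] = exp(2*mu + 2*sigma^2) with mu = 0
})
round(rbind(theory, empirical), 5)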