Log Normal Distribution – Calculating Variance and Standard Deviation

lognormal distribution, standard deviation, variance

I am trying to calculate the variance and standard deviation of a lognormal distribution. I was able to calculate the mean after reading the Stack Exchange question "How to calculate a mean and standard deviation for a lognormal distribution using 2 percentiles". Now I want the variance and standard deviation, but I am unclear which formulas to use. I will do the calculation in R.

Best Answer

  1. An obvious choice, even for estimating the population mean and variance, would be to exponentiate your log-data back to the original scale and directly calculate the sample mean and sample variance (then you wouldn't have to rely on your assumption that the data are in fact lognormal, an assumption that's almost certainly false).

  2. However, the assumption of lognormality might be close enough to true that it won't be so bad if you assume it anyway.

    In the case where the random variable $Y$ is actually lognormal (with parameters $\mu$ and $\sigma^2$), the MLE of the $\mu$ parameter will be the sample mean of the logs, and the MLE of the $\sigma^2$ parameter will be $\frac{n-1}{n}\cdot s^2$, a simple rescaling of $s^2$, the usual sample variance of the logs (the one with an $n-1$ denominator).

    You could then produce estimates of the population mean ($m=\widehat{E(Y)}$), the variance ($v=\widehat{\text{Var}(Y)}$) and hence the standard deviation:

    $\text{estimated mean} = m = e^{\hat{\mu}+\frac12\hat{\sigma}^2}$

    $\text{estimated variance} = v = m^2 \cdot (e^{\hat{\sigma}^2}-1)$

    $\text{estimated s.d.} = \sqrt{v}$

    [However, $m$ and $v$ won't be unbiased. If unbiasedness of either is important to you, you may want to consider other possibilities, including the original suggestion.]

    Note that these are not the only possible estimators of those quantities under that lognormal assumption, but they're reasonably convenient if you must use the log-data.
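
As a rough illustration of both options, here is a minimal Python sketch using only the standard library. The function names and the sample `log_data` values are made up for the example, and it assumes the input is already on the log scale. The same steps should carry over to R with `mean()`, `var()` (which also uses the $n-1$ denominator) and `exp()`.

```python
import math
import statistics


def lognormal_estimates(log_data):
    """Option 2: plug the MLEs of mu and sigma^2 into the lognormal formulas.

    log_data is assumed to hold the natural logs of the observations.
    Returns (estimated mean, estimated variance, estimated s.d.).
    """
    n = len(log_data)
    mu_hat = statistics.fmean(log_data)      # MLE of mu: sample mean of the logs
    s2 = statistics.variance(log_data)       # sample variance of the logs (n - 1 denominator)
    sigma2_hat = (n - 1) / n * s2            # MLE of sigma^2: rescaled sample variance
    m = math.exp(mu_hat + 0.5 * sigma2_hat)  # m = exp(mu_hat + sigma2_hat / 2)
    v = m ** 2 * math.expm1(sigma2_hat)      # v = m^2 * (exp(sigma2_hat) - 1)
    return m, v, math.sqrt(v)


def direct_estimates(log_data):
    """Option 1: back-transform and use the ordinary sample statistics."""
    y = [math.exp(x) for x in log_data]
    return statistics.fmean(y), statistics.variance(y), statistics.stdev(y)


if __name__ == "__main__":
    # Hypothetical log-scale data, purely for illustration.
    log_data = [0.2, 0.5, 0.9, 1.1, 1.4, 1.7]
    print("lognormal-based:", lognormal_estimates(log_data))
    print("direct:         ", direct_estimates(log_data))
```

The two sets of numbers will generally differ. The first relies on the lognormal assumption and, as noted above, is not unbiased. The second makes no distributional assumption.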