Correlation – Sanity Check for Correlation Between Log-Normal Random Variables

correlation, lognormal distribution, normal distribution

Recently I came across the following result in a paper, which I can't seem to verify.

Anyway, the authors cited Distributions in Statistics: Continuous Multivariate Distributions by Johnson & Kotz, page 20, as the source. Since I do not have access to this book, I am unable to check the derivation of this result. I was hoping someone could shed some light on how to prove it.

Let $X$ and $Y$ denote a pair of bivariate log normal random variables with their means and standard deviations given by the ordered pairs $(\zeta_{X}, \eta_{X})$ and $(\zeta_{Y}, \eta_{Y})$ and with correlation coefficient $\rho_{(X, Y)}$, derived from the bivariate normal with marginal distributions having parameters $(\xi_{X}, \nu_{X}^2)$ and $(\xi_{Y}, \nu_{Y}^2)$ and correlation coefficient $\rho_{N}$.
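For reference, the standard lognormal moment relations (not stated in the paper, but they are what the rearrangement further down relies on) connect the two parameterizations:
\begin{equation}
\zeta_{X} = \mathrm e^{\xi_{X} + \nu_{X}^2/2}, \qquad
\eta_{X}^2 = \zeta_{X}^2\left(\mathrm e^{\nu_{X}^2} - 1\right),
\end{equation}
so that $\left(\eta_{X}/\zeta_{X}\right)^2 + 1 = \mathrm e^{\nu_{X}^2}$, i.e. $\nu_{X}^2 = \ln\bigl((\eta_{X}/\zeta_{X})^2 + 1\bigr)$, and similarly for $Y$.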

The expression for $\rho_{(X, Y)}$ is given as
\begin{equation}
\rho_{(X, Y)} = \dfrac{\mathrm e^{\rho_{N} \nu_{X} \nu_{Y}} - 1}{\sqrt {(\mathrm e^{\nu_{X}^2} - 1)(\mathrm e^{\nu_{Y}^2} - 1)}}
\end{equation}
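As a quick numerical sanity check (mine, not from the paper), the quoted formula can be compared against a Monte Carlo estimate; the parameter values below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary example parameters for the underlying bivariate normal.
xi_x, xi_y = 0.5, -0.2   # means
nu_x, nu_y = 0.8, 0.6    # standard deviations
rho_n = 0.7              # normal correlation

# Draw correlated normals and exponentiate to get bivariate lognormals.
cov = [[nu_x**2, rho_n * nu_x * nu_y],
       [rho_n * nu_x * nu_y, nu_y**2]]
z = rng.multivariate_normal([xi_x, xi_y], cov, size=1_000_000)
x, y = np.exp(z[:, 0]), np.exp(z[:, 1])

# Closed-form lognormal correlation from the quoted formula.
rho_xy = (np.exp(rho_n * nu_x * nu_y) - 1) / np.sqrt(
    (np.exp(nu_x**2) - 1) * (np.exp(nu_y**2) - 1))

# The empirical correlation should agree to Monte Carlo accuracy.
rho_emp = np.corrcoef(x, y)[0, 1]
print(rho_xy, rho_emp)
```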

What follows is a manipulation I came up with; any error in it is entirely mine.

Upon rearranging we get
\begin{equation}\begin{aligned}
\rho_N &= \dfrac{\ln \left(1 + \rho_{(X, Y)} \sqrt {(\mathrm e^{\nu_{X}^2} - 1)(\mathrm e^{\nu_{Y}^2} - 1)}\right)}{\nu_{X}\nu_{Y}}\\
&= \dfrac{\ln \left(1 + \rho_{(X, Y)} \frac{\eta_{X}\eta_{Y}}{\zeta_{X}\zeta_{Y}}\right)}{\sqrt {\ln \left(\left(\frac{\eta_{X}}{\zeta_{X}}\right)^2 + 1\right)\ln \left(\left(\frac{\eta_{Y}}{\zeta_{Y}}\right)^2 + 1\right)}}.
\end{aligned}
\end{equation}

I am very interested in learning whether this result would still hold between $X$ and $Y$ in a four-variate situation, i.e., if there were two more random variables that are also correlated with $X$ and $Y$. I suspect it should, but I am a newbie and, lacking a credible source, I was wondering if someone would be kind enough to confirm it.

Best Answer

The moments of the lognormal distribution are usually derived from the moment generating function of the normal distribution. If $X\sim N(\mu,\sigma^2)$, then its mgf is
$$ M_X(t) = \mathbb{E} \exp(tX) = \exp( \mu t + \sigma^2 t^2/2 ). $$
Then if $Y = \exp(X)$ is the lognormal variable of interest, we can find, for instance,
$$ \mathbb{E}[Y] = \mathbb{E} [\exp(X)] = M_X(1) = \exp( \mu + \sigma^2/2 ) $$
and
$$ \mathbb{E}[Y^2] = \mathbb{E} [\exp(2X)] = M_X(2) = \exp( 2\mu + 2\sigma^2 ), $$
from which
$$ \mathbb{V}[Y] = \mathbb{E}[Y^2] - \mathbb{E}^2[Y] = \exp( 2\mu + 2\sigma^2 ) - \exp( 2\mu + \sigma^2 ) $$
$$ = \exp( 2\mu + \sigma^2 ) [ \exp(\sigma^2) - 1 ] = [ \exp(\sigma^2) - 1 ]\,\mathbb{E}^2[Y]. $$
You may have to do something like that with a four-variate normal distribution and its multivariate mgf.
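Those mgf-derived moment formulas are easy to sanity-check against a simulated lognormal sample; the parameter values are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)

mu, sigma = 0.3, 0.5                 # parameters of the underlying normal
x = rng.normal(mu, sigma, size=2_000_000)
y = np.exp(x)                        # lognormal sample

mean_formula = np.exp(mu + sigma**2 / 2)              # E[Y] = M_X(1)
var_formula = (np.exp(sigma**2) - 1) * mean_formula**2  # V[Y]

# Sample moments should agree with the closed forms to Monte Carlo accuracy.
print(mean_formula, y.mean())
print(var_formula, y.var())
```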

In my experience working with lognormal distributions, they are only practical when the log standard deviation $\sigma$ is less than 1. Beyond that, pretty much any reasonable summary of the distribution hinges critically on whether you got the tail behavior right. In most situations (as with the original normal distribution, too, but exacerbated by exponentiation), the shape of the right tail will be off, and you can easily see a twofold difference due to a 0.1 change in $\sigma$... and you don't want that to happen.
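The sensitivity claim is easy to illustrate with the closed-form moments above; the choice of $\sigma = 3.0$ versus $3.1$ is an arbitrary large-$\sigma$ example of mine:

```python
import numpy as np

# How much do lognormal summaries move when sigma changes by 0.1?
mu = 0.0
for sigma in (3.0, 3.1):
    mean = np.exp(mu + sigma**2 / 2)
    var = (np.exp(sigma**2) - 1) * mean**2
    q99 = np.exp(mu + sigma * 2.326)   # approximate 99th percentile (z ~ 2.326)
    print(f"sigma={sigma}: mean={mean:.3g}, var={var:.3g}, q99={q99:.3g}")
```

For these values the variance more than triples between $\sigma = 3.0$ and $\sigma = 3.1$, which is the kind of tail-driven blow-up the answer warns about.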