Solved – When is the distribution of the product of two normally distributed variables close to a normal distribution?


It is clear that the product of normally distributed variables is not normally distributed. For example, if $X \sim N(\mu_1,\sigma_1^2)$ and $Y \sim N(\mu_2,\sigma_2^2)$, then $XY$ does not have the distribution $N(\mu_1 \mu_2,\ \mu_1^2 \sigma_2^2+\mu_2^2\sigma_1^2)$.

I have been told that even though the distribution of $XY$ is not normal, it is close to normal when $\mu_1$ and $\mu_2$ are not too small and $\sigma_1$ and $\sigma_2$ are not too large. Is this true?

Try the following R code:

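    # Two independent zero-mean normal samples with a small standard deviation
    n1 <- rnorm(10000, 0, .005)
    n2 <- rnorm(10000, 0, .005)
    n  <- n1 * n2
    # Kernel density estimate of the product
    d  <- density(n)
    plot(d, lwd = 2)
    # Normal density with matching mean and sd, evaluated on the same grid as d
    dn <- dnorm(d$x, mean = mean(n), sd = sd(n))
    lines(d$x, dn, col = 2, lwd = 2)
    legend('topright', legend = c('Estimated density', 'Normal distribution'),
           lwd = 2, lty = c(1, 1), col = c(1, 2))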

Figure: density estimate when $\sigma_1=\sigma_2=0.005$

It seems that the distribution is close to normal only when both conditions are met. Is there any theoretical analysis?

Best Answer

(this answer uses parts of @whuber's comment)

Let $X,Y$ be two independent normals. We can write the product as $$ XY = \frac14 \left( (X+Y)^2 - (X-Y)^2 \right), $$ so it has the distribution of a (scaled) difference of two noncentral chi-square random variables (central if both have zero means). Note that if the variances are equal, the two terms will be independent. Since the chi-square distribution is a special case of the gamma distribution, the question "Generic sum of Gamma random variables" is relevant. I will give a very special case of this, taken from the encyclopedic reference https://www.amazon.com/Probability-Distributions-Involving-Gaussian-Variables/dp/0387346570
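As a quick sanity check of the identity and of the independence remark above, here is a minimal R sketch. The seed, sample size, and parameter values are arbitrary choices made for illustration; they are not taken from the reference.

```r
set.seed(1)
n <- 1e5
x <- rnorm(n, mean = 1,  sd = 2)
y <- rnorm(n, mean = -1, sd = 2)    # same standard deviation as x

# Identity: XY = ((X+Y)^2 - (X-Y)^2) / 4, up to floating-point error
identity_ok <- isTRUE(all.equal(x * y, ((x + y)^2 - (x - y)^2) / 4))

# Cov(X+Y, X-Y) = Var(X) - Var(Y), which is 0 when the variances are equal
cor_equal <- cor(x + y, x - y)

# With unequal variances the sum and the difference are correlated
y2 <- rnorm(n, mean = -1, sd = 0.5)
cor_unequal <- cor(x + y2, x - y2)

print(identity_ok)
print(c(cor_equal = cor_equal, cor_unequal = cor_unequal))
```

Since $X+Y$ and $X-Y$ are jointly normal, zero correlation is the same as independence here, so the near-zero sample correlation in the equal-variance case is consistent with the remark.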

When $X$ and $Y$ are independent and zero-mean, with possibly different variances, the density function of the product $Z=XY$ is given by $$ f(z)= \frac1{\pi \sigma_1 \sigma_2} K_0\left(\frac{|z|}{\sigma_1 \sigma_2}\right) $$ where $K_0$ is the modified Bessel function of the second kind.

This can be written in R as

    # Density of Z = XY for independent zero-mean normals X and Y
    dprodnorm  <-  function(x, sigma1=1, sigma2=1) {
       (1/(pi*sigma1*sigma2)) * besselK(abs(x)/(sigma1*sigma2),  0)
    }
    ### Numerical check: the density should integrate to 1
    integrate( function(x) dprodnorm(x), lower=-Inf,  upper=Inf)
    ## 0.9999999 with absolute error < 3e-06

Let us plot this, together with some simulations:

    set.seed(7*11*13)  
    Z  <-  rnorm(10000) * rnorm(10000)
    
    hist(Z, prob=TRUE, nclass="scott", ylim=c(0, 1.5), 
            main="histogram and density of product of independent normals")
    # Overlay the theoretical density on the histogram
    plot( function(x) dprodnorm(x),  from=-5,  to=5,  n=1001,  
          col="red", add=TRUE, lwd=3)
    ### Changing to nclass="fd" gives a closer fit

Figure: histogram and density of the product of two independent normals

The plot shows quite clearly that the distribution is not close to normal.
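One way to quantify this for the zero-mean case with independent $X$ and $Y$ is through the moments (a standard computation added here for illustration, not taken from the reference). Using $\operatorname{E}(X^4)=3\sigma_1^4$ and $\operatorname{E}(Y^4)=3\sigma_2^4$, $$ \operatorname{E}(Z^2)=\sigma_1^2\sigma_2^2, \qquad \operatorname{E}(Z^4)=\operatorname{E}(X^4)\operatorname{E}(Y^4)=9\sigma_1^4\sigma_2^4, \qquad \frac{\operatorname{E}(Z^4)}{\operatorname{E}(Z^2)^2}=9, $$ whereas this kurtosis ratio equals 3 for every normal distribution. In addition, $K_0(u)\sim -\log u$ as $u \to 0^+$, so the density $f(z)$ is unbounded at $z=0$, unlike a normal density. Neither feature depends on how small $\sigma_1$ and $\sigma_2$ are.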

The stated reference also gives more involved cases (non-zero means, ...), but then the expressions for the density functions become so complicated that it gives only the characteristic functions, which are still reasonably simple and can be inverted to get densities.
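For the non-zero-mean case asked about in the question, simulation is a simple alternative to inverting a characteristic function. Writing $X=\mu_1+\sigma_1 U$ and $Y=\mu_2+\sigma_2 V$ with independent standard normals $U,V$ gives $XY = \mu_1\mu_2 + \mu_1\sigma_2 V + \mu_2\sigma_1 U + \sigma_1\sigma_2 UV$. The first three terms together are exactly normal with variance $\mu_1^2\sigma_2^2+\mu_2^2\sigma_1^2$, while the non-normal term has variance $\sigma_1^2\sigma_2^2$, so it is relatively small when $(\mu_1/\sigma_1)^2+(\mu_2/\sigma_2)^2$ is large. The sketch below compares sample skewness and excess kurtosis (both 0 for a normal distribution); the helper name `prod_shape`, the seed, the sample size, and the parameter values are my own illustrative choices.

```r
# Hypothetical helper: sample skewness and excess kurtosis of X*Y
# for independent X ~ N(mu1, sigma1^2) and Y ~ N(mu2, sigma2^2)
prod_shape <- function(mu1, mu2, sigma1, sigma2, n = 1e5) {
  z <- rnorm(n, mu1, sigma1) * rnorm(n, mu2, sigma2)
  s <- (z - mean(z)) / sd(z)          # standardized sample
  c(skewness = mean(s^3), excess_kurtosis = mean(s^4) - 3)
}

set.seed(42)
zero_mean  <- prod_shape(0, 0, 1, 1)       # theoretical excess kurtosis is 6
large_mean <- prod_shape(100, 100, 1, 1)   # means are 100 standard deviations from 0
print(rbind(zero_mean, large_mean))
```

In the large-mean case both statistics should come out close to 0, while in the zero-mean case the excess kurtosis should be far from 0, in line with the density shown above.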
