Solved – How to calculate the product of Gaussian variables in Matlab

I want to calculate the distribution of a product of two i.i.d. Gaussian distributed variables a and b. In principle, this should be possible by defining a new variable x with a dirac delta distribution

new variable x with dirac delta distribution

To get the distribution over x (the product of a and b), a and b have to be marginalized out:

calculation of p(x)

Since the dirac delta ties the variables a and b together via the variable x=ab, we can skip one integral and get

one integral skipped

So far so good.

If I now want to calculate this in Matlab, I get a huge difference between the solution with integral and a simple sampling solution as approximation.

Here is my MWE:

mu_a = 2;
sig2_a = 1;

mu_b = 0;
sig2_b = 1;

res = 250;
xmin = -10;
xmax = 10;
x_range = linspace(xmin,xmax,res);

normal = @(x, mu, sig2) 1/sqrt(2*pi*sig2)*...
    exp((-1/2)*((x-mu)^2/sig2));

normal_a = @(a) arrayfun(normal,a,mu_a*ones(size(a)),sig2_a*ones(size(a)));
normal_b = @(b) arrayfun(normal,b,mu_b*ones(size(b)),sig2_b*ones(size(b)));

product_pdf1 = zeros(size(x_range));
product_pdf2 = zeros(size(x_range));
for i = 1:length(x_range)    
    x = x_range(i);

    normal_ab = @(a) normal_a(a).*normal_b(x./a);
    product_pdf1(i) = integral(normal_ab,-Inf,Inf);

    normal_ab = @(b) normal_a(x./b).*normal_b(b);
    product_pdf2(i) = integral(normal_ab,-Inf,Inf);
end

subplot(2,1,1,'replace'); hold all
plot(x_range, normal_a(x_range))
plot(x_range, normal_b(x_range))
plot(x_range, product_pdf1, '--')
plot(x_range, product_pdf2, '--')
axis tight
drawnow

% the sampling solution
N = 50000;
X_sequence = zeros(1,N);
for i=1:N
    a = mu_a + sqrt(sig2_a)*randn();
    b = mu_b + sqrt(sig2_b)*randn();
    x = a*b;
    X_sequence(i) = x;
end

subplot(2,1,2,'replace'); hold all
hist(X_sequence,250)
xlim([xmin xmax])

Note that I calculated the integral once for a and once for b, the output is completely different.

green and red lines: input Gaussians, dashed lines: attempt to calculate the product

Depending on the parameters, one, the other or both of the calculations are clearly wrong. In my example, the sampling solution is much peakier than the better looking integral attempt. The other one is completely wrong.

I now wonder: Is my math wrong? Is my code wrong? Or is it a limitation of Matlab's integral function (e.g. having trouble integrating over discontinuities)?

UPDATE:

Tom spotted a mistake in my derivations. I updated my MWE accordingly and normalized the histogram for better comparison:

mu_a = 2;
sig2_a = 1;

mu_b = 0;
sig2_b = 1;

res = 250;
xmin = -10;
xmax = 10;
x_range = linspace(xmin,xmax,res);

normal = @(x, mu, sig2) 1/sqrt(2*pi*sig2)*...
    exp((-1/2)*((x-mu)^2/sig2));

normal_a = @(a) arrayfun(normal,a,mu_a*ones(size(a)),sig2_a*ones(size(a)));
normal_b = @(b) arrayfun(normal,b,mu_b*ones(size(b)),sig2_b*ones(size(b)));

product_pdf1 = zeros(size(x_range));
product_pdf2 = zeros(size(x_range));
for i = 1:length(x_range)    
    x = x_range(i);

    normal_ab = @(a) (1./a).*normal_a(a).*normal_b(x./a);
    product_pdf1(i) = integral(normal_ab,-Inf,Inf);

    normal_ab = @(b) (1./b).*normal_a(x./b).*normal_b(b);
    product_pdf2(i) = integral(normal_ab,-Inf,Inf);
end

subplot(2,1,1,'replace'); hold all
plot(x_range, normal_a(x_range))
plot(x_range, normal_b(x_range))
plot(x_range, product_pdf1, '--')
plot(x_range, product_pdf2, '--')
axis tight
drawnow

% the sampling solution
N = 50000;
X_sequence = zeros(1,N);
for i=1:N
    a = mu_a + sqrt(sig2_a)*randn();
    b = mu_b + sqrt(sig2_b)*randn();
    x = a*b;
    X_sequence(i) = x;
end

subplot(2,1,2,'replace'); hold all
hres = 250; % histogram resolution
[B, X] = hist(X_sequence,hres);
dx = diff(X(1:2));
plot(x_range, product_pdf1,'--')
plot(X, B/sum(B*dx),'-.');
plot(x_range, besselk(0,abs(x_range))/pi)
xlim([xmin xmax])

But, strangely, the solutions still don't match:
enter image description here
One of the integral solutions is now completely off, but the other (lower plot, dot-dashed red) is now actually closer to the sampling solution (dashed green). The BesselK doesn't match, it is much peakier.

Further suggestions?

Best Answer

You made a mistake in the derivation. Let's start from this integral: $$ p(x) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \delta(ab-x) p(a) p(b) da ~ db $$ To remove the $\delta$, you need to make a change of variable from $b$ to $c=ab$, for which $dc = a ~ db$. The integral becomes: $$ p(x) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \delta(c-x) p(a) p(c/a) \frac{1}{|a|} da ~ dc \\ = \int_{-\infty}^{\infty} p(a) p(x/a) \frac{1}{|a|} da $$ When $p(a)$ is standard normal, $p(x) = \frac{1}{\pi} {\rm BesselK}(0,|x|)$. You can compute this in Matlab with the code besselk(0,abs(x))/pi.

Best Answer

Related Solutions

Solved – Plotting relationship between three variables in Matlab

Solved – Generating a Gaussian dataset in MATLAB

Related Question