Problem
Given are 5 independent standard normal variables $x_1,x_2,x_3,x_4,x_5$.
What is the pdf of $x_4(x_1-x_3)+x_5(x_2-x_1)$ ?
What I know
$$x_1-x_3\sim \mathcal{N}\left(0,\sqrt{2}\right)\tag{1}$$
$$x_2-x_1\sim \mathcal{N}\left(0,\sqrt{2}\right)\tag{2}$$
$$x_4(x_1-x_3)\sim \frac{1}{\pi \sqrt{2}}K_0\left(\frac{\left|z\right|}{\sqrt{2}}\right)\tag{3}$$
$$x_5(x_2-x_1)\sim \frac{1}{\pi \sqrt{2}}K_0\left(\frac{\left|z\right|}{\sqrt{2}}\right)\tag{4}$$
where $K_0$ is the Bessel function and eq.(3,4) is a normal product distribution.
Remains to add the 2 summands $x_4(x_1-x_3)$ and $x_5(x_2-x_1)$ but they are dependent and cannot be convolved.
Related problem from literature
If two variables are transformed like $x_1(a x_1+b x_2)$ with $a\in \mathbb{R},b\in\mathbb{R}^+$ then the pdf is known $[1]$ but this is not applicable here.
Simulation
Simulation shows that the distribution of $x_4(x_1-x_3)+x_5(x_2-x_1)$ can be approximated with a Laplace distribution with parameter $(0,1.34)$. But the Ńorrect answer is not a Laplace distribution.
$[1]$ R. Gaunt: A note on the distribution of the product of zero-mean correlated normal random variables, Statistica Neerlandica, 2018
Best Answer
Linear algebra shows
$$2(x_4(x_1-x_3)+x_5(x_2-x_1)) =(x_2-x_3+x_4+x_5)^2/4-(-x_2+x_3+x_4+x_5)^2/4+\sqrt{3}(-\sqrt{1/3}x_1+\sqrt{1/12}(x_2+x_3)+(1/2)(-x_4+x_5))^2-\sqrt{3}(-\sqrt{1/3}x_1+\sqrt{1/12}(x_2+x_3)+(1/2)(x_4-x_5))^2.$$
Each squared term is a linear combination of independent standard Normal variables scaled to have a variance of $1,$ whence each of those squares has a $\chi^2(1)$ distribution. The four linear combinations are also orthogonal (as a quick check confirms), whence uncorrelated; and because they are uncorrelated joint random variables, they are independent.
(Differences of iid $\chi^2(1)$ variables have Laplace distributions, so this equivalently is the sum of two independent Laplace distributions of different variances.)
Because the characteristic function of a $\chi^2(1)$ variable is
$$\psi(t) = \frac{1}{\sqrt{1-2it}},$$
the characteristic function of this distribution is
$$\psi(t/2) \psi(-t/2) \psi(t\sqrt{3}/2) \psi(-t\sqrt{3}/2) = \left[(1+t^2)(1+3t^2)\right]^{-1/2}.$$
This is not the characteristic function of any Laplace variable -- nor is it recognizable as the c.f. of any standard statistical distribution. I have been unable to find a closed form for its inverse Fourier transform, which would be proportional to the pdf.
Here is a plot of the formula (in red) superimposed on an estimate of $\psi$ based on a sample of 10,000 values (real part in black, imaginary part in gray dots):
The agreement is excellent.
Edit
There remain questions of what the PDF $f$ looks like. It can be computed by numerically inverting the Fourier Transform by computing
$$f(x) = \frac{1}{2\pi}\int_{\mathbb R} e^{-i x t} \psi(t)\,\mathrm{d}t = \frac{1}{2\pi}\int_{\mathbb R} \frac{e^{-i x t}}{\sqrt{(1+t^2)(1+3t^2)}}\,\mathrm{d}t.$$
This expression, by the way, fully answers the original question. The aim of the rest of this section is to show it is a practical answer.
Numerical integration will become problematic once $|x|$ exceeds $10$ or $15,$ but with a little patience can be accurately computed.
In light of the analysis of differences of Gamma variables at https://stats.stackexchange.com/a/72486/919, it is tempting to approximate the result by a mixture of the two Laplace distributions. The best approximation near the middle of the distribution is approximately $0.4$ times Laplace$(1)$ plus $0.6$ times Laplace$(\sqrt{3}).$ However, the tails of this approximation are a little too heavy.
The left hand plot in this figure is a histogram of 100,000 realizations of $x_4(x_1-x_3) + x_5(x_2-x_1).$ On it are superimposed (in black) the numerical calculation of $f$ and then, in red, its mixture approximation. The approximation is so good it coincides with $f.$ However, it's not perfect, as the related plot at right shows. This plots $f$ and its approximation on a logarithmic scale. The decreasing accuracy of the approximation in the tails is clear.
Here is an
R
function for computing values of a PDF that is specified by its characteristic function. It will work for any numerically well-behaved CF (especially one that decays rapidly).As an example of its use, here is how the black graphs in the figure were computed.
The graph is constructed by connecting all these $(x,y)$ values.
This calculation for $101$ values of $|x|$ between $0$ and $15$ takes about one second. It is massively parallelizable.
For more accuracy, increase the
subdivisions
argument--but expect the computation time to increase proportionally. (The figure usedsubdivisions=1e4
.)