The pdf of a multivariate normal distribution with high correlation values

correlation, density function

Does the pdf of an mvn variable even exist when there is high correlation?

I want to use an algorithm (actually it is the cross-entropy method for estimating a rare-event probability) that needs the pdf value of the mvn distribution, that is mvnpdf(x,mu,Sigma).

However, my Sigma is close to singular, meaning there is a high correlation between the variables in the vector, so it is of course difficult/impossible to find the inverse of Sigma. Is there any way to overcome this problem?

Is it not true that, for instance, a vector [a,b,c,d] with covariance matrix [1,1,0,0;1,1,0,0;0,0,1,1;0,0,1,1] (singular!) will behave in exactly the same way as the vector [a,c] with covariance matrix [1,0;0,1] (now non-singular!), while ignoring b and d? Does this mean that I can approximate the pdf value of [a,b,c,d] by the pdf of [a,c]?

Sorry if this has been answered before; I really have tried to search for it.

Thank you.

EDIT: I want to simulate samples from a multivariate normal distribution, $x \sim N(\mu,\Sigma)$, in order to find the probability of the event $\{r(x) < 1\}$, where $r(x)$ returns a positive real number based on the vector $x$. So I simulate $x_i \sim N(\mu,\Sigma)$, $i = 1, \ldots, M$, and I estimate the probability by the Monte Carlo estimate $p = (1/M)\sum_i I(r(x_i)<1)$. No problems so far. However, since $\{r(x_i)<1\}$ is a very rare event, I want to try importance sampling instead, simulating $x_i$ from the distribution $N(\mu_2,\Sigma_2)$, so I need the pdf values to compute the estimate $p = (1/M)\sum_i I(r(x_i)<1)\, pdf_1(x_i)/pdf_2(x_i)$, where $pdf_1$ is the density of $N(\mu,\Sigma)$ and $pdf_2$ is the density of $N(\mu_2,\Sigma_2)$.
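To make the importance sampling estimate concrete, here is a minimal sketch in plain Python. Everything specific in it is an assumption for illustration: the target is taken to be the reduced vector [a,c] with $\mu = 0$ and identity covariance, the proposal is $N(\mu_2, I)$ with a hypothetical mean `MU2 = (3, 3)`, and `r` is a made-up positive function with $r(x) < 1$ exactly when $a + c > 6$. The weight $pdf_1(x)/pdf_2(x)$ is computed on the log scale, which is usually the safer way to form a ratio of normal densities.

```python
import math
import random

# Hypothetical proposal mean mu_2 (an assumption, not from the question);
# both target and proposal use the identity covariance here.
MU2 = (3.0, 3.0)


def r(x):
    """Hypothetical positive function: r(x) < 1 exactly when x[0] + x[1] > 6."""
    return math.exp(6.0 - (x[0] + x[1]))


def log_weight(x):
    """log pdf_1(x) - log pdf_2(x) for pdf_1 = N(0, I) and pdf_2 = N(MU2, I).

    The normalising constants cancel because both covariances are equal,
    leaving sum_j (-mu2_j * x_j + mu2_j**2 / 2).
    """
    return sum(-m * xi + 0.5 * m * m for xi, m in zip(x, MU2))


def importance_sampling_estimate(M, seed=0):
    """Estimate P{r(x) < 1} under N(0, I) from M draws of N(MU2, I)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(M):
        x = [rng.gauss(m, 1.0) for m in MU2]  # draw x_i from the proposal
        if r(x) < 1.0:                        # indicator I(r(x_i) < 1)
            total += math.exp(log_weight(x))  # weight pdf_1(x_i) / pdf_2(x_i)
    return total / M


if __name__ == "__main__":
    # For this toy event the exact answer is known, since a + c ~ N(0, 2):
    # P{a + c > 6} = 0.5 * erfc(3).
    print("importance sampling:", importance_sampling_estimate(50_000, seed=1))
    print("exact              :", 0.5 * math.erfc(3.0))
```

With a general $\Sigma$ and $\Sigma_2$ the same structure applies, but the log weight then needs the full log-densities, including the log-determinant terms.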

Best Answer

In your example of $[a,b,c,d]$, $a$ and $b$ are the same random variable for all intents and purposes since they take on the same value with probability $1$. Similarly, $c$ and $d$ are the same random variable. The question is what you want to do with the multivariate distribution. If you want to calculate $P\{a \in A, b \in B, c\in C, d \in D\}$ where $A$, $B$, $C$, $D$ are (measurable) sets of real numbers, you simplify it to $P\{a \in (A\cap B), c \in (C\cap D)\}$ and calculate it from the bivariate density of $a$ and $c$ which in this instance is that of two independent standard normal random variables. In other words, you don't blindly apply the formulas for the multivariate normal density of four normal random variables; you think a bit and re-state the problem to be solved in a way that it can be solved.
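The reduction described above can be sketched in a few lines of Python for the special case where $A$, $B$, $C$, $D$ are intervals. The function names and the interval representation as `(low, high)` tuples are assumptions made for this illustration; the only facts used are that $b = a$ and $d = c$ with probability $1$, and that $a$ and $c$ are independent standard normal random variables.

```python
import math


def std_normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def intersect(i1, i2):
    """Intersection of two intervals (low, high); None if it is empty."""
    low, high = max(i1[0], i2[0]), min(i1[1], i2[1])
    return (low, high) if low < high else None


def prob_abcd(A, B, C, D):
    """P{a in A, b in B, c in C, d in D} for the singular example.

    Because b = a and d = c almost surely, this equals
    P{a in A ∩ B} * P{c in C ∩ D} with a, c independent N(0, 1).
    """
    ab, cd = intersect(A, B), intersect(C, D)
    if ab is None or cd is None:
        return 0.0
    p_a = std_normal_cdf(ab[1]) - std_normal_cdf(ab[0])
    p_c = std_normal_cdf(cd[1]) - std_normal_cdf(cd[0])
    return p_a * p_c


if __name__ == "__main__":
    inf = math.inf
    # a <= 0 and b >= -1 collapse to -1 <= a <= 0; c and d are unrestricted.
    print(prob_abcd((-inf, 0.0), (-1.0, inf), (-inf, inf), (-inf, inf)))
```

No four-dimensional density or inverse of the singular covariance matrix is ever evaluated; the computation happens entirely in the two coordinates that carry the randomness.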
