[Math] Compute probability of a particular ordering of normal random variables

combinatorics, integration, normal distribution, probability, probability distributions

There are $m$ normally distributed, independent random variables $N_1, \ldots, N_m$ with distinct means $\mu_1, \ldots, \mu_m$ and standard deviations $\sigma_1, \ldots, \sigma_m$. We are then given a permutation of the numbers $\{1, \ldots, m\}$. How can we efficiently compute, numerically, the (log) probability of observing the random variables in the same ordering as this permutation?

An example:

  1. We have four independent random variables $N_1, N_2, N_3, N_4$, all with different means and variances.
  2. We are given the permutation (3, 1, 2, 4).
  3. What's $\Pr(N_3 > N_1 > N_2 > N_4)$?

A closed-form solution is not necessary, but an efficient, accurate numerical algorithm is. It's probably also necessary to work with log probabilities, because when the number of variables becomes large, computing the actual probability will result in a floating-point underflow.

Some starting points, perhaps…

The most direct way to compute this value, using the example above, is evaluating one of the following integrals, which I believe are equivalent:

$$ \int_{-\infty}^\infty \int_{n_4}^\infty \int_{n_2}^\infty \int_{n_1}^\infty p(n_1)p(n_2)p(n_3)p(n_4)\ dn_3 dn_1 dn_2 dn_4 $$

$$ \int_{-\infty}^\infty \int_{-\infty}^{n_3} \int_{-\infty}^{n_1} \int_{-\infty}^{n_2} p(n_1)p(n_2)p(n_3)p(n_4)\ dn_4 dn_2 dn_1 dn_3 $$

where $p(n_i)$ is the density function of the variable $N_i$. However, when I tried to implement this numerically, it was inefficient, prone to inaccuracy, and ran into underflow errors when the number of variables got large. If you think you can compute this integral in an acceptable way, please do post your answer!
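As a sketch of this brute-force approach (the means and standard deviations below are made-up illustration values, and SciPy is assumed), the innermost integral over $n_3$ can be folded into a normal survival function, leaving a three-dimensional integral matching the first form above:

```python
import math
from scipy import integrate

mu = [0.0, 1.0, 2.0, -1.0]   # hypothetical means mu_1..mu_4
sd = [1.0, 0.5, 1.5, 2.0]    # hypothetical standard deviations

def phi(x, m, s):
    # Normal density, written with math.exp for speed in the inner loop.
    return math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def integrand(n1, n2, n4):
    # Pr(N_3 > n_1), i.e. the innermost integral over n_3, done analytically.
    sf3 = 0.5 * math.erfc((n1 - mu[2]) / (sd[2] * math.sqrt(2.0)))
    return phi(n1, mu[0], sd[0]) * phi(n2, mu[1], sd[1]) * phi(n4, mu[3], sd[3]) * sf3

# Integration order matches the first integral above: n_1 from n_2 upward,
# n_2 from n_4 upward, n_4 over (a truncation of) the real line.
prob, err = integrate.nquad(
    integrand,
    [lambda n2, n4: (n2, 12.0),   # limits for n_1
     lambda n4: (n4, 12.0),       # limits for n_2
     (-12.0, 12.0)])              # limits for n_4
print(prob)
```

Even with the dimension reduced by one, this adaptive quadrature becomes impractical quickly as $m$ grows, which is exactly the problem described above.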

From one of the answers below, we observe that it's possible to compute $\Pr(N_3 > N_1 > N_2 > N_4)$ directly by evaluating a multivariate normal CDF of dimension $(m-1)$, or 3 in this case. However, this is still nontrivial (though there may be libraries for it), and will underflow for many variables.
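A sketch of that MVN CDF approach, assuming SciPy is available (the means and standard deviations are illustrative only): the ordering event is equivalent to the positivity of $m-1$ differences, which are jointly normal.

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([0.0, 1.0, 2.0, -1.0])   # hypothetical means
sd = np.array([1.0, 0.5, 1.5, 2.0])    # hypothetical standard deviations

# Differences D = A @ N: rows encode N_3 - N_1, N_1 - N_2, N_2 - N_4.
A = np.array([[-1.0, 0.0, 1.0, 0.0],
              [1.0, -1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, -1.0]])
mean_D = A @ mu
cov_D = A @ np.diag(sd**2) @ A.T

# Pr(all D_i > 0) = Pr(-D <= 0): a single 3-dimensional MVN CDF evaluation.
p = multivariate_normal(mean=-mean_D, cov=cov_D).cdf(np.zeros(3))
print(p)
```

SciPy's `multivariate_normal.cdf` is based on Genz's randomized integration method, so the result carries a small numerical error; for larger $m$ a dedicated MVN CDF library may be preferable.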

Perhaps we can divide the probability up as follows:

$$\Pr(N_3 > N_1 > N_2 > N_4) = $$
$$\Pr(N_3 > N_1 \mid N_1 > N_2, N_2 > N_4 )\Pr(N_1 > N_2 \mid N_2 > N_4 )\Pr(N_2 > N_4)$$

Being able to compute the probabilities of each part directly would make it very easy to compute the log probability simply by adding. We can compute the conditional probabilities separately using the MVN CDF method, which could help if the product might underflow.
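One way to sketch this chain-rule decomposition (assuming SciPy; parameters are illustrative) is to note that each conditional factor is a ratio of orthant probabilities over successively longer chains of differences, so the log probability is a telescoping sum of log MVN CDF values:

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([0.0, 1.0, 2.0, -1.0])   # hypothetical means
sd = np.array([1.0, 0.5, 1.5, 2.0])    # hypothetical standard deviations

# Difference rows, innermost pair first: N_2 - N_4, N_1 - N_2, N_3 - N_1.
A = np.array([[0.0, 1.0, 0.0, -1.0],
              [1.0, -1.0, 0.0, 0.0],
              [-1.0, 0.0, 1.0, 0.0]])

def orthant_logprob(rows):
    # log Pr(all listed differences > 0), via one MVN CDF evaluation.
    m = rows @ mu
    c = rows @ np.diag(sd**2) @ rows.T
    return np.log(multivariate_normal(mean=-m, cov=c).cdf(np.zeros(len(rows))))

logp = 0.0
prev = 0.0  # log Pr of the empty set of constraints
for k in range(1, 4):
    cur = orthant_logprob(A[:k])
    logp += cur - prev  # log of the k-th conditional factor
    prev = cur
print(np.exp(logp))
```

The sum telescopes back to the full joint orthant log probability, which illustrates the catch: each conditional factor here still requires an MVN CDF of growing dimension, so the decomposition reorganizes the computation rather than shrinking it.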

Another observation: the $m!$ probabilities corresponding to the different permutations must sum to 1. Perhaps there is a way to compute the probabilities iteratively or with dynamic programming: e.g., $(N_2 > N_3)$, an ordering over a pair, has some fixed probability, which splits into three values according to the three possible places to insert $N_1$ into the ordering, each of which splits into four values according to the possible places to insert $N_4$. This is semantically equivalent to the conditional probabilities above, but it might be easier to think of it this way.
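The sum-to-one property gives a handy numerical sanity check. The sketch below (assuming SciPy; parameters are illustrative) computes the orthant probability of every ordering of four variables via the MVN CDF and checks that the total is close to 1:

```python
import numpy as np
from itertools import permutations
from scipy.stats import multivariate_normal

mu = np.array([0.0, 1.0, 2.0, -1.0])   # hypothetical means
sd = np.array([1.0, 0.5, 1.5, 2.0])    # hypothetical standard deviations

def ordering_prob(perm):
    # Pr(N_perm[0] > N_perm[1] > ... > N_perm[-1]) as an (m-1)-dim MVN CDF.
    rows = np.zeros((len(perm) - 1, len(perm)))
    for i, (hi, lo) in enumerate(zip(perm, perm[1:])):
        rows[i, hi], rows[i, lo] = 1.0, -1.0
    m = rows @ mu
    c = rows @ np.diag(sd**2) @ rows.T
    return multivariate_normal(mean=-m, cov=c).cdf(np.zeros(len(perm) - 1))

total = sum(ordering_prob(p) for p in permutations(range(4)))
print(total)  # should be close to 1, since ties have probability zero
```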

Any math wizards have suggestions on how to solve this problem? I would greatly appreciate any ideas!

Best Answer

To start from your example, note that the unconditional part is easy, since $N_2 - N_4$ is normal with mean $\mu_2 - \mu_4$ and variance $\sigma_2^2 + \sigma_4^2$. For the first conditional probability, you have $P(N_1 - N_2 > 0 \mid N_2 - N_4 > 0)$. Again, these are both normals whose covariance you can compute, and with that you should be able to compute the probability. As for the first term in your product, note that the event is independent of the second conditioning event, so you have a case like the first. I believe this carries over to larger $m$, and you can throw out most of the conditioning terms. Either way, you still just have normals whose covariances you can compute. And since conditioning on dependent normals works out to linear projections, you should be good. I think that answers it.
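The unconditional factor is a single univariate normal CDF evaluation. A minimal sketch, with illustrative parameter values:

```python
from math import sqrt
from scipy.stats import norm

mu2, mu4 = 1.0, -1.0   # hypothetical means of N_2 and N_4
sd2, sd4 = 0.5, 2.0    # hypothetical standard deviations

# N_2 - N_4 ~ Normal(mu2 - mu4, sd2^2 + sd4^2), so:
p = norm.cdf((mu2 - mu4) / sqrt(sd2**2 + sd4**2))  # Pr(N_2 > N_4)
print(p)
```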

EDIT: For a slightly clearer explanation, $N_1 - N_2$ and $N_2 - N_4$ can be thought of as $$ \begin{bmatrix}N_1 - N_2\\ N_2 - N_4 \end{bmatrix} = \begin{bmatrix}1 & -1 & 0 & 0 \\ 0 & 1 & 0 & -1 \end{bmatrix}\begin{bmatrix}N_1 \\ N_2 \\ N_3 \\ N_4 \end{bmatrix} $$ Note that we can do this for any number of differences, so computing the covariance matrix is no problem. Then, the conditional distribution of the first term given the rest is given here: https://en.wikipedia.org/wiki/Multivariate_normal_distribution#Conditional_distributions Ah, my good old friend the Schur complement, though I forget the proof off the top of my head...
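The linear map above can be written out with NumPy: the pair of differences has mean $A\mu$ and covariance $A \Sigma A^\top$, where $\Sigma$ is diagonal by independence. Parameter values here are illustrative.

```python
import numpy as np

mu = np.array([0.0, 1.0, 2.0, -1.0])   # hypothetical means
sd = np.array([1.0, 0.5, 1.5, 2.0])    # hypothetical standard deviations

A = np.array([[1.0, -1.0, 0.0, 0.0],    # N_1 - N_2
              [0.0, 1.0, 0.0, -1.0]])   # N_2 - N_4
Sigma = np.diag(sd**2)                  # diagonal, by independence

mean_diff = A @ mu          # [mu_1 - mu_2, mu_2 - mu_4]
cov_diff = A @ Sigma @ A.T  # off-diagonal entry is -sigma_2^2
print(mean_diff, cov_diff)
```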

EDIT2: Ah, I think I may have been a little sloppy. That formula gives the distribution conditional on the *value* of a random variable, not on an event like $Y > 0$. But I think you can still use the same principle, since $P(X>0 \mid Y>0) = \frac{P(X>0,\, Y>0)}{P(Y>0)}$, which you should be able to get from the joint distribution.
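A sketch of that ratio for the pair in the edit above, assuming SciPy (illustrative parameters): the numerator is a bivariate orthant probability and the denominator a univariate one.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

mu = np.array([0.0, 1.0, 2.0, -1.0])   # hypothetical means
sd = np.array([1.0, 0.5, 1.5, 2.0])    # hypothetical standard deviations

A = np.array([[1.0, -1.0, 0.0, 0.0],    # X = N_1 - N_2
              [0.0, 1.0, 0.0, -1.0]])   # Y = N_2 - N_4
m = A @ mu
c = A @ np.diag(sd**2) @ A.T

joint = multivariate_normal(mean=-m, cov=c).cdf(np.zeros(2))  # Pr(X>0, Y>0)
py = norm.sf(0.0, loc=m[1], scale=np.sqrt(c[1, 1]))           # Pr(Y>0)
cond = joint / py                                             # Pr(X>0 | Y>0)
print(cond)
```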

EDIT3: Since I don't yet have enough reputation to comment on leonbloy's concern, I will post it here: in that example, you have gone from a two-dimensional space to a three-dimensional one, so the transformation is rank-deficient and you get a degenerate covariance matrix in XYZ space.