What is the distribution of $X$ given $X-Y>0$?

conditional probability, normal distribution, probability, probability distributions, statistics

Given two independent normally distributed random variables $X\sim\mathcal N(\mu_X,\sigma_X)$ and $Y\sim\mathcal N(\mu_Y,\sigma_Y)$ (so $\rho_{XY}=0$), what is the distribution of $X$ given $X>Y$?

I have run several simulations in Matlab and $X|X>Y$ looks suspiciously normally distributed, but what are its mean and standard deviation?

Thank you for your help.

EDIT: I agree with comments below. Although it looks suspiciously normally distributed, it definitely isn't.

Best Answer

Instead of thinking through the inequality $X>Y$, see it as $Z = X-Y>0$. Since $X$ and $Y$ are both normal and independent, $Z$ is normal with mean $\mu_X-\mu_Y$ and variance $\sigma_X^2 + \sigma_Y^2$. Thus, you can calculate $$ \begin{aligned} P\{X\leq x | X>Y\} &= \frac{P\{X\leq x, X>Y\}}{P\{X>Y\}}\\ &= \frac{P\{Y<X\leq x\}}{P\{Z>0\}} \end{aligned} $$ I assume you can work out $P\{Z>0\}$ with ease, since it equals $1 - F_Z(0)$, where $F_Z$ is the distribution function of the normal r.v. $Z$. The hard part is $P\{Y<X\leq x\}$. You can draw the region in $\mathbb{R}^2$ to see how to integrate over it. I believe it is as follows (I write $z$ for the upper limit instead of $x$ so there is no confusion between the integrands and the limits): $$ \begin{aligned} P\{Y<X\leq z\} &= \int_{-\infty}^z \int_{-\infty}^x f_{X,Y}(x,y)\,dy\,dx \\ &= \int_{-\infty}^z \int_{-\infty}^x f_X(x) f_Y(y)\,dy\,dx\quad (\text{Independence})\\ &= \int_{-\infty}^z f_X(x) \int_{-\infty}^x f_Y(y)\,dy\,dx \\ &= \int_{-\infty}^z f_X(x) F_Y(x)\,dx \end{aligned} $$ I do not know whether this has a closed form, but you could at least obtain the density by differentiating with the Leibniz rule. As @fGDu94 noted, I don't think this distribution is "trivially" normal.
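The reduction above can be checked numerically. The following is only a rough sketch, not a definitive implementation: it assumes $\sigma_X$ and $\sigma_Y$ are standard deviations, approximates the integral $\int_{-\infty}^z f_X(x)F_Y(x)\,dx$ with a plain trapezoidal rule, and compares the result with a Monte Carlo estimate. All function names (`norm_pdf`, `norm_cdf`, `conditional_cdf`, `simulated_cdf`) are hypothetical helpers introduced for illustration.

```python
import math
import random


def norm_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2); sigma is assumed to be a standard deviation."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def norm_cdf(x, mu, sigma):
    """Distribution function of N(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def conditional_cdf(x, mu_x, sigma_x, mu_y, sigma_y, n=4000):
    """Approximate P(X <= x | X > Y) by trapezoidal integration of f_X(t) F_Y(t)."""
    lo = mu_x - 10.0 * sigma_x  # stands in for -infinity; f_X is negligible below it
    if x <= lo:
        return 0.0
    h = (x - lo) / n
    total = 0.0
    for i in range(n + 1):
        t = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * norm_pdf(t, mu_x, sigma_x) * norm_cdf(t, mu_y, sigma_y)
    numerator = total * h
    # P(Z > 0) for Z = X - Y ~ N(mu_x - mu_y, sigma_x^2 + sigma_y^2)
    denominator = norm_cdf(mu_x - mu_y, 0.0, math.sqrt(sigma_x**2 + sigma_y**2))
    return numerator / denominator


def simulated_cdf(x, mu_x, sigma_x, mu_y, sigma_y, n=200_000, seed=0):
    """Monte Carlo estimate of P(X <= x | X > Y): keep only draws with X > Y."""
    rng = random.Random(seed)
    kept = 0
    hits = 0
    for _ in range(n):
        xs = rng.gauss(mu_x, sigma_x)
        ys = rng.gauss(mu_y, sigma_y)
        if xs > ys:
            kept += 1
            if xs <= x:
                hits += 1
    return hits / kept


if __name__ == "__main__":
    params = (1.0, 2.0, 0.5, 1.5)  # arbitrary illustrative values
    print(conditional_cdf(1.0, *params), simulated_cdf(1.0, *params))
```

The two printed numbers should agree up to Monte Carlo noise. The trapezoidal rule was chosen only to keep the sketch dependency-free; any quadrature routine would do.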

However, we can solve the simplest case, in which $X$ and $Y$ are both standard normal r.v.s. In that case, $$ \begin{aligned} P\{Y<X\leq z\} &= \int_{-\infty}^z f_X(x) F_Y(x)\,dx \\ &= \int_{-\infty}^z \phi(x) \Phi(x)\,dx \\ &= \int_0^{\Phi(z)} u\,du = 0.5\,\Phi(z)^2 \end{aligned} $$ where $\Phi$ is the distribution function of a standard normal and $\phi$ its density, and we used the change of variable $u = \Phi(x)$, so that $du = \phi(x)\,dx$. Moreover, $Z$ now has mean 0, so $P\{Z>0\} = 0.5$. Thus, $$ P\{X\leq x | X>Y\} = 0.5\,\Phi(x)^2/0.5 = \Phi(x)^2. $$ Differentiating with respect to $x$ gives the conditional density $f(x|X>Y) = 2\Phi(x) \phi(x)$. The image below shows a plot of the standard normal density $\phi(x)$ and $f(x|X>Y)$ as given above. You can see they look very similar, as you guessed through simulation. However, the r.v. $X|X>Y$ is not normal even under the simplest of assumptions.

[Figure: the standard normal density $\phi(x)$ plotted together with the conditional density $f(x|X>Y) = 2\Phi(x)\phi(x)$]
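As a sanity check on the standard normal case, here is a small simulation sketch. It relies on one fact that is not derived in the answer: the density $2\Phi(x)\phi(x)$ is that of the maximum of two independent standard normals, whose mean is $1/\sqrt{\pi}$ and whose variance is $1 - 1/\pi$. The helper names (`conditional_sample`, `std_normal_cdf`, `mean_and_sd`) are hypothetical and exist only for this illustration.

```python
import math
import random


def conditional_sample(n, seed=0):
    """Draw n pairs of independent standard normals and keep X whenever X > Y."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = rng.gauss(0.0, 1.0)
        if x > y:
            kept.append(x)
    return kept


def std_normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mean_and_sd(sample):
    """Sample mean and (population-style) standard deviation."""
    m = sum(sample) / len(sample)
    var = sum((v - m) ** 2 for v in sample) / len(sample)
    return m, math.sqrt(var)


if __name__ == "__main__":
    sample = conditional_sample(200_000)
    m, sd = mean_and_sd(sample)
    # Theoretical values for the density 2 * Phi(x) * phi(x)
    print("mean:", m, "vs", 1.0 / math.sqrt(math.pi))
    print("sd:  ", sd, "vs", math.sqrt(1.0 - 1.0 / math.pi))
    # Empirical CDF at a test point vs Phi(x)^2
    x0 = 0.5
    ecdf = sum(v <= x0 for v in sample) / len(sample)
    print("cdf: ", ecdf, "vs", std_normal_cdf(x0) ** 2)
```

The simulated mean is clearly positive and the standard deviation is below 1, so a histogram can look bell-shaped while the distribution is still a skewed, non-normal one.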
