Density Function – Probability and Cumulative Density Function for Inequality of Two Random Variables

analyticaldensity functioninverse-gaussian-distributionmaximum likelihood

I have two random variables X and Y which came from different inverse gaussian (IG) distributions:
$$ X \sim IG(\mu_1,\lambda_1)$$
$$ Y \sim IG(\mu_2,\lambda_2)$$

I need to find a probability of X>Y. It seems, that i need to find a pdf and cdf for variable Z=X-Y

$$ p(X-Y|\mu_1,\mu_2,\lambda_1,\lambda_2)$$

Is it possible to find and analytical solution for this problem? I will need these functions for further MLE estimation.

Best Answer

As mentioned by @whuber, you're better off attempting to obtain the distribution of $X/Y$ than $X - Y$. Because both $X$ and $Y$ are supported on the positive reals one can write $Pr(X>Y)$ as $Pr(X/Y > 1)=$. So this will involve obtaining the cdf of $X/Y$.

Using Mathematica (and Maple and MATLAB and others can certainly do the same) I get the following for the pdf:

dist = TransformedDistribution[x/y, 
 {x \[Distributed] InverseGaussianDistribution[n1, s1], 
  y \[Distributed] InverseGaussianDistribution[m2, s2]}];
pdf = Simplify[PDF[dist, t], Assumptions -> t > 0] 

pdf of X/Y

I don't believe that there's a nice closed form for the cdf so you'll need use numerical integration to get the cdf. Assuming you might be doing this in R:

# Define pdf
  pdf <- function(t, m1, s1, m2, s2) {
    (exp(s1/m1 + s2/m2)*
     sqrt((s1*s2*(m1^2*s2 + m2^2*s1*t))/(s1 + s2*t))*
     besselK(sqrt(((m1^2*s2 + m2^2*s1*t)*(s1 + s2*t))/
     t)/(m1*m2),1))/(m1*m2*pi*t)            
  }
  
# Define cdf
  cdf <- function(t0, m1, s1, m2, s2) {
    1 - integrate(pdf, 0, t0, m1=m1, s1=s1, m2=m2, s2=s2)$value  
  }
  
# Pr(X > Y)
  1 - cdf(1, 1, 2, 2, 3)
# [1] 0.7420883

A partial simplification

We have $X \sim IG(\mu_1, \lambda_1)$ and $Y\sim IG(\mu_2,\lambda_2)$ and want to determine $Pr(X>Y)$. Note that we can write $X=\lambda_1 X’$ where $X’\sim IG(\mu_1/\lambda_1, 1)$ and $Y=\lambda_2 Y’$ where $Y’\sim IG(\mu_2/\lambda_2,1)$. We have

$$Pr(X>Y)=Pr(\lambda_1 X’ > \lambda_2 Y’)=Pr\left(\frac{X’}{Y’}>\frac{\lambda_2}{\lambda_1}\right)$$

If we let $\rho_1=\frac{\mu_1}{\lambda_1}$ and $\rho_2=\frac{\mu_2}{\lambda_2}$, then the pdf for $X’/Y’$ is given by

$$\frac{e^{\frac{1}{\rho_1}+\frac{1}{\rho_2}} \sqrt{\frac{\rho_1^2+\rho_2^2 t}{t+1}} K_1\left(\frac{\sqrt{\frac{(t+1) \left(\rho_1^2+t \rho_2^2\right)}{t}}}{\rho_1 \rho_2}\right)}{\pi \rho_1 \rho_2 t}$$

There doesn’t appear to be a closed form for the cdf so numerical integration will need to be used. One would need to evaluate 1 minus the cdf evaluated at $\lambda_2/\lambda_1$ to find $Pr(X>Y)$. So with this simplification to achieve the objective (finding $Pr(X>Y)$), we only need 3 parameters rather than 4: $\mu_1/\lambda_1$, $\mu_2/\lambda_2$, and $\lambda_2/\lambda_1$. However, given the likely need to estimate all 4 parameters ($\mu_1$, $\lambda_1$, $\mu_2$, and $\lambda_2$) there isn’t much of a savings.

Related Question