Solved – Deriving the Maximum Likelihood Estimation (MLE) of a parameter for an Inverse Gaussian Distribution

estimation, mathematical-statistics, maximum-likelihood

Given the following likelihood function

$$f(y|x,\tau) = \prod_{i=1}^Nf_T(u_i-x_i-\tau) \tag{1}$$

where $f_T(t)$ is the probability density function of an Inverse Gaussian distribution, given by

$$f_T(t) = \sqrt\frac{\lambda}{2\pi t^3} \exp\Bigl(- \frac{\lambda (t-\mu)^2}{2\mu^2t}\Bigr)\tag{2}$$

The goal here is to determine the MLE of the parameter $\tau$:

$$ \hat{\tau}_{MLE} := \mathop{\mathrm{arg\,max}}\limits_\tau f(y|x,\tau) \tag{3}$$

According to the principle of MLE, substituting $(2)$ in $(1)$ we obtain the following:

\begin{align}L(\tau) & = \prod_{i=1}^N \sqrt\frac{\lambda}{2\pi (u_i-x_i-\tau)^3} \exp\Bigl(- \frac{\lambda (u_i-x_i-\tau-\mu)^2}{2\mu^2(u_i-x_i-\tau)}\Bigr) \\\\ & =\Bigl(\frac{\lambda}{2\pi }\Bigr)^{N/2} \prod_{i=1}^N(u_i-x_i-\tau)^{-3/2} \exp\Bigl(- \frac{\lambda }{2\mu^2} \sum_{i=1}^N \frac{(u_i-x_i-\tau-\mu)^2}{u_i-x_i-\tau}\Bigr) \tag{4}\end{align}

Taking the log, we get

\begin{align} \log L(\tau) & = \frac{N}{2} \log \Bigl(\frac{\lambda}{2\pi }\Bigr) - \frac{3}{2}\sum_{i=1}^N \log (u_i-x_i-\tau) - \frac{\lambda }{2\mu^2} \sum_{i=1}^N \frac{(u_i-x_i-\tau-\mu)^2}{u_i-x_i-\tau} \tag{5}\end{align}

Now taking the derivative w.r.t. $\tau$:

\begin{align} \frac{d(\log L(\tau))}{d\tau}& = 0 - \frac{3}{2}\sum_{i=1}^N \frac{1} {(u_i-x_i-\tau)}(-1) - \frac{\lambda }{2\mu^2} \sum_{i=1}^N \left(\frac{2(u_i-x_i-\tau-\mu)}{u_i-x_i-\tau}(-1) - \frac{(u_i-x_i-\tau-\mu)^2}{(u_i-x_i-\tau)^2}(-1) \right)\\\\ & =\frac{3}{2}\sum_{i=1}^N \frac{1} {(u_i-x_i-\tau)}- \frac{\lambda }{2\mu^2} \sum_{i=1}^N \left(\frac{-2(u_i-x_i-\tau-\mu)}{u_i-x_i-\tau} + \frac{(u_i-x_i-\tau-\mu)^2}{(u_i-x_i-\tau)^2}\right) \tag{6} \end{align}

Setting equation $(6)$ to $0$:

\begin{align}\frac{3}{2}\sum_{i=1}^N \frac{1} {(u_i-x_i-\tau)}- \frac{\lambda }{2\mu^2} \sum_{i=1}^N \left(\frac{-2(u_i-x_i-\tau-\mu)}{u_i-x_i-\tau} + \frac{(u_i-x_i-\tau-\mu)^2}{(u_i-x_i-\tau)^2} \right)= 0 \tag{7}\end{align}

Before getting to the problem at hand: are the derivations performed so far correct?

Here is the bottleneck:

How do I proceed from here? The second summation term has become very complicated and I can't figure out how to solve for $\tau$.

[UPDATE 2] As per the inputs from @gunes:

$\Rightarrow\frac{3}{2}\sum_{i=1}^N \frac{1} {(u_i-x_i-\tau)}- \frac{\lambda }{2\mu^2} \sum_{i=1}^N \left(-1+1^2 - 2\cdot 1\cdot\left(\frac{u_i-x_i-\tau-\mu}{u_i-x_i-\tau}\right) + \left(\frac{u_i-x_i-\tau-\mu}{u_i-x_i-\tau} \right)^2\right)= 0 $

$\Rightarrow\frac{3}{2}\sum_{i=1}^N \frac{1} {(u_i-x_i-\tau)}- \frac{\lambda }{2\mu^2} \sum_{i=1}^N \left(-1+\left(\frac{\require{cancel} \cancel{u_i}-\cancel{x_i}-\cancel{\tau} -\cancel{u_i}+\cancel{x_i}+\cancel{\tau}+\mu}{u_i-x_i-\tau} \right)^2\right)= 0 $

$\Rightarrow\frac{3}{2}\sum_{i=1}^N \frac{1} {(u_i-x_i-\tau)}- \frac{\lambda }{2\mu^2} \sum_{i=1}^N \left(-1+\left(\frac{\mu}{u_i-x_i-\tau} \right)^2\right)= 0 $

$\Rightarrow\frac{3}{2}\sum_{i=1}^N \frac{1} {(u_i-x_i-\tau)}+\frac{N\lambda}{2\mu^2 }- \frac{\lambda \require{cancel} \cancel{\mu^2} }{2\cancel{\mu^2}} \sum_{i=1}^N \frac{1}{(u_i-x_i-\tau)^2} = 0 $

$\Rightarrow\frac{3}{2}\sum_{i=1}^N \frac{1} {(u_i-x_i-\tau)}- \frac{\lambda }{2} \sum_{i=1}^N \frac{1}{(u_i-x_i-\tau)^2}= -\frac{N\lambda}{2\mu^2 } $

$\Rightarrow\sum_{i=1}^N \frac{3(u_i-x_i-\tau) - \lambda}{(u_i-x_i-\tau)^2} = -\frac{N\lambda}{\mu^2 } $

$\Rightarrow\sum_{i=1}^N \frac{ \lambda-3(u_i-x_i-\tau)}{(u_i-x_i-\tau)^2} = \frac{N\lambda}{\mu^2 } $
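With $\lambda$ and $\mu$ treated as known, the last equation is a one-dimensional root-finding problem in $\tau$ and can be handed to any standard solver. A minimal sketch with SciPy's Brent method; the data-generating values and array names here are hypothetical, chosen only to make the example self-contained:

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(0)

# Hypothetical data: u_i = x_i + tau + t_i with t_i ~ inverse Gaussian(mu, lam).
mu, lam, tau_true, N = 2.0, 3.0, 5.0, 200
x = rng.uniform(0.0, 1.0, N)
u = x + tau_true + rng.wald(mu, lam, N)  # NumPy's Wald is the inverse Gaussian

def score(tau):
    """Residual of sum_i (lam - 3 t_i)/t_i^2 = N*lam/mu^2, with t_i = u_i - x_i - tau."""
    t = u - x - tau
    return np.sum((lam - 3.0 * t) / t**2) - N * lam / mu**2

# The support requires tau < min(u_i - x_i); the residual diverges to +inf at that
# boundary and is negative for tau far below it, so a sign change brackets the root.
hi = np.min(u - x) - 1e-6
tau_hat = brentq(score, hi - 20.0, hi)
```

The same bracketing idea works with any scalar root-finder, since the score is cheap to evaluate.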

[UPDATE 3] As per the derivation provided by @Ben

$$1 + 3 H_{-1}(\tau)^2 H_1(\tau)^2 - 3 H_{-1}(\tau) H_1(\tau) - H_1(\tau)^2 H_{-2}(\tau) = 0.$$

As @Ben states below, we are left with the aforementioned equation, which is obviously not straightforward to solve for $\tau$.

We are now left with the following questions: How can we solve this numerically? Are there software packages that can perform this kind of numerical solution, or is it better to write one ourselves?

Best Answer

The full derivation of the MLEs for IID data from an inverse Gaussian distribution can be found in the answer to this related question. In your case you have added an additional layer of complication by having observable data values $t_i = u_i - x_i - \tau$ that depend on some conditioning covariates and an additional parameter. From this formulation, your sampling density is:

$$f(\mathbf{u} | \mathbf{x}, \tau, \mu, \lambda) = \prod_{i=1}^n \Big( \frac{\lambda}{2 \pi (u_i-x_i-\tau)^3} \Big)^{1/2} \exp \Big( - \sum_{i=1}^n \frac{\lambda (u_i-x_i-\tau - \mu)^2}{2 \mu^2 (u_i-x_i-\tau)} \Big)$$

over the support $\mathbf{u} \geqslant \mathbf{x} + \tau \mathbf{1}$. The log-likelihood function is defined over $\tau \leqslant \min (u_i-x_i)$ and is given over this range by:

$$\ell_{\mathbf{u},\mathbf{x}}(\tau, \mu, \lambda) = \text{const} + \frac{n}{2} \ln (\lambda) - \frac{3}{2} \sum_{i=1}^n \ln (u_i-x_i-\tau) - \frac{\lambda}{2 \mu^2 } \sum_{i=1}^n \frac{(u_i-x_i-\tau - \mu)^2}{(u_i-x_i-\tau)}.$$


Finding the MLE: To facilitate our analysis we define the functions:

$$H_k(\tau) \equiv \frac{1}{n} \sum_{i=1}^n (u_i-x_i-\tau)^k.$$

We then have:

$$\begin{equation} \begin{aligned} \frac{\partial \ell_{\mathbf{u},\mathbf{x}}}{\partial \tau}(\tau, \mu, \lambda) &= \frac{3}{2} \sum_{i=1}^n \frac{1}{u_i-x_i-\tau} + \frac{\lambda}{2 \mu^2 } \sum_{i=1}^n \frac{(u_i - x_i - \tau + \mu)(u_i-x_i-\tau - \mu)}{(u_i-x_i-\tau)^2} \\[10pt] &= \frac{3}{2} \sum_{i=1}^n \frac{1}{u_i-x_i-\tau} + \frac{\lambda}{2 \mu^2 } \sum_{i=1}^n \frac{(u_i - x_i - \tau)^2 - \mu^2}{(u_i-x_i-\tau)^2} \\[10pt] &= \frac{3}{2} \sum_{i=1}^n \frac{1}{u_i-x_i-\tau} + \frac{\lambda}{2 \mu^2 } \Big[ n - \mu^2 \sum_{i=1}^n \frac{1}{(u_i-x_i-\tau)^2} \Big] \\[10pt] &= \frac{3n}{2} H_{-1}(\tau) + \frac{n \lambda}{2 \mu^2 } \Big[ 1 - \mu^2 H_{-2}(\tau) \Big]. \\[10pt] \end{aligned} \end{equation}$$

Taking $\tau$ to be fixed for the moment, the MLEs of the remaining inverse Gaussian parameters are:

$$\hat{\mu}(\tau) = H_1(\tau) \quad \quad \quad \frac{1}{\hat{\lambda}(\tau)} = H_{-1}(\tau) - \frac{1}{H_1(\tau)}.$$
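These plug-in estimates are cheap to compute for any candidate $\tau$. A small sketch, with simulated data at assumed parameter values purely for illustration:

```python
import numpy as np

def H(k, tau, u, x):
    """H_k(tau) = (1/n) * sum_i (u_i - x_i - tau)^k."""
    return np.mean((u - x - tau) ** k)

def plug_in_mle(tau, u, x):
    """MLEs of mu and lambda with tau held fixed:
    mu_hat = H_1(tau), 1/lambda_hat = H_{-1}(tau) - 1/H_1(tau)."""
    mu_hat = H(1, tau, u, x)
    lam_hat = 1.0 / (H(-1, tau, u, x) - 1.0 / mu_hat)
    return mu_hat, lam_hat

# Hypothetical check: data generated at known parameter values.
rng = np.random.default_rng(0)
n, mu, lam, tau = 5000, 2.0, 3.0, 5.0
x = rng.uniform(0.0, 1.0, n)
u = x + tau + rng.wald(mu, lam, n)   # t_i ~ inverse Gaussian(mu, lam)
mu_hat, lam_hat = plug_in_mle(tau, u, x)
```

Note that $\hat{\lambda}(\tau)$ is always positive, since $H_{-1}(\tau) \geq 1/H_1(\tau)$ by the AM-HM inequality.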

Substituting these functions yields:

$$\begin{equation} \begin{aligned} \frac{\partial \ell_{\mathbf{u},\mathbf{x}}}{\partial \tau}(\tau, \hat{\mu}(\tau), \hat{\lambda}(\tau)) &= \frac{3n}{2} H_{-1}(\tau) + \frac{n}{2 H_1(\tau)^2 } \frac{1 - H_1(\tau)^2 H_{-2}(\tau)}{H_{-1}(\tau) - H_1(\tau)^{-1}} \\[10pt] &= \frac{n}{2} \cdot \frac{1}{H_1(\tau)^2} \Big[ 3 H_{-1}(\tau) H_1(\tau)^2 + \frac{1 - H_1(\tau)^2 H_{-2}(\tau)}{H_{-1}(\tau) - H_1(\tau)^{-1}} \Big]. \\[10pt] \end{aligned} \end{equation}$$

Setting this partial derivative to zero yields the critical point equation:

$$1 + 3 H_{-1}(\tau)^2 H_1(\tau)^2 - 3 H_{-1}(\tau) H_1(\tau) - H_1(\tau)^2 H_{-2}(\tau) = 0.$$

This critical point equation will need to be solved numerically, as there is no simple expression for the solution.
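On the numerical question raised above: no bespoke software is needed. One can apply any scalar root-finder to the critical point equation, or, often more robustly, directly maximize the profile log-likelihood $\ell_{\mathbf{u},\mathbf{x}}(\tau, \hat{\mu}(\tau), \hat{\lambda}(\tau))$ over $\tau < \min_i (u_i - x_i)$. A sketch using SciPy's bounded scalar minimizer; the simulated data and parameter values are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)

# Simulated data: u_i = x_i + tau + t_i with t_i ~ inverse Gaussian(mu, lam).
mu, lam, tau_true, n = 2.0, 3.0, 5.0, 500
x = rng.uniform(0.0, 1.0, n)
u = x + tau_true + rng.wald(mu, lam, n)

def neg_profile_loglik(tau):
    """Negative log-likelihood with mu and lambda profiled out for the given tau."""
    t = u - x - tau                        # all t_i > 0 on the feasible range
    mu_hat = np.mean(t)                    # H_1(tau)
    lam_hat = 1.0 / (np.mean(1.0 / t) - 1.0 / mu_hat)
    ll = (n / 2) * np.log(lam_hat) - 1.5 * np.sum(np.log(t)) \
         - lam_hat / (2 * mu_hat**2) * np.sum((t - mu_hat) ** 2 / t)
    return -ll

# tau is constrained to lie below min(u_i - x_i); the profile log-likelihood
# tends to -infinity at that boundary, so the maximum is interior.
upper = np.min(u - x) - 1e-8
res = minimize_scalar(neg_profile_loglik, bounds=(upper - 50.0, upper), method="bounded")
tau_hat = res.x
```

Generic one-dimensional optimizers and root-finders (SciPy's `optimize` module, or `optim`/`uniroot` in R) handle this class of problem directly; there is rarely a reason to write a solver from scratch.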
