Solved – Proof of invariance property of MLE

maximum likelihood

I am reading the proof of the invariance property of MLE from Casella and Berger.

In this proof we reparametrize by setting
$\eta = \tau(\theta)$.

There, the induced likelihood function is defined as:

$$ L_{1}^*(\eta|x) = \sup_{\{\theta : \tau(\theta) = \eta\}} L(\theta|x) \tag{1} $$

I have subscripted $L^*(\eta|x)$ with a 1 to distinguish the induced likelihood of $\eta$ from the likelihood of $\eta$, both of which are denoted by $L^*(\eta|x)$.

I am not sure why this is being done. (In what follows, $L^*$ is the likelihood of $\eta$.)
If $\theta_1$ and $\theta_2$ are such that $\tau(\theta_1) = \tau(\theta_2)$, then $L(\theta_1|x) = L^*(\eta = \tau(\theta_1)|x) = L^*(\eta = \tau(\theta_2)|x) = L(\theta_2|x)$, since $\tau(\theta_1) = \tau(\theta_2)$.

Hence there is no need for the supremum in (1).

What am I misunderstanding?

Best Answer

Perhaps the issues here are best understood in the context of an example. Suppose that we are interested in estimating the mean of a normal model with variance 1, i.e., we are considering models of the form $N(\theta,1)$. In this case, the likelihood for a single data point $x$ is, ignoring the normalizing constant, $L(\theta | x)=\exp(-(x-\theta)^2/2)$.

Suppose that we are actually interested in a function of the mean, call it $\eta=\tau(\theta)$. How should we define the likelihood $L(\eta|x)$? If $\tau$ is invertible, then we just define $L(\eta|x)$ to be $L(\theta=\tau^{-1}(\eta) | x)$, i.e., we set $\theta$ equal to the unique value corresponding to the chosen value of $\eta$. For example, if $\tau(\theta)=2\theta$, then $L(\eta | x):=L(\theta=\frac{\eta}{2} |x)$.

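What if $\tau$ is not invertible? Take, for example, $\tau(\theta)=\theta^2$. Should $L(\eta|x)$ be $L(\theta=+\sqrt{\eta} | x)$, or should it be defined as $L(\theta=-\sqrt{\eta} | x)$? These two values will usually be different, so the likelihood $L(\eta|x)$ is not well defined; this is the step in your argument that fails, because $L(\theta_1|x)$ and $L(\theta_2|x)$ need not be equal when $\tau(\theta_1)=\tau(\theta_2)$. Hence Casella and Berger define the induced likelihood, which resolves the ambiguity by taking the supremum of $L(\theta|x)$ over all $\theta$ with $\tau(\theta)=\eta$. With the chosen definition, it turns out that the invariance property (which is obvious when $\tau$ is invertible) still holds.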
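
The $\tau(\theta)=\theta^2$ example can be checked numerically. The following is only a minimal sketch: the function names, the observation $x=-1.5$, and the grid over $\eta$ are illustrative assumptions, not anything taken from Casella and Berger. It shows that the two candidate values of the likelihood disagree, and that maximizing the induced likelihood recovers $\tau(\hat\theta)$.

```python
import numpy as np

def likelihood(theta, x):
    # N(theta, 1) likelihood for a single observation x, constant dropped
    return np.exp(-(x - theta) ** 2 / 2)

def induced_likelihood(eta, x):
    # For tau(theta) = theta^2 the preimage of eta >= 0 is {+sqrt(eta), -sqrt(eta)},
    # so the supremum in (1) is just the larger of two likelihood values.
    root = np.sqrt(eta)
    return np.maximum(likelihood(root, x), likelihood(-root, x))

x = -1.5            # assumed observation, chosen only for illustration
theta_hat = x       # MLE of theta in the N(theta, 1) model with one data point

# The two preimages of eta = 2.25 give different likelihood values.
lik_plus = likelihood(+1.5, x)
lik_minus = likelihood(-1.5, x)
print(lik_plus, lik_minus)

# Maximize the induced likelihood over an (assumed) grid of eta values.
eta_grid = np.linspace(0.0, 9.0, 9001)
eta_hat = eta_grid[np.argmax(induced_likelihood(eta_grid, x))]
print(eta_hat, theta_hat ** 2)
```

Here `lik_plus` and `lik_minus` differ by orders of magnitude, which is why a plain "likelihood of $\eta$" is ambiguous, while the grid maximizer `eta_hat` agrees with `theta_hat ** 2` up to the grid spacing, as the invariance property predicts.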