Solved – Expected value of maximum likelihood coin parameter estimate

expected value, maximum likelihood, probability, self-study, variance

Suppose I have a coin toss experiment in which I want to calculate the maximum likelihood estimate of the coin parameter $p$ when tossing the coin $n$ times. After setting the derivative of the binomial likelihood function $ L(p) = { n \choose x } p^x (1-p)^{n-x} $ equal to zero, I get the optimal value for $p$ to be $p^{*} = \frac{x}{n}$, with $x$ being the number of successes.
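For concreteness, a quick numerical check (a minimal sketch using SciPy, with made-up values for $n$ and $x$) confirms that the maximizer of the likelihood is indeed $x/n$:

```python
# Hypothetical example values: 37 successes in 100 tosses.
from scipy.optimize import minimize_scalar
from scipy.stats import binom

n, x = 100, 37

# Negative binomial log-likelihood as a function of p
neg_log_lik = lambda p: -binom.logpmf(x, n, p)

res = minimize_scalar(neg_log_lik, bounds=(1e-9, 1 - 1e-9), method="bounded")
print(res.x)   # numerical maximizer, approximately 0.37
print(x / n)   # closed-form estimate p* = x/n
```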

My questions now are:

  • How would I calculate the expected value/variance of this maximum likelihood estimate for $p$?
  • Do I need to calculate the expected value/variance for $L(p^{*})$?
  • If yes, how would I do that?

Best Answer

First of all, this is a self-study question, so I'm not going to go too much into each and every little technical detail, nor am I going on a derivation frenzy. There are many ways to do this; I'll help you by using general properties of the maximum likelihood estimator.

Background information

In order to solve your problem I think you need to study maximum likelihood from the beginning. You are probably using some kind of textbook, and the answer should really be in there somewhere. I'll help you figure out what to look for.

The maximum likelihood estimator is basically what we call an M-estimator (think of the "M" as "maximize/minimize"). If the conditions required for using these methods are satisfied, we can show that the parameter estimates are consistent and asymptotically normally distributed, so we have:

$$ \sqrt{N}(\hat\theta-\theta_0)\overset{d}{\to}\text{Normal}(0,A_0^{-1}B_0A_0^{-1}), $$

where $A_0$ and $B_0$ are some matrices. When using maximum likelihood we can show that $A_0=B_0$, and thus we have the simpler expression
$$
\sqrt{N}(\hat\theta-\theta_0)\overset{d}{\to}\text{Normal}(0,A_0^{-1}).
$$
We have that $A_0\equiv -E(H(\theta_0))$, where $H$ denotes the Hessian. This is what you need to estimate in order to get your variance.
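To make the asymptotic statement concrete, here is a small simulation sketch (my addition, not part of the original answer; the true $p_0$, the number of tosses $n$, and the number of replications are made-up values). It checks that the MLE $\hat p = x/n$ is centred on the true parameter and that its spread matches what the theory predicts:

```python
# Illustrative simulation (assumed values p0 = 0.3, n = 200, 10,000 replications).
# Each replication is one coin-toss experiment; we compute the MLE x/n in each.
import numpy as np

rng = np.random.default_rng(0)
p0, n, reps = 0.3, 200, 10_000

x = rng.binomial(n, p0, size=reps)   # number of successes per experiment
p_hat = x / n                        # MLE in each replication

print(p_hat.mean())  # close to p0: the estimator is consistent
print(p_hat.var())   # compare with the asymptotic variance derived in the next section
```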

Your specific problem

So how do we do it? Here, let's call our parameter (what was $\theta$ above) the same thing you do: $p$. It is just a scalar, so the "score" is simply the first derivative and the "Hessian" is simply the second derivative. Dropping the binomial coefficient, which does not depend on $p$, our likelihood function can be written as
$$
l(p)=p^x (1-p)^{n-x},
$$
which is what we want to maximize. You used the first derivative of this (or of the log-likelihood) to find your $p^*$. Beyond setting the first derivative equal to zero, we can differentiate once more to find the second-order derivative $H(p)$.

First we take logs:
$$
ll(p)\equiv\log(l(p))=x\log(p)+(n-x)\log(1-p).
$$
Then our 'score' is
$$
ll'(p)=\frac{x}{p}-\frac{n-x}{1-p},
$$
and our 'Hessian' is
$$
H(p)=ll''(p)=-\frac{x}{p^2}-\frac{n-x}{(1-p)^2}.
$$

The general theory from above then tells you to find $(-E(H(p)))^{-1}$. So take the expectation of $H(p)$ (hint: use $E(x/n)=p$, i.e. $E(x)=np$), multiply by $-1$ and take the inverse. That gives you the variance of the estimator.
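If you want to check that last step mechanically rather than by hand, here is a minimal symbolic sketch (my addition, using SymPy; not part of the original answer). It builds the log-likelihood, takes the second derivative, substitutes $E(x)=np$, and inverts the negative of the result:

```python
# Symbolic check of the final step: variance = (-E(H(p)))^{-1}.
import sympy as sp

p, n, x = sp.symbols("p n x", positive=True)

ll = x * sp.log(p) + (n - x) * sp.log(1 - p)  # log-likelihood ll(p)
H = sp.diff(ll, p, 2)                         # 'Hessian' ll''(p)

EH = H.subs(x, n * p)                         # expectation: replace x by E(x) = n*p
variance = sp.simplify(-1 / EH)               # (-E(H(p)))^{-1}

print(variance)  # the asymptotic variance of the MLE p* = x/n
```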
