Solved – Self-study: Finding the maximum likelihood estimates of the parameters of a density function – UPDATED

Tags: estimation, estimators, expectation-maximization, maximum-likelihood, self-study


I am trying to find the maximum likelihood estimates of the parameters of the probability density function given below:

\begin{equation}
g(x)=\frac{1}{\Gamma \left( \alpha \right)\gamma^{2\alpha}2^{\alpha-1}}x^{2\alpha-1}\exp\left\{{-\frac{x^2}{2\gamma^{2}}}\right\}I_{{\rm I\!R}^{+}}(x)
\end{equation}

where $\alpha >0$ is the shape parameter and $\gamma >0$ is the scale parameter.

The likelihood function is then given by

\begin{equation}
L(\alpha,\gamma\mid\mathbf{x})=\prod\limits_{i=1}^{n}\frac{1}{\Gamma \left( \alpha \right)\gamma^{2\alpha}2^{\alpha-1}}x_i^{2\alpha-1}\exp\left\{{-\frac{x_i^2}{2\gamma^{2}}}\right\}
\end{equation}

Thus, after expanding the product, the likelihood function becomes
\begin{equation}
L(\alpha,\gamma\mid\mathbf{x})=\frac{1}{[\Gamma \left( \alpha \right)]^{n}\gamma^{2\alpha n}{2^{n\alpha-n}}} \exp\left\{{-\frac{1}{2\gamma^{2}}\sum\limits_{i=1}^{n}x_{i}^{2}}\right\}\left(\prod\limits_{i=1}^{n}x_{i}\right)^{2\alpha-1}
\end{equation}

Now, the log-likelihood function denoted by $\ell$ is

\begin{equation}
\begin{aligned}
\ell &=\log[L(\alpha,\gamma\mid\mathbf{x})]\\
&=-n\log(\Gamma \left( \alpha \right))-2\alpha n \log(\gamma)-n(\alpha-1)\log(2)-\frac{1}{2\gamma^{2}}\sum\limits_{i=1}^{n}x_{i}^{2}+(2\alpha-1)\sum\limits_{i=1}^{n}\log(x_{i})
\end{aligned}
\end{equation}

The entries of the score function are given by

\begin{equation}
\begin{aligned}
\frac{\partial \ell}{\partial \alpha}=-n\psi(\alpha)-2n\log(\gamma)-n\log(2)+2\sum\limits_{i=1}^{n}\log(x_{i})
\end{aligned}
\end{equation}
where $\psi(\alpha)$ is the digamma function and

\begin{equation}
\begin{aligned}
\frac{\partial \ell}{\partial \gamma}=-\frac{2\alpha n}{\gamma}+\frac{\sum\limits_{i=1}^{n}x_{i}^{2}}{\gamma^{3}}
\end{aligned}
\end{equation}

Setting these two equations to zero and solving them simultaneously yields the maximum likelihood estimates (MLEs) of the parameters, $\hat{\alpha}$ and $\hat{\gamma}$. However, the equations obtained by setting the above partial derivatives to zero have no closed-form solution, so the values of $\alpha$ and $\gamma$ must be found by iterative methods.
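One simplification worth noting: $\partial\ell/\partial\gamma=0$ gives $\hat{\gamma}^2=\sum_i x_i^2/(2n\hat{\alpha})$ in closed form, and substituting this into $\partial\ell/\partial\alpha=0$ leaves a single equation, $\log\alpha-\psi(\alpha)=\log\left(\tfrac{1}{n}\sum_i x_i^2\right)-\tfrac{2}{n}\sum_i\log x_i$, which a bracketing root-finder solves reliably. A minimal sketch in Python rather than MATLAB (not the poster's code; the sampling step uses the fact that $X^2\sim\mathcal{G}(\alpha,2\gamma^2)$ under this density, and all parameter values are illustrative):

```python
import numpy as np
from scipy.special import psi
from scipy.optimize import brentq

# hypothetical example values; under the density above, x^2 ~ Gamma(alpha, scale=2*gamma^2)
rng = np.random.default_rng(0)
alpha_true, gamma_true, n = 3.0, 2.0, 100_000
x = np.sqrt(rng.gamma(alpha_true, 2 * gamma_true**2, size=n))

# profile equation in alpha alone: log(alpha) - psi(alpha) = log(mean(x^2)) - (2/n)*sum(log x)
rhs = np.log(np.mean(x**2)) - 2 * np.mean(np.log(x))   # nonnegative by Jensen's inequality
alpha_hat = brentq(lambda a: np.log(a) - psi(a) - rhs, 1e-8, 1e8)
gamma_hat = np.sqrt(np.sum(x**2) / (2 * n * alpha_hat))  # closed form given alpha_hat

print(alpha_hat, gamma_hat)
```

Because $\log\alpha-\psi(\alpha)$ is strictly decreasing from $+\infty$ to $0$ on $(0,\infty)$ and the right-hand side is nonnegative, the bracketing root-finder cannot step outside the parameter space.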

The Fisher information matrix is defined as $I_{ij}=-E\left[\frac{\partial^{2} \ell}{\partial \tau_i \partial \tau_j}\right]$, where $\tau_1=\alpha$ and $\tau_2=\gamma$. Thus, the information matrix for the gamma-Rayleigh distribution is given by

\begin{equation}
I=n \left[ \begin{array}{cc}
\psi_{1}(\alpha) & 2/\gamma\\
2/\gamma & 4\alpha/\gamma^2
\end{array} \right]
\end{equation}
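These entries can be sanity-checked numerically: differentiating the per-observation score by finite differences, with $\sum_i x_i^2$ replaced by its expectation $2n\alpha\gamma^2$ (which follows from $X^2\sim\mathcal{G}(\alpha,2\gamma^2)$), should reproduce $\psi_1(\alpha)$, $2/\gamma$, and $4\alpha/\gamma^2$. A small Python sketch (illustrative values only):

```python
import numpy as np
from scipy.special import psi, polygamma

# check the information-matrix entries by finite differences of the expected
# per-observation score, at hypothetical values alpha=3, gamma=2
a, g, eps = 3.0, 2.0, 1e-6
Ex2 = 2 * a * g**2  # E[X^2] = 2*alpha*gamma^2 since X^2 ~ Gamma(alpha, 2*gamma^2)

# per-observation score with sum(x_i^2) replaced by its expectation;
# the data term 2*mean(log x) is a constant and drops out of every derivative
s_alpha = lambda aa, gg: -psi(aa) - 2 * np.log(gg) - np.log(2)
s_gamma = lambda aa, gg: -2 * aa / gg + Ex2 / gg**3

I11 = -(s_alpha(a + eps, g) - s_alpha(a - eps, g)) / (2 * eps)  # should be psi_1(alpha)
I12 = -(s_alpha(a, g + eps) - s_alpha(a, g - eps)) / (2 * eps)  # should be 2/gamma
I22 = -(s_gamma(a, g + eps) - s_gamma(a, g - eps)) / (2 * eps)  # should be 4*alpha/gamma^2

print(I11, polygamma(1, a))  # both approximately 0.3949
```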

I am trying to use Fisher scoring to find the MLEs of the parameters. Here is my MATLAB code: I first generate 1000 random observations from a gamma distribution and then run the iteration. My starting values and the rest are given in the code.

clear all;
clc;

%Simulate 1000 samples from the gamma distribution
n=1000;
alpha=3;
lambda=0.05;
x=gamrnd(alpha,1/lambda,1,n);

figure(1)
histfit(x,8,'gamma');

sumlogx=sum(log(x)); sumxsquare=sum(x.^2);

%Initial Values
alpha=mean(x)^2/var(x);
gam=mean(x)/var(x);
theta=[alpha; gam];
S=Inf;

while sum(abs(S) > 10^(-5)) > 0
    S=[-n*psi(theta(1))-2*n*log(theta(2))-n*log(2)+2*sumlogx;...
        (-2*theta(1)*n/theta(2))+(sumxsquare/(theta(2)^3))];
    FIM=n*[psi(1, theta(1)), 2/theta(2);...
        2/theta(2), 4*theta(1)/(theta(2)^2)];
    theta=theta + FIM\S;
end

alpha_hat=theta(1)
gam_hat=theta(2)

fprintf('alpha_hat=%g, gamma_hat=%g \n', theta(1),theta(2))

But for some reason I cannot figure out, I am getting the error "Error using psi: X must be nonnegative." Somehow my $\alpha$ values become negative during the iteration, and I do not know how to fix it!

I am also running Newton-Raphson whose MATLAB code is given below

clear all;
clc;

%Simulate 1000 samples from the gamma distribution
n=1000;
alpha=3;
lambda=0.05;
x=gamrnd(alpha,1/lambda,1,n);

figure(1)
histfit(x,8,'gamma');

sumlogx=sum(log(x)); sumxsquare=sum(x.^2);

%tuning parameters scale=gamma; shape=alpha
itermin=10^-7;
maxiter=10^7;
sc_init=0.000001;
sh_init=0.000001;
converged=[0;0;sc_init;sh_init];

% pdf
pdf=@(x,gam,alpha) 1/(gamma(alpha)*(gam^(2*alpha))*(2^(alpha-1)))*(x^(2*alpha-1))*exp(-(x^2)/(2*(gam^2)));

%score function is the first partial derivative of the log likelihood function
score=@(gam,alpha) -n*psi(alpha)-2*n*log(gam)-n*log(2)+2*sumlogx;

%Hessian function is the negative of the 2nd partial derivative of the log-likelihood
hessian=@(gam,alpha) psi(1, alpha);

sc_loop=2; 
scale_hat=zeros(1,maxiter); 
scale_hat(1)=sc_init;

while 1==1
sh_loop=2;
shape_hat=zeros(1,maxiter);
shape_hat(1)=sh_init;

while 1==1
%calculate chat as chat_prev+score(chat_prev)/hessian(chat_prev)
shape_hat(sh_loop)=shape_hat(sh_loop-1)+score(scale_hat(sc_loop-1),shape_hat(sh_loop-1))/hessian(scale_hat(sc_loop-1),shape_hat(sh_loop-1));
%test for a convergence
if abs(shape_hat(sh_loop)-shape_hat(sh_loop-1))<itermin
    break %the process converged to a c value
elseif sh_loop>maxiter
    disp(['max iteration on \alpha achieved:', num2str(maxiter)]);
    return
end
sh_loop=sh_loop+1;
end

scale_hat(sc_loop)=(sum(x.^shape_hat(sh_loop-1))/n)^(1/shape_hat(sh_loop-1));
 %test for a convergence
  if abs(scale_hat(sc_loop)-scale_hat(sc_loop-1))<itermin
        break %the process converged to a gamma value
  end

  converged=[converged,[sc_loop-1;sh_loop-1;scale_hat(sc_loop);shape_hat(sh_loop)]];
  sc_loop=sc_loop+1;
end

%final display
disp(repmat('-',[1,30])),disp(' Iteration Scale Shape'),disp(repmat('-',[1,30]))
disp(num2str(converged','%6.4f')),disp(repmat('-',[1,30]))
disp(['Real values: lambda=', num2str(lambda),', alpha=',num2str(alpha)])

I am getting the same "Error using psi: X must be nonnegative." error.

Could you help me with this? Something is wrong with the psi calls and I do not know what. Maybe I should use an approximation, but I am not sure how much information I would lose!
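On the approximation idea: the digamma function has a standard asymptotic expansion, $\psi(x)\approx\log x-\frac{1}{2x}-\frac{1}{12x^2}+\frac{1}{120x^4}$, and combining it with the recurrence $\psi(x)=\psi(x+1)-1/x$ keeps the error negligible for all positive $x$, so very little accuracy is lost; note, however, that an approximation would not by itself prevent negative arguments. A quick Python check against SciPy's psi:

```python
import math
from scipy.special import psi

def psi_approx(x):
    """Asymptotic expansion of digamma; uses the recurrence to shift x above 6."""
    acc = 0.0
    while x < 6.0:           # psi(x) = psi(x + 1) - 1/x
        acc -= 1.0 / x
        x += 1.0
    return acc + math.log(x) - 1/(2*x) - 1/(12*x**2) + 1/(120*x**4)

errs = [abs(psi_approx(v) - psi(v)) for v in (0.5, 1.0, 2.0, 5.0, 10.0)]
print(max(errs))  # on the order of 1e-7 at worst
```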

Best Answer

[Note: This is my answer to the Dec. 19, 2014, version of the question.]

If you apply the change of variable $y=x^2$ in your density $$f_X(x|\alpha,\beta,\sigma)=\frac{1}{\Gamma \left( \alpha \right)\beta^{\alpha}}\exp\left\{{-\frac{x^2}{2\sigma^{2}}\frac{1}{\beta}}\right\}\frac{x^{2\alpha-1}}{2^{\alpha-1}\sigma^{2\alpha}}\mathbb{I}_{{\mathbb{R}}^{+}}(x) $$ the Jacobian is given by $\dfrac{\text{d}y}{\text{d}x}= 2x = 2y^{1/2}$ and hence \begin{align*} f_Y(y|\alpha,\beta,\sigma)&=\frac{1}{\Gamma \left( \alpha \right)\beta^{\alpha}}\exp\left\{{-\frac{y}{2\sigma^{2}}\frac{1}{\beta}}\right\}\frac{y^{\frac{2\alpha-1}{2}}}{2^{\alpha-1}\sigma^{2\alpha}}\frac{1}{2 y^{1/2}}\mathbb{I}_{{\mathbb{R}}^{+}}(y)\\ &=\frac{1}{\Gamma \left( \alpha \right)\beta^{\alpha}}\exp\left\{{-\frac{y}{2\sigma^{2}}\frac{1}{\beta}}\right\}\frac{y^{{\alpha-1}}}{2^{\alpha}\sigma^{2\alpha}}\mathbb{I}_{{\mathbb{R}}^{+}}(y) \end{align*} This shows that

  1. This is a standard $\mathcal{G}(\alpha,2\sigma^2\beta)$ model, i.e. you observe $$(x_1^2,\ldots,x_n^2)=(y_1,\ldots,y_n)\stackrel{\text{iid}}{\sim}\mathcal{G}(\alpha,\eta);$$
  2. the model is over-parametrised since only $\eta=2\sigma^2\beta$ can be identified;
  3. EM is not necessary to find the MLE of $(\alpha,\eta)$, which is not available in closed form but is the solution of $$\hat\eta=\bar{y}/\hat{\alpha}\qquad\log(\hat{\alpha})-\psi(\hat{\alpha})=\log(\bar{y})-\frac{1}{n}\sum_{i=1}^n\log(y_i)$$ where $\bar{y}=n^{-1}\sum_{i=1}^n y_i$ and $\psi(\cdot)$ is the digamma function. This paper by Thomas Minka gives fast approximations for solving the above equation.
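To illustrate the last point, the Minka-style closed-form approximation $\hat{\alpha}_0\approx\frac{3-s+\sqrt{(s-3)^2+24s}}{12s}$, with $s=\log\bar{y}-\frac{1}{n}\sum_i\log y_i$, lands very close to the exact root and makes a good starting value for any subsequent refinement. A Python sketch with hypothetical parameter values:

```python
import numpy as np

rng = np.random.default_rng(2)
alpha_true, eta_true = 3.0, 8.0          # hypothetical values; eta = 2*sigma^2*beta
y = rng.gamma(alpha_true, eta_true, size=100_000)  # y_i = x_i^2 ~ Gamma(alpha, eta)

# closed-form approximation to the root of log(a) - psi(a) = s
s = np.log(np.mean(y)) - np.mean(np.log(y))
alpha0 = (3 - s + np.sqrt((s - 3)**2 + 24 * s)) / (12 * s)
eta0 = np.mean(y) / alpha0               # the scale then follows in closed form

print(alpha0, eta0)
```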