Solved – Comparing approaches of MLE estimates of a Weibull distribution

fittingmaximum likelihoodweibull distribution

I need to parameterize a Weibull distribution to some data. Therefore, I use the Maximum-Likelihood-Estimation (MLE) from the fitdistrplus package in R. However, I wanted to understand what is done in the package, so besides using the package I tried two manual solutions to check the MLE estiamtes given by fitdist.

Summarizing, my approaches are:

(i) Use the fitdist function with method "MLE"

(ii) Solve the partial derivatives of the likelihood function

(iii) Minimize the negative likelihood using the optim function

First, simulate some data:

n <- 1e4    
set.seed(1) 
dat <- rweibull(n, shape=0.8, scale=1.2)

Approach 1: Apply the fitdistrplus package:

library(fitdistrplus)
A1 <- fitdist(dat, "weibull", method="mle")$estimate
A1
    shape     scale 
0.7914886 1.2032989 

Approach 2:

Having as Weibull density

,

the partial derivatives are:

Search for the roots of the partial derivatives above:

weib1 <- function(c) { 1/c - sum(dat^c*log(dat))/sum(dat^c) + 1/n*sum(log(dat)) }
shape <- uniroot(weib1, c(0,10), tol=1e-12)$root  
scale <- (1/n*sum(dat^shape))^(1/shape)
A2    <- c(shape, scale)
A2
[1] 0.7914318 1.2033179

Approach 3: Search for the parameters that minimize the negative log-likelihood:

fobj <- function(params){
  -sum(log(dweibull(dat, params[1], params[2])))
}
A3 <- optim(c(0.5, 1), fobj)$par
A3
[1] 0.7913756 1.2032748

Comparing the approaches, the parameter estimates (A1,A2,A3) differ in the fourth decimal place. Considering the fitdist documentation, I would have expected that A1 and A3 yield the same estimates, as both use optim.

Hence my questions are:

What is the objective function that is used by fitdist and how could I change approach 3 to yield exactly the same estimates as fitdist? And in general, what would be the preferred approach, I assume that solving the partial derivatives is the cleanest approach?

Best Answer

Weibull parameter estimation is typically done with gradient-descent-related algorithms. As far as I know most packages implements this by doing a location-scale transformation and then running the procedure on the resulting Gumbel-log-likelihood.

Check related

Related Question