Solved – Why does this data throw an error in R fitdistr

fitting, r

I'm trying to fit a Weibull distribution to this data but am running into problems, and I'm not sure why. What causes the NaNs?

temp <- dput(temp)
c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22, 759.46, 
1142.33, 134, 1232.23, 389.81, 7811.65, 992.11, 1152.4, 3139.01, 
2636.78, 3294.75, 2266.95, 32.12, 7356.84, 1448.54, 3606.82, 
465.39, 950.5, 3721.49, 522.01, 1548.62, 2196.3, 256.8, 2959.72, 
214.4, 134, 2307.79, 2112.74)


fitdist(temp, distr = "weibull", method = "mle")
Fitting of the distribution ' weibull ' by maximum likelihood 
Parameters:
          estimate  Std. Error
shape    0.8949019   0.1205351
scale 1803.8816283 357.9042207
Warning messages:
1: In dweibull(c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22,  :
  NaNs produced
2: In dweibull(c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22,  :
  NaNs produced
3: In dweibull(c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22,  :
  NaNs produced
4: In dweibull(c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22,  :
  NaNs produced
5: In dweibull(c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22,  :
  NaNs produced
6: In dweibull(c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22,  :
  NaNs produced
7: In dweibull(c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22,  :
  NaNs produced
8: In dweibull(c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22,  :
  NaNs produced
9: In dweibull(c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22,  :
  NaNs produced
10: In dweibull(c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22,  :
  NaNs produced

Best Answer

The Weibull distribution has two parameters, the scale $\lambda$ and shape $k$ (I'm following Wikipedia's notation). Both parameters are positive real numbers.
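
For reference, this is the Weibull density in that notation (the standard textbook form, written out here for convenience). It is only defined for $\lambda > 0$ and $k > 0$, which is why parameter values outside that range cannot be evaluated:

$$
f(x; \lambda, k) = \frac{k}{\lambda} \left( \frac{x}{\lambda} \right)^{k-1} e^{-(x/\lambda)^{k}}, \qquad x \ge 0 .
$$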

The function fitdist from the fitdistrplus package uses the optim function to find the maximum likelihood estimates of the parameters. By default, optim imposes no constraints on the parameters, so the search can step into negative values as well. A negative scale or shape makes dweibull return NaN, which is what triggers the warnings. With the options lower and upper, you can restrict the parameter search space for optim.
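
To make this concrete, here is a minimal base-R sketch of the same idea. It is not the internals of fitdist: the helper negloglik, the start values, the tiny positive lower bound (instead of exactly 0) and the parscale setting are all my own illustrative choices.

```r
# The data posted in the question
temp <- c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22, 759.46,
          1142.33, 134, 1232.23, 389.81, 7811.65, 992.11, 1152.4, 3139.01,
          2636.78, 3294.75, 2266.95, 32.12, 7356.84, 1448.54, 3606.82,
          465.39, 950.5, 3721.49, 522.01, 1548.62, 2196.3, 256.8, 2959.72,
          214.4, 134, 2307.79, 2112.74)

# Outside the parameter space dweibull() returns NaN (with a warning);
# the warning is suppressed here only to keep the output short.
bad <- suppressWarnings(dweibull(temp, shape = -1, scale = 1000))
print(head(bad))

# Negative log-likelihood of the Weibull; par = c(shape, scale)
negloglik <- function(par, x) {
  -sum(dweibull(x, shape = par[1], scale = par[2], log = TRUE))
}

# Box-constrained search: L-BFGS-B only evaluates points inside the bounds,
# so dweibull() is never called with a negative shape or scale.
fit <- optim(
  par     = c(shape = 1, scale = mean(temp)),
  fn      = negloglik,
  x       = temp,
  method  = "L-BFGS-B",
  lower   = c(1e-8, 1e-8),                 # assumed bound, just above 0
  control = list(parscale = c(1, 1000))    # shape and scale differ in magnitude
)
print(fit$par)
```

The estimates in fit$par should land close to the ones printed in the question (shape about 0.89, scale about 1804), this time without the NaN warnings.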

The gamma distribution also has two parameters and, as with the Weibull distribution, both are positive. So the same limits lower = c(0, 0) can be used for the gamma distribution.

Edit

Here is a small comparison of the Weibull and gamma fits for the posted data. The errors for the gamma distribution arise because of bad starting values. Supplying them manually makes the fit run without errors.
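
If you would rather not guess the starting values, one common alternative is to derive them from the sample moments. This is only a sketch of that option, not what I did in the code that follows; the name start.gamma is made up for the illustration.

```r
# The data posted in the question
temp <- c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22, 759.46,
          1142.33, 134, 1232.23, 389.81, 7811.65, 992.11, 1152.4, 3139.01,
          2636.78, 3294.75, 2266.95, 32.12, 7356.84, 1448.54, 3606.82,
          465.39, 950.5, 3721.49, 522.01, 1548.62, 2196.3, 256.8, 2959.72,
          214.4, 134, 2307.79, 2112.74)

m <- mean(temp)
v <- var(temp)

# Gamma moments: mean = shape * scale, variance = shape * scale^2.
# Solving these two equations gives the method-of-moments values.
start.gamma <- list(shape = m^2 / v, scale = v / m)
print(start.gamma)
```

A list like this can be passed to fitdist through its start argument in place of list(scale = 1, shape = 1).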

library(fitdistrplus)

temp <- c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22, 759.46, 
          1142.33, 134, 1232.23, 389.81, 7811.65, 992.11, 1152.4, 3139.01, 
          2636.78, 3294.75, 2266.95, 32.12, 7356.84, 1448.54, 3606.82, 
          465.39, 950.5, 3721.49, 522.01, 1548.62, 2196.3, 256.8, 2959.72, 
          214.4, 134, 2307.79, 2112.74)

fit.weibull <- fitdist(temp, distr = "weibull", method = "mle", lower = c(0, 0))
fit.gamma <- fitdist(temp, distr = "gamma", method = "mle", lower = c(0, 0), start = list(scale = 1, shape = 1))

Plot the fit for the Weibull:

plot(fit.weibull)

[Figure: Weibull fit]

And for the gamma distribution:

plot(fit.gamma)

[Figure: Gamma fit]

The two fits are practically indistinguishable, and the AICs are virtually the same:

gofstat(list(fit.weibull, fit.gamma))

Goodness-of-fit statistics
                             1-mle-weibull 2-mle-gamma
Kolmogorov-Smirnov statistic    0.07288424  0.07970184
Cramer-von Mises statistic      0.02532353  0.02361358
Anderson-Darling statistic      0.20489012  0.17609146

Goodness-of-fit criteria
                               1-mle-weibull 2-mle-gamma
Aikake's Information Criterion      601.7909    601.5659
Bayesian Information Criterion      604.9016    604.6766
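
As a quick sanity check on how the two criteria relate, the BIC row can be reproduced from the AIC row using the standard definitions AIC = 2k - 2 logLik and BIC = k log(n) - 2 logLik. The sketch below only uses the numbers printed above, with n = 35 observations and k = 2 parameters per distribution.

```r
n <- 35  # number of observations in temp
k <- 2   # estimated parameters per distribution

# AIC values copied from the gofstat output
aic <- c(weibull = 601.7909, gamma = 601.5659)

loglik <- (2 * k - aic) / 2        # invert AIC = 2k - 2 logLik
bic    <- k * log(n) - 2 * loglik  # BIC = k log(n) - 2 logLik

print(round(bic, 4))
```

Since both models have the same number of parameters, the AIC and BIC differences between them are identical and depend only on the difference in log-likelihood.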