You mention two problems: (1) lack of convergence of the fitting procedure for truncated distributions; (2) choice of a suitable distribution for your data.
Regarding point 1: the lack of convergence arises in your case because the "true" solution of the maximum-likelihood optimization problem lies too far from the starting parameter values provided. You can see this by playing with the lower bound a of the truncation:
library(fitdistrplus)
library(truncnorm)

## select a few lower bounds
aa <- c(-10, -1, -0.5, -0.2, -0.15)
## fit by MLE for the various lower bounds
fits <- lapply(aa, function(a) {
  fitdist(testData, "truncnorm", fix.arg = list(a = a),
          start = list(mean = mean(testData), sd = sd(testData)))
})
## fit by L-BFGS-B for lower bound a=0
fit0.BFGS <- fitdist(testData, "truncnorm", fix.arg = list(a = 0),
                     start = list(mean = mean(testData), sd = sd(testData)),
                     optim.method = "L-BFGS-B", lower = c(0, 0))
## quantile-quantile plotting utility function
qqpl <- function(fit, lims = c(-0.8, 3.5), del = 0.05) {
  a <- fit$fix.arg[[1]]
  fitmean <- fit$estimate["mean"]
  fitsdev <- fit$estimate["sd"]
  probs <- seq(del, 1 - del, by = del)
  qempir <- quantile(testData, probs)
  qtheor <- qtruncnorm(probs, a = a, mean = fitmean, sd = fitsdev)
  tit <- paste0(c("a=", ": mean=", ", sd=", "; AIC="),
                round(c(a, fitmean, fitsdev, fit$aic), 3),
                collapse = "")
  if (!is.null(fit$dots)) tit <- paste0(tit, " (method=", fit$dots$optim.method, ")")
  plot(qempir, qtheor, xlim = lims, ylim = lims,
       xlab = "Empirical", ylab = "Theoretical", main = tit)
  abline(a = 0, b = 1, lty = 2)
}
par(mfrow=c(2,3))
lapply(fits, qqpl)
qqpl(fit0.BFGS)
The MLE can be made to converge by choosing a large negative a: in this case, a positive mean is estimated for the truncated normal. As a increases towards zero, the estimated mean becomes increasingly negative and the estimated standard deviation grows: in other words, one obtains truncated normal distributions that are increasingly "heavily truncated", in the sense that the truncation occurs at increasingly extreme upper quantiles of the underlying distribution.
However, both the quantile-quantile plots and the AIC indicate that these increasingly heavily truncated normal distributions actually fit the data better and better. I have observed this phenomenon before when fitting truncated distributions to other datasets: to better adjust to the data, the truncated distributions can get "stretched out".
One possibility to address problem 1 (lack of convergence) is to use L-BFGS-B with a=0; however, this solution does not look particularly good, either according to the quantile-quantile plot or to the AIC, which is only slightly smaller than that of the MLE fit for a=-0.15.
Another trick is to provide a "smart" starting point: for example, you can still use MLE for a=0 but supply the estimated parameter values obtained for a different value of a, for example for a=-0.2:
fit0.start <- fitdist(testData, "truncnorm", fix.arg = list(a = 0),
                      start = as.list(fits[[4]]$estimate))
This often fixes the convergence problems. However, to address point 2 (which is the most important, IMHO), you should try several other distributions and see which one fits best. As discussed above, the previous attempts yield heavily truncated normal distributions; this suggests trying a generalized Pareto distribution (GPD), which is commonly used to model upper tails:
library(fExtremes)
fit <- gpdFit(testData, u = 0)
## plot(fit, which = 4)
par(mfrow = c(1, 2))
lims <- c(0, 3.5)
qqpl(fit0.start, lims = lims)
del <- 0.05
probs <- seq(del, 1 - del, by = del)
qempir <- quantile(testData, probs)
param <- attr(fit, "fit")$par.ests
qtheor <- qgpd(probs, mu = 0, xi = param["xi"], beta = param["beta"])
tit <- paste0("GPD, AIC=", round(4 - 2 * attr(fit, "fit")$llh, 3))
plot(qempir, qtheor, xlim = lims, ylim = lims,
     xlab = "Empirical", ylab = "Theoretical", main = tit)
abline(a = 0, b = 1, lty = 2)
The GPD actually provides a MUCH better fit than the normal truncated at zero. The fitted tail parameter is not large (0.35), suggesting that decent results might also be obtained with a Weibull or a lognormal distribution. One last remark: your data looks a bit "suspicious"!
plot(testData)
The "funnelling" for the second half of the dataset might indicate heterogeneity (i.e., the data comes from different generating processes). This might suggest a mixture model, but I would not attempt that without knowing more about the data (collection, generating process, etc.).
Best Answer
The Weibull distribution has two parameters, the scale $\lambda$ and shape $k$ (I'm following Wikipedia's notation). Both parameters are positive real numbers.
The function fitdist from the fitdistrplus package uses the optim function to find the maximum likelihood estimates of the parameters. By default, optim imposes no constraints on the parameters and tries out negative numbers as well. But negative values for the scale or shape produce NaNs for the Weibull distribution. By using the options lower and upper, you can impose limits on the parameter search space for optim. The gamma distribution also has two parameters and, as with the Weibull distribution, both are positive, so the same limits lower = c(0, 0) can be used for the gamma distribution.
Edit
Here is a small comparison of the Weibull and gamma fits for the posted data. The errors for the gamma distribution arise because of bad starting values; I provide them manually, and then the fit works fine without errors.
Plot the fit for the Weibull:
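A minimal sketch of this step; since the posted data is not reproduced here, `x` below is simulated stand-in data:

```r
library(fitdistrplus)

set.seed(42)
x <- rweibull(200, shape = 1.5, scale = 1)  # stand-in for the posted data

## Weibull fit, restricting the parameter search space to positive values
fitW <- fitdist(x, "weibull", lower = c(0, 0))
plot(fitW)  # density, CDF, Q-Q and P-P diagnostic panels
```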
And for the gamma distribution:
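A corresponding sketch for the gamma fit, again with simulated stand-in data; the manual starting values shown are simple method-of-moments guesses, which is one way to avoid the bad-start errors mentioned above:

```r
library(fitdistrplus)

set.seed(42)
x <- rgamma(200, shape = 2, rate = 1)  # stand-in for the posted data

## manual (method-of-moments) starting values avoid the convergence errors
startG <- list(shape = mean(x)^2 / var(x), rate = mean(x) / var(x))
fitG <- fitdist(x, "gamma", start = startG, lower = c(0, 0))
plot(fitG)
```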
They are practically indistinguishable. The AICs are virtually the same for both fits:
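The AIC comparison might be done along these lines (self-contained, with simulated stand-in data; on the actual posted data the two values come out virtually identical):

```r
library(fitdistrplus)

set.seed(42)
x <- rweibull(200, shape = 1.5, scale = 1)  # stand-in for the posted data

fitW <- fitdist(x, "weibull", lower = c(0, 0))
startG <- list(shape = mean(x)^2 / var(x), rate = mean(x) / var(x))
fitG <- fitdist(x, "gamma", start = startG, lower = c(0, 0))

## side-by-side AIC comparison of the two fits
c(Weibull = fitW$aic, gamma = fitG$aic)
```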