Suppose we have a small set of numbers (5 to 10 observations), and we’re trying to fit a distribution to this set. Also, we know that all numbers are positive. I tried to fit a lognormal distribution, but I’m not sure how good my estimates are since the sample is very small; I’m also not sure whether it is enough to look at a goodness-of-fit test, given the small sample size.
Any suggestions on how to tackle this issue (i.e., how to be more confident about my estimates)?
Best Answer
I would not recommend using a goodness-of-fit test for such a small sample. For example, if you simulate 5 to 10 observations from a log-normal distribution, the Shapiro-Wilk normality test fails in the sense that the associated p-value is higher than $0.05$ more than $30\%$ of the time, so the test does not provide the desired power at that significance level. This is easy to check by simulation in R.
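The simulation can be sketched as follows. This is a minimal illustration, not the original code: it assumes the test is applied to the raw (untransformed) observations, and the parameters `meanlog = 0`, `sdlog = 1`, the seed, and the number of replications are arbitrary choices made here.

```r
# Assumed simulation settings (illustrative, not from the original answer)
set.seed(1)
n_sim <- 5000

# Proportion of simulated log-normal samples of size n for which the
# Shapiro-Wilk test does NOT reject normality of the raw data at the 5% level
prop_not_rejected <- function(n, meanlog = 0, sdlog = 1) {
  p_values <- replicate(
    n_sim,
    shapiro.test(rlnorm(n, meanlog = meanlog, sdlog = sdlog))$p.value
  )
  mean(p_values > 0.05)
}

res <- sapply(c(5, 10), prop_not_rejected)
names(res) <- c("n = 5", "n = 10")
print(res)
```

The larger these proportions, the less often the test detects that the raw data are not normal, which is the sense in which it lacks power for such small samples.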
You might consider Maximum Likelihood Estimation (MLE) and quantifying the accuracy of the estimation by constructing confidence-likelihood intervals for the parameters. One option consists of using the profile likelihood of the parameters $(\mu,\sigma)$.
In this case, the MLEs of $(\mu,\sigma)$ for a sample $(x_1,\dots,x_n)$ are
$$\hat\mu= \dfrac{1}{n}\sum_{j=1}^n\log(x_j);\,\,\,\hat\sigma^2=\dfrac{1}{n}\sum_{j=1}^n(\log(x_j)-\hat\mu)^2.$$
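In R, these estimators can be computed directly from the log-transformed data. The sample `x` below is hypothetical and only serves as an illustration; note that `sd()` divides by $n-1$, so the MLE of $\sigma$ is written out explicitly with divisor $n$.

```r
# Hypothetical sample of positive observations (illustration only)
x <- c(1.2, 0.7, 3.4, 2.1, 0.9, 5.6, 1.8)
n <- length(x)

# MLE of mu: mean of the log-observations
mu_hat <- mean(log(x))

# MLE of sigma: uses divisor n, unlike sd(), which uses n - 1
sigma_hat <- sqrt(mean((log(x) - mu_hat)^2))

print(c(mu_hat = mu_hat, sigma_hat = sigma_hat))
```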
Now, you can use the well-known result that a likelihood interval of level $0.147$ has an approximate confidence of $95\%$, since $\exp(-\chi^2_{1,0.95}/2)=\exp(-1.92)\approx 0.147$. These intervals for $\mu$ and $\sigma$ can be calculated numerically in R, and the profile likelihoods plotted for your sample.
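A minimal sketch of this calculation follows. It is not the original code: the sample `x`, the search ranges passed to `uniroot`, and all function names are assumptions made for illustration. It uses the relative profile likelihoods, which for the log-normal model have the closed forms $R_p(\mu)=\left(\hat\sigma/\hat\sigma(\mu)\right)^n$ with $\hat\sigma^2(\mu)=\frac{1}{n}\sum_{j=1}^n(\log(x_j)-\mu)^2$, and $R_p(\sigma)=\left(\hat\sigma/\sigma\right)^n\exp\left\{\frac{n}{2}\left(1-\hat\sigma^2/\sigma^2\right)\right\}$.

```r
# Hypothetical sample of positive observations (illustration only)
x <- c(1.2, 0.7, 3.4, 2.1, 0.9, 5.6, 1.8)
y <- log(x)
n <- length(y)

# Maximum likelihood estimates (divisor n for sigma)
mu_hat <- mean(y)
sigma_hat <- sqrt(mean((y - mu_hat)^2))

# Relative profile likelihood of mu (sigma maximised out); scalar argument
rp_mu <- function(mu) (sigma_hat / sqrt(mean((y - mu)^2)))^n

# Relative profile likelihood of sigma (mu maximised out at mu_hat); vectorised
rp_sigma <- function(sigma) {
  (sigma_hat / sigma)^n * exp(0.5 * n * (1 - sigma_hat^2 / sigma^2))
}

# Likelihood level with approximate 95% confidence (about 0.147)
level <- exp(-qchisq(0.95, df = 1) / 2)

# Find where the relative profile likelihood crosses `level` on each side of the MLE
likelihood_interval <- function(rp, mle, lower, upper) {
  f <- function(t) rp(t) - level
  c(lower = uniroot(f, c(lower, mle), tol = 1e-9)$root,
    upper = uniroot(f, c(mle, upper), tol = 1e-9)$root)
}

# Search ranges are assumptions, chosen wide enough to bracket both roots
ci_mu <- likelihood_interval(rp_mu, mu_hat,
                             mu_hat - 10 * sigma_hat, mu_hat + 10 * sigma_hat)
ci_sigma <- likelihood_interval(rp_sigma, sigma_hat,
                                sigma_hat / 10, sigma_hat * 10)
print(ci_mu)
print(ci_sigma)

# Plot both relative profile likelihoods with the 0.147 cut-off
par(mfrow = c(1, 2))
mu_grid <- seq(ci_mu[["lower"]] - 0.5, ci_mu[["upper"]] + 0.5, length.out = 200)
plot(mu_grid, sapply(mu_grid, rp_mu), type = "l",
     xlab = expression(mu), ylab = "Relative profile likelihood")
abline(h = level, lty = 2)
sigma_grid <- seq(0.8 * ci_sigma[["lower"]], 1.2 * ci_sigma[["upper"]],
                  length.out = 200)
plot(sigma_grid, rp_sigma(sigma_grid), type = "l",
     xlab = expression(sigma), ylab = "Relative profile likelihood")
abline(h = level, lty = 2)
```

With only 5 to 10 observations the intervals will typically be wide, and the interval for $\sigma$ is asymmetric around $\hat\sigma$; both features give an honest picture of how uncertain the estimates are.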
I hope this helps.