Estimating distribution parameters from few data points

Tags: bayesian, estimation, normal distribution, uncertainty

Say I'm doing stats on the height of adults from various countries.

I assume the heights of adults from one country are normally distributed, and ignore sex differences (I also ignore the fact that neighbouring countries tend to have similar populations).

I have a bunch of data by country, but for some countries I have very few data points, which can lead to quite big errors in my estimates of the standard deviations.

Is there a way to use the data from countries for which I have a lot of data to get better estimates? Say I notice that the standard deviation in those countries is always between 7 and 8.5 cm, but my dataset for Nepal (for which I have 9 samples) has a standard deviation of 9.5 cm: I should probably correct that downwards. But how? Is there a formula for this?

When calculating the parameters for Nepal, shouldn't the data from the other countries allow me to have a "prior" distribution of expected means and deviations, which I would then update by taking the actual data from Nepal into account? How would I formalize this methodology?

(I got this while looking for a simple reduction of the problem that prompted a previous question, which hasn't been answered yet; I'm still mostly looking for good methodologies for thinking about this kind of problem.)

Best Answer

The correct answer here is hierarchical modeling (also called multilevel modeling). What you want to do is have the variance parameters drawn from a common prior distribution whose parameters are also estimated. Something like:

CountryVariance_i ~ D(Location, Scale)

Location ~ D_2(LocationPriorParams)

Scale ~ D_3(ScalePriorParams)

where D, D_2, D_3 are whatever distributions you would like.

This formalizes the notion that the variance parameters should be similar, and will do the correct kind of shrinkage. I believe that Gelman's books (Bayesian Data Analysis and Data Analysis Using Regression and Multilevel/Hierarchical Models) talk quite a bit about this kind of thing (though perhaps not a lot for variance parameters).
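To make the shrinkage concrete, here is a minimal sketch of one simplified special case, not the full hierarchical model. It assumes D is a scaled inverse-chi-square distribution and that its two parameters are fixed by hand rather than estimated: a prior variance of 7.75² (the midpoint of the 7 to 8.5 cm range in the question) and a prior weight of 20 degrees of freedom. Both values, and the function name `shrink_variance`, are illustrative assumptions. Under that prior the update is conjugate, and the shrunk variance is a degrees-of-freedom-weighted average of the prior variance and the sample variance.

```python
import math


def shrink_variance(sample_var, n, prior_var, prior_df):
    """Conjugate update for a normal variance with unknown mean.

    Assumes a scaled inverse-chi-square prior on the variance with
    `prior_df` degrees of freedom and scale `prior_var` (hypothetical
    hyperparameters, fixed here instead of being estimated).
    Returns the posterior scale and posterior degrees of freedom.
    """
    df = n - 1  # degrees of freedom carried by the sample variance
    post_df = prior_df + df
    # Weighted average: each source counts in proportion to its df.
    post_var = (prior_df * prior_var + df * sample_var) / post_df
    return post_var, post_df


# Numbers from the question: 9 samples from Nepal with SD 9.5 cm.
# The prior SD of 7.75 cm and prior_df of 20 are assumptions.
nepal_var, nepal_df = shrink_variance(
    sample_var=9.5 ** 2, n=9, prior_var=7.75 ** 2, prior_df=20.0
)
nepal_sd = math.sqrt(nepal_var)
print(round(nepal_sd, 2))  # 8.29
```

With these assumed hyperparameters, the estimate for Nepal moves from 9.5 cm down to about 8.3 cm. A larger `prior_df` pulls it closer to 7.75 cm, and more samples from Nepal pull it back toward the observed value. In a full hierarchical model the prior scale and degrees of freedom would themselves be inferred from all countries' data, for example with a probabilistic programming tool.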

I don't have a great introduction on hand, so I recommend looking at the Wikipedia page, searching for "hierarchical modeling" or similar, and clicking through the various introductions (there are many). Look for graphs, as these are helpful for understanding hierarchical models.
