Bayesian – Using Partial Information About Distribution for Better Estimation

Tags: bayesian, estimation, normal-distribution

I learned statistics in college, but only at the undergraduate level. I find it hard to apply what I learned to real-life situations, which are unlike self-contained problem sets.

Here's what I'm trying to do:

I want to estimate the population's distribution from some pieces of information I gathered, which are not sufficient to determine the distribution exactly.

I took a test and I have some information like:

  • There were 59 students ($n = 59$).

  • The mean is 75.

  • The top score was 96.

  • The lowest score was 48.

  • There are 13 students in the range $60 \le x < 75$.

  • There are 3 students under 60 ($x < 60$).

If I assume a normal distribution for the population, I already have the mean, so I now need to know the standard deviation of the population to get the distribution.

However, instead of the standard deviation, I have this bunch of partial information that seems useful, but I don't know how to actually use it to calculate the population distribution.

Is there a standardized way to update my knowledge of the population based on this limited partial information? I feel like this should be related to Bayesian updating, but it seems totally different from what I read in the textbook.

Best Answer

  1. A standard approach to a question like this, where there's a distributional model*, would be to attempt to write down the joint distribution of the available statistics under the model, and then use standard inferential theory to obtain estimates of the parameters that use all the available information in those statistics.

    I'm ignoring the sample proportion of values below the mean; the mean and the normal assumption give us all the information about the population proportion there.

    However, since the maximum likelihood estimate of $\mu$ is just the mean, we might seek to simplify the problem somewhat by using that and removing it from the quantities we seek to estimate. If we replace one parameter by its estimate ($\mu$ by $\hat{\mu}$ in this case), we have a problem of estimating the remaining ones ($\sigma$, or perhaps $\sigma^2$ if that turns out to be more convenient) from the available statistics, essentially a profile likelihood approach.

    Now that this is a univariate problem, we can compute the likelihood directly over a grid of values of $\sigma$ (in this case we have an obvious upper bound for $\sigma$, since the response is bounded, and can obtain an initial guess from the proportion of scores below 60) and identify where the maximum is, then refine with a finer grid or an optimizer once the vicinity of the optimum is identified. A sketch of such a grid search is given after this list.

    Note that with the extrema identified, knowing the mean tells us the sum of the remaining order statistics.

  2. If the assumption were dubious in the far tail (which affects the distribution of the extreme order statistics) but of less concern near the middle, one might drop the minimum and maximum and estimate $\sigma$ purely from the proportion of values below 60 (which, given the normality assumption and independence, reduces to a binomial calculation).

    This yields $\hat{\mu}=75$ and $\hat{\sigma}=9.165$ (from solving $\Phi\left(\frac{60-75}{\hat{\sigma}}\right)=\frac{3}{59}$), but the latter estimate would be subject to a large uncertainty; the calculation is reproduced in a short snippet after this list.

    Even where the more complicated approach was attempted, this would be a suitable way to get a starting estimate for $\sigma$, as mentioned above.

  3. We could also try to bound the variance without the normality assumption, which may sometimes be useful; one such bound is worked below.
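
For concreteness, here is a minimal sketch of the grid search in point 1, in Python with NumPy/SciPy (my choice of tooling, not part of the original question). It plugs in $\hat{\mu} = 75$ and writes down the joint likelihood of the minimum, the maximum, and the bin counts: the two extremes contribute density terms, and the remaining 57 observations fall into the bins $(48, 60)$, $[60, 75)$ and $[75, 96)$ with counts 2, 13 and 42. Note that the sample mean enters only through the plug-in, so this is an approximation that discards some information.

```python
import numpy as np
from scipy.stats import norm

n = 59
mu_hat = 75.0        # plug-in estimate of mu (the reported mean)
lo, hi = 48.0, 96.0  # observed minimum and maximum

def loglik(sigma):
    """Log-likelihood, up to constants, of (min, max, bin counts)
    under i.i.d. N(mu_hat, sigma^2) with mu fixed at its estimate."""
    f = norm(mu_hat, sigma)
    # density contributions of the two extreme order statistics
    ll = f.logpdf(lo) + f.logpdf(hi)
    # the other 57 scores: 2 more in (48, 60), 13 in [60, 75), 42 in [75, 96)
    q1 = f.cdf(60) - f.cdf(lo)
    q2 = f.cdf(75) - f.cdf(60)
    q3 = f.cdf(hi) - f.cdf(75)
    return ll + 2 * np.log(q1) + 13 * np.log(q2) + 42 * np.log(q3)

# coarse grid; 24 (half the range) is a crude upper bound for sigma, see point 3
sigmas = np.linspace(5, 24, 400)
lls = np.array([loglik(s) for s in sigmas])
print("sigma-hat ~=", sigmas[np.argmax(lls)])
```

A finer grid, or `scipy.optimize.minimize_scalar` restricted to the vicinity of the grid maximum, would then polish the estimate.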
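
The closed-form estimate in point 2 is a one-liner (again a SciPy-based sketch):

```python
from scipy.stats import norm

# solve  Phi((60 - 75) / sigma) = 3/59  for sigma
sigma_hat = (75 - 60) / -norm.ppf(3 / 59)
print(sigma_hat)  # ~ 9.165, matching the figure quoted above
```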
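
As one concrete illustration of point 3 (my choice of bound, not necessarily the one intended): if the observed extremes are treated as the endpoints of the support, the Bhatia–Davis inequality gives $\sigma^2 \le (M - \mu)(\mu - m) = (96 - 75)(75 - 48) = 567$, so $\sigma \le 23.8$, while the cruder Popoviciu inequality gives $\sigma^2 \le (M - m)^2 / 4 = 576$, i.e. $\sigma \le 24$, the "obvious upper bound" used for the grid above.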

* Taking the model as a given for now, and presuming a preference for a frequentist-style approach: for all that you have tagged it "bayesian", you don't present it as a Bayesian question (there's no hint of priors on the parameters, for example). Consequently I take the mention of Bayesian updating as simply describing your own impression, not a specific request.