Solved – Expected value of small sample


I've got product ratings for a few thousand products. The number of ratings for each product varies from zero to about fifty. I want to find the expected value of product rating for each product. If there are lots of ratings for the product I'd expect the expected value to be the average of the ratings for the product, but if there are only a few I'd expect the expected value to be closer to the average of all ratings. How do I calculate the true expected value? Please be gentle: I'm no statistician or mathematician.

Edit 1: Joris's answer below maintains that I can't calculate the expected value, because by definition that would require the entire population. In that case, can you please tell me how to calculate the quantity that is similar to expected value in spirit, does not require the entire population, and can make use of prior information?

Edit 2: I would expect that if each product's ratings have low variance, or if there is very high variance between different products' ratings, then the measured ratings are more significant.

Best Answer

Incorporating a prior is one way to 'make up' for small samples. Another is to use a mixed model, with a fixed intercept for the overall mean and a random intercept for each product. The estimate of the population mean plus the predicted random effect (the BLUP, or best linear unbiased predictor) then offers a form of shrinkage: values for products with less information are shrunk more toward the overall sample mean than values for products with more information. This method is common in, for example, Small Area Estimation in survey sampling.
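To make the shrinkage concrete, here is the form the prediction takes for a random-intercept model when the variance components are treated as known. The notation is my own, not the answer's: $\bar{y}_i$ is the mean of the $n_i$ ratings for product $i$, $\hat{\mu}$ is the estimated overall mean, $\sigma_b^2$ is the variance between products, and $\sigma^2$ is the variance of ratings within a product.

```latex
\hat{y}_i = \hat{\mu} + w_i \left( \bar{y}_i - \hat{\mu} \right),
\qquad
w_i = \frac{n_i \, \sigma_b^2}{n_i \, \sigma_b^2 + \sigma^2}
```

With few ratings, $w_i$ is close to 0 and the estimate stays near the overall mean; with many ratings it approaches 1 and the estimate approaches the product's own average. A small within-product variance $\sigma^2$ or a large between-product variance $\sigma_b^2$ also pushes $w_i$ toward 1, which matches the intuition in Edit 2 of the question. In practice the variances are not known and are estimated from the data as part of fitting the model.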

Edit: The R code might look like:

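library(nlme)
# yourData: one row per rating, with columns score and product
f <- lme(score ~ 1, data = yourData, random = ~ 1 | product)
# fitted value for each row: overall mean plus that product's predicted random intercept
p <- predict(f)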

If you go this route the assumptions are:

  • independent, normal errors with expected value 0 and constant variance for all observations
  • normal random effects with expected value 0

Violations of these can generally be modeled, but of course with that comes added complexity...
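As a rough illustration of what the fit produces, the sketch below runs the same kind of model on made-up data and compares each product's raw average with its shrunken estimate. Everything about the data is an assumption for the example (30 products, the rating counts, the true means and the noise level), and the names `toy`, `n_ratings`, `raw` and `shrunk` are hypothetical.

```r
library(nlme)

set.seed(1)  # simulated data, for illustration only

# Assumed setup: 30 products with between 1 and 50 ratings each
n_ratings <- rep(c(1, 2, 5, 10, 25, 50), each = 5)
product <- factor(rep(seq_along(n_ratings), times = n_ratings))

# Assumed true product means and rating noise
true_mean <- rnorm(length(n_ratings), mean = 3.5, sd = 0.5)
score <- true_mean[as.integer(product)] + rnorm(length(product), sd = 1)
toy <- data.frame(product = product, score = score)

# Random-intercept model: overall mean plus one random effect per product
fit <- lme(score ~ 1, data = toy, random = ~ 1 | product)

# Estimated overall mean (the fixed intercept)
overall <- fixef(fit)[["(Intercept)"]]

# Plain per-product averages, in the order of the factor levels
raw <- tapply(toy$score, toy$product, mean)

# Per-product estimates: fixed intercept plus predicted random intercept
cf <- coef(fit)
shrunk <- setNames(cf[["(Intercept)"]], rownames(cf))[levels(toy$product)]

# Side-by-side view; products with few ratings move furthest toward `overall`
head(data.frame(n = n_ratings, raw = as.numeric(raw), shrunk = unname(shrunk)))
```

A product with no ratings never appears in the data used for fitting, so under this model its estimate is simply the overall mean, `fixef(fit)`.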
