I'm developing a website at the moment.
The website allows users to "rate" a post from 0 to 5.
Posts can then be displayed in order of popularity.
At the moment, my method of calculation is pretty primitive:
average_rating = total_rating/ratings
The problem is that a story with 1 rating of 5 ranks as more popular than a story with 99 ratings of 5 and 1 of 4:
(5/1) > (499/100)
Could someone suggest a more accurate way to calculate popularity, based on both the number of votes and the quality of each vote?
Best Answer
A standard procedure (frequently, and loosely, called a 'Bayesian average') is to take a weighted average of the individual rating and the 'a priori' rating:
$R_a = W \; R + (1 - W ) \; R_0$
where
$R_a = $ averaged ('bayesian') rating
$R = $ individual rating: average rating for this item.
$R_0 = $ a priori rating: global average rating, for all items in your database.
$W = $ weight factor: it should tend to $0$ if this item has few votes, and it should tend to $1$ if it has many.
Some choices: $W = \frac{n}{N_{max}}$, or $W = \min( \alpha \frac{n}{N_{av}},1)$, etc. ($n=$ number of votes for this item, $N_{max}=$ maximum number of votes over all items, $N_{av}=$ average number of votes per item, $\alpha=$ some number between 0.5 and 1...). The $\min$ caps the weight at $1$, so $W$ stays in $[0,1]$. Also, frequently one discards items that have very low or very high values when computing the statistics.
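As a rough illustration, here is a minimal Python sketch of the weighted average above, using the capped weight $W = \min(\alpha \frac{n}{N_{av}}, 1)$. The function names, the sample posts, and the choice $\alpha = 0.75$ are assumptions made for the example. So is taking $R_0$ as the pooled average of all votes; averaging the per-item averages would be another reasonable reading of "global average rating".

```python
def weight(n, n_avg, alpha=0.75):
    """Weight W = min(alpha * n / N_av, 1), always in [0, 1]."""
    if n_avg <= 0:
        return 0.0
    return min(alpha * n / n_avg, 1.0)


def bayesian_average(total_rating, n, prior, n_avg, alpha=0.75):
    """R_a = W * R + (1 - W) * R_0 for a single item."""
    if n == 0:
        return prior  # no votes yet: fall back to the a priori rating
    r = total_rating / n  # R: plain average rating of this item
    w = weight(n, n_avg, alpha)
    return w * r + (1 - w) * prior


# Hypothetical data: post name -> (sum of ratings, number of votes)
posts = {
    "single_five": (5, 1),      # one vote of 5
    "hundred_votes": (499, 100),  # 99 votes of 5 and one of 4
    "mediocre": (150, 50),      # 50 votes averaging 3
}

total_votes = sum(n for _, n in posts.values())
prior = sum(t for t, _ in posts.values()) / total_votes  # R_0 (assumed: pooled average)
n_avg = total_votes / len(posts)                          # N_av

scores = {
    name: bayesian_average(t, n, prior, n_avg)
    for name, (t, n) in posts.items()
}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

With this sample data the post with 100 votes ranks first, because its weight reaches $1$ and it keeps its own average, while the single 5-star vote is pulled most of the way toward the global average. Note that the effect depends on the prior: if every post in the database has a near-perfect average, $R_0$ is itself close to 5 and the correction becomes small.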
See some examples
Added: for another approach, especially for yes/no or like/dislike votes, see here.