Statistics – How to Calculate ‘Most Popular’ More Accurately

average, standard deviation, statistics

I'm developing a website at the moment.

The website allows users to "rate" a post from 0 to 5.

Posts can then be displayed in order of popularity.

At the moment, my method of calculation is pretty primitive:

average_rating = total_rating/ratings

The problem is that a story with a single rating of 5 ranks as more popular than a story with 99 ratings of 5 and 1 rating of 4:

(5/1) > (499/100)
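To make the comparison above concrete, here is a minimal Python sketch of the plain average applied to the two stories from the question. The function name `average_rating` and the list representation of the votes are assumptions made for illustration; they are not part of the site's actual code.

```python
def average_rating(ratings):
    """Plain arithmetic mean of a list of ratings (each from 0 to 5)."""
    return sum(ratings) / len(ratings)


one_vote = [5]               # a story with a single rating of 5
many_votes = [5] * 99 + [4]  # a story with 99 ratings of 5 and one rating of 4

print(average_rating(one_vote))    # 5.0
print(average_rating(many_votes))  # 4.99
```

The plain average ignores how many votes stand behind each figure, which is why the single-vote story comes out on top.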

Could someone suggest a more accurate way to calculate popularity, based on both the number of votes and the quality of each vote?

Best Answer

A standard procedure (frequently, and loosely, called a 'Bayesian average') is to take a weighted average of the individual rating and the 'a priori' rating:

$R_a = W \; R + (1 - W ) \; R_0$

where

$R_a = $ averaged ('bayesian') rating

$R = $ individual rating: average rating for this item.

$R_0 = $ a priori rating: global average rating, for all items in your database.

$W = $ weight factor: it should tend to $0$ if this item has few votes, and to $1$ if it has many.

Some choices: $W = \frac{n}{N_{max}}$, or $W = \min( \alpha \frac{n}{N_{av}},1)$, etc. ($n=$ number of votes for this item, $N_{max}=$ maximum number of votes over all items, $N_{av}=$ average number of votes per item, $\alpha=$ some number between 0.5 and 1...). The cap at $1$ keeps $W$ between $0$ and $1$. Also, one frequently discards items with extremely low or high values when computing these statistics.
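As a rough illustration of the weighted average $R_a = W \; R + (1 - W) \; R_0$, here is a small Python sketch. The function names, the global mean of 3.5, and the vote counts are all hypothetical values chosen for the example. The second weight function caps the weight at 1 with `min`, on the assumption that $W$ is meant to stay between 0 and 1.

```python
def weight_by_max(n, n_max):
    """W = n / N_max: fraction of the largest vote count seen on any item."""
    return n / n_max if n_max > 0 else 0.0


def weight_capped(n, n_avg, alpha=0.75):
    """W = min(alpha * n / N_av, 1): grows with n, capped at 1 (alpha is assumed)."""
    return min(alpha * n / n_avg, 1.0) if n_avg > 0 else 0.0


def bayesian_rating(item_mean, weight, global_mean):
    """R_a = W * R + (1 - W) * R_0."""
    return weight * item_mean + (1 - weight) * global_mean


# Hypothetical figures: global mean rating R_0 = 3.5, busiest item has 100 votes.
global_mean = 3.5
n_max = 100

single = bayesian_rating(5.0, weight_by_max(1, n_max), global_mean)
many = bayesian_rating(4.99, weight_by_max(100, n_max), global_mean)

print(single < many)  # True: the well-voted story now ranks higher
```

With these assumed numbers the single-vote story is pulled almost all the way back to the global mean, while the story with 100 votes keeps its own average.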

See some examples

Added: for another approach, especially for yes/no (like/dislike) votes, see here.

Related Solutions

[Math] How to weight votes based on number of possible voters

This question has been plaguing internet ranking sites for a while, e.g. reddit. Fortunately it has a nice theoretical solution.

It involves a bit of statistics; the Ruby code is:

require 'statistics2'

def ci_lower_bound(pos, n, confidence)
  if n == 0
    return 0
  end
  z = Statistics2.pnormaldist(1-(1-confidence)/2)
  phat = 1.0*pos/n
  (phat + z*z/(2*n) - z * Math.sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n)
end
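The expression in the Ruby function is the lower bound of the Wilson score interval for a binomial proportion, where `pos` is the number of positive votes and `n` the total number of votes. For readers without the `statistics2` gem, the following is a sketch of the same calculation in Python using only the standard library. The default confidence of 0.95 is an assumption added here, and `NormalDist().inv_cdf` stands in for `Statistics2.pnormaldist`.

```python
from math import sqrt
from statistics import NormalDist


def ci_lower_bound(pos, n, confidence=0.95):
    """Lower bound of the Wilson score interval for pos positive votes out of n."""
    if n == 0:
        return 0.0
    # z is the standard normal quantile for a two-sided interval.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    phat = pos / n  # observed fraction of positive votes
    centre = phat + z * z / (2 * n)
    margin = z * sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)
    return (centre - margin) / (1 + z * z / n)


# One positive vote out of one scores far lower than 99 out of 100.
print(ci_lower_bound(1, 1) < ci_lower_bound(99, 100))  # True
```

Sorting by this lower bound rewards items whose high share of positive votes is backed by many votes, rather than items with a perfect score from very few voters.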

How to add/subtract weight from ratings to get a weighted average rating

John Doe's comment was just what I needed, but since it's just a comment I can't choose it as the answer. Anyway, I wrote a brief function that gives more consistent results than my old code. It's also much simpler and shorter. numberOf is the number of songs with a rating, Rating is the rating, and bonus_penalty is how much I'm adding/subtracting based on distance from neutral_rating. I simply do this for each rating, add the results for each function call, and then divide by the number of songs.

I believe this is more-or-less what John Doe was proposing, and it appears to work as it should. Thanks!

Function ScoreForRating(numberOf As Integer, Rating As Integer, bonus_penalty As Double, neutral_rating As Integer)
    Dim adjustment As Double

    adjustment = 1 + numberOf * ((Rating - neutral_rating) * bonus_penalty)
    ScoreForRating = (Rating * numberOf) * adjustment

End Function
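The description above (call the function for each rating, add the results, then divide by the number of songs) can be sketched in Python as follows. This is only an illustrative port: the helper `weighted_average`, the dictionary of counts, and the sample values for `bonus_penalty` and `neutral_rating` are assumptions, not part of the original code.

```python
def score_for_rating(number_of, rating, bonus_penalty, neutral_rating):
    """Port of ScoreForRating: scale a rating's total by its distance from neutral."""
    adjustment = 1 + number_of * ((rating - neutral_rating) * bonus_penalty)
    return (rating * number_of) * adjustment


def weighted_average(counts, bonus_penalty, neutral_rating):
    """Sum the score of every rating, then divide by the number of rated songs.

    counts maps each rating to the number of songs that have it (hypothetical input).
    """
    total_songs = sum(counts.values())
    if total_songs == 0:
        return 0.0
    total = sum(
        score_for_rating(n, rating, bonus_penalty, neutral_rating)
        for rating, n in counts.items()
    )
    return total / total_songs


# Hypothetical library: two songs rated 5 and two songs rated 3.
counts = {5: 2, 3: 2}
plain = weighted_average(counts, bonus_penalty=0.0, neutral_rating=3)
boosted = weighted_average(counts, bonus_penalty=0.1, neutral_rating=3)
print(plain, boosted)
```

With `bonus_penalty` set to 0 the result reduces to the ordinary average, which is a convenient sanity check when tuning the bonus.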
