Solved – Weighted average vs “unweighted” average in probability

mean, probability, weighted mean

[UPDATE]
Terms I used:
– Weighted average: weighted arithmetic mean
– "Unweighted" average: arithmetic mean

I went through some of the forums here and looked for explanations online but I couldn't figure out the answer for this:

If percentages represent probabilities, what do the weighted and "unweighted" averages of a list of values represent?

Example:
There are two shops that sell vegetables and two that don't sell vegetables. All of them ask their customers for feedback on their shopping. Customers say either that they were happy or that they were not. Let's assume that every customer answers.

Veg Shop 1 has 10 customers and 5 are happy. (50% average)
Veg Shop 2 has 20 customers and 8 are happy. (40% average)
Non-Veg Shop 1 has 15 customers and 5 are happy. (33% average)
Non-Veg Shop 2 has 25 customers and 10 are happy. (40% average)

Weighted average of Veg Shops is 43% (13 / 30), unweighted average is 45%.
Weighted average of Non-Veg Shops is 37.5% (15 / 40), unweighted average is 36.7% ((33.3% + 40%) / 2).
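
As a quick check of the arithmetic, here is a minimal Python sketch that computes both averages directly from the customer counts listed above. The function names `weighted_mean` and `unweighted_mean` are only illustrative choices, not standard library calls.

```python
# Each shop is a (customers, happy) pair taken from the example above.
veg = [(10, 5), (20, 8)]
non_veg = [(15, 5), (25, 10)]

def unweighted_mean(groups):
    """Arithmetic mean of the per-shop happy rates (each shop counts equally)."""
    rates = [happy / n for n, happy in groups]
    return sum(rates) / len(rates)

def weighted_mean(groups):
    """Per-shop rates weighted by customer count, i.e. total happy / total customers."""
    total_customers = sum(n for n, _ in groups)
    total_happy = sum(happy for _, happy in groups)
    return total_happy / total_customers

for label, groups in [("Veg", veg), ("Non-Veg", non_veg)]:
    print(f"{label}: weighted {weighted_mean(groups):.1%}, "
          f"unweighted {unweighted_mean(groups):.1%}")
# Veg: weighted 43.3%, unweighted 45.0%
# Non-Veg: weighted 37.5%, unweighted 36.7%
```

The weighted version treats every customer as one observation, while the unweighted version treats every shop as one observation.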

I know that with the weighted average the distribution matters, and I feel that it implies that there is correlation between the values aggregated. However, in this example, if I compute the weighted average I accept that there is correlation, but I discard the correlation between Veg and Non-Veg shops.

I can't really phrase the probabilities, but it feels that unweighted average is an answer to "my expected performance (based on customer feedback) among other shops of the same type", and weighted average is an answer to "the chance of a customer being happy, when going to any of the shops of the same type".

It is still not really clear to me, though. Does anybody have some clearer ideas on what the probabilities would mean?

Also, is it possible at all to assign meaningful values to both weighted and unweighted average on the same dataset?

Thanks,
Norbert

Best Answer

What you are calling the "weighted average" is the only proper way to calculate the percentage of satisfied customers across both shops. Taking the average of the percentages (what you call the "unweighted" average) will give you useless results if your samples differ in size.

Imagine an extreme case: you have two shops, A and B. In shop A there was one customer and he was not satisfied, and in shop B there were 100 customers and 90 of them were satisfied. Would you conclude that 45% of the customers of both shops were satisfied? Obviously not! Pooling the customers gives 90 / 101, or about 89%.

The smaller sample is much less reliable, so it should not be included in the final estimate with the same weight as the larger one. Speaking more formally, estimates from the smaller sample have a greater error.

Using your example from the comment: there are two shops, A and B. In A there are two customers, one of them happy; in B there were 100 customers and 75 were happy. So they have 50% and 75% happy customers, and the average of the two percentages is 62.5%.

The standard error of the estimate from the first shop is $\sqrt{\frac{0.5(1-0.5)}{2}} \approx 0.35$, while for the second one it is $\sqrt{\frac{0.75(1-0.75)}{100}} \approx 0.04$. So the possible deviation from the true proportion of satisfied customers in shop A is much larger. If the samples differ that much in their reliability, you shouldn't give them equal trust and weight them equally when taking their average.
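
To make the comparison concrete, here is a small sketch in Python. It assumes the usual binomial standard error of a sample proportion, $\sqrt{p(1-p)/n}$, and the helper name `proportion_se` is just an illustrative choice.

```python
import math

def proportion_se(happy, n):
    """Binomial standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    p = happy / n
    return math.sqrt(p * (1 - p) / n)

se_a = proportion_se(1, 2)     # shop A: 1 happy out of 2
se_b = proportion_se(75, 100)  # shop B: 75 happy out of 100
print(round(se_a, 3), round(se_b, 3))  # 0.354 0.043

# Average of the two percentages versus the pooled proportion
raw = (1 / 2 + 75 / 100) / 2
pooled = (1 + 75) / (2 + 100)
print(round(raw, 3), round(pooled, 3))  # 0.625 0.745
```

The estimate from shop A is roughly eight times less precise than the one from shop B, yet the plain average of percentages gives both the same influence.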

You can easily convince yourself that using the pooled mean is a much better idea than using the raw one by conducting a simple simulation study. I simulate two shops, with $n=8$ and $n=100$ customers, both having the same proportion of satisfied customers. If you compare the results obtained using the raw mean ("unweighted") and the pooled ("weighted") estimates, you'll see that when using the raw mean the variability of the difference between the estimate and the true value is much greater. Saying this in plain English: using the raw mean, you are at greater risk of obtaining a wrong estimate than with the pooled mean.
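
A simulation along these lines could be sketched as follows. This is not the original code behind the answer, only a minimal reconstruction under stated assumptions: the true proportion `p_true = 0.75`, the number of repetitions, and the seed are all arbitrary choices of mine, and only the sample sizes 8 and 100 come from the text.

```python
import random
import statistics

def simulate(p_true=0.75, n_small=8, n_large=100, reps=5000, seed=0):
    """Return the spread (SD) of the estimation error for raw and pooled means.

    p_true, reps and seed are assumed values chosen for illustration.
    """
    rng = random.Random(seed)
    raw_err, pooled_err = [], []
    for _ in range(reps):
        # Draw the number of satisfied customers in each shop.
        happy_small = sum(rng.random() < p_true for _ in range(n_small))
        happy_large = sum(rng.random() < p_true for _ in range(n_large))
        # Raw ("unweighted") mean: average of the two shop proportions.
        raw = (happy_small / n_small + happy_large / n_large) / 2
        # Pooled ("weighted") mean: all satisfied over all customers.
        pooled = (happy_small + happy_large) / (n_small + n_large)
        raw_err.append(raw - p_true)
        pooled_err.append(pooled - p_true)
    return statistics.pstdev(raw_err), statistics.pstdev(pooled_err)

raw_sd, pooled_sd = simulate()
print(f"SD of error, raw mean:    {raw_sd:.3f}")
print(f"SD of error, pooled mean: {pooled_sd:.3f}")
```

With these settings the theoretical standard deviations are about 0.080 for the raw mean and about 0.042 for the pooled mean, so the simulated error of the raw mean should come out roughly twice as variable.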


For much more advanced methods of pooling different probabilities, you can check the threads "Combining probabilities/information from different sources" and "Combining two estimates".
