I'm trying to get the weighted average of CpM (Cost per Mile) for the dataset below:
I'm calculating the average by:
1st approach: Average CpM = sum(TotalCost / Miles) / count = (2 + 2.85 + 3.1 + 2.9) / 4 = $2.715
However, I've seen a colleague calculate the weighted average CpM the following way:
2nd approach: Average CpM = sum(TotalCost) / sum(Miles) = (100 + 200 + 310 + 290) / (50 + 70 + 100 + 100) = $2.81
When we introduce a new outlier that has extremely high CpM to the dataset, the 1st approach tends to shoot up along with the outlier, while the 2nd is more robust to the outlier:
1st approach: Average CpM = sum(TotalCost / Miles) / count = (2 + 2.85 + 3.1 + 2.9 + 10) / 5 = $5.21
2nd approach: Average CpM = sum(TotalCost) / sum(Miles) = (100 + 200 + 310 + 290 + 100) / (50 + 70 + 100 + 100 + 10) = $3.03
I'm not sure if the 2nd approach of calculating Average CpM is a valid approach, and if not, what this formula means and what it's getting at. I also wonder if there are any other ways to calculate the weighted arithmetic mean.
Best Answer
This is merely the simple average of the four CpM values, not any weighted average.
This is the arithmetic mean of the four CpM values weighted by their mileage.
$$\frac{m_1}{m_1+m_2+m_3+m_4}\left(\dfrac{c_1}{m_1}\right)+\frac{m_2}{m_1+m_2+m_3+m_4}\left(\dfrac{c_2}{m_2}\right)+\frac{m_3}{m_1+m_2+m_3+m_4}\left(\frac{c_3}{m_3}\right)+\frac{m_4}{m_1+m_2+m_3+m_4}\left(\frac{c_4}{m_4}\right)\\=\boxed{\frac{c_1+c_2+c_3+c_4}{m_1+m_2+m_3+m_4}}.$$
It gives the representative (i.e., 'average') value of the four CpM values such that the trips with higher mileage are given more weight than the trips with lower mileage.
If you require more intuition, please skim the middle section of this answer: Calculating marks using simple average versus weighted average.
You could also compute the arithmetic mean of the four CpM values weighted by their associated costs, but not useful.