[Math] average of averages vs sum of averages

average statistics

Note to potential "duplicate" claimants: there is a similar question posted here. However, (1) that post is actually asking a different question, and (2) that question has had the code removed from the OP and is thus difficult to follow. Either way, it does not answer my question.

Main Question

I have calculated some monthly averages. I want to find out which season has the strongest value for animal presence.

Do I sum the means of each month for each season?

summer = jun average + jul average + aug average

Or do I find the average of those averages?

summer = (jun average + jul average + aug average) / 3

Which method is correct for finding out the season with the highest and lowest values?

Context to aid answering the question is provided below

Details

Say we have a 4×4 square with 16 cells.

We lay this square on a beach and measure the presence of animals in each cell.

Weekly Statistics

We fill in each cell according to the following rule:

  • If an animal is present in a cell, cell value = 1
  • If the cell is empty (an animal does NOT appear), cell value = 0

This results in a quadrat like so (image not available).

Monthly Totals

We repeat this each week of each month. We add up the weekly quadrats to create a monthly summary, where each cell has a number between 0 and 4.

  • 0 = no animal present in any of the weeks
  • 1 = animal present in 1/4 of the weeks
  • 2 = animal present in 2/4 of the weeks
  • 3 = animal present in 3/4 of the weeks
  • 4 = animal was present in every week

      value count
    1     0     2
    2     1     3
    3     2     3
    4     3     4
    5     4     3


Monthly Statistics

sum = sum of count (i.e. 1+4+2+1+0+1: sum of cells…)
mean = sum / # of rows (i.e. # of weeks + 1)


     sum  mean
jan   10   2
feb   23   4.6
mar   45   9
apr   15   3
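As a minimal sketch of the two definitions above, the monthly sum and mean can be computed from the count column of a value/count table. The function name `monthly_stats` is made up for illustration, and treating the value/count table shown earlier as one month's data is an assumption (its total does match the apr row).

```python
def monthly_stats(counts):
    """Return (sum, mean) as defined above: mean = sum / number of rows."""
    total = sum(counts)
    mean = total / len(counts)  # rows = possible values 0..4, i.e. weeks + 1
    return total, mean


# Count column of the value/count table (one row per value 0, 1, 2, 3, 4).
counts = [2, 3, 3, 4, 3]

total, mean = monthly_stats(counts)
print(total, mean)
```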

Summary Question

Do I sum the averages of each month for each season?

summer = jun average + jul average + aug average

Or do I find the average of those averages?

summer = (jun average + jul average + aug average) / 3

Which method is correct for finding out the season with the highest and lowest values?

Desired output: a value showing which season generally has more animals. Should I sum the monthly averages or average the monthly averages?


  season value
1 autumn    85
2 spring    40
3 summer    62
4 winter    70

Best Answer

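As @lulu pointed out, as long as your seasons all have the same number of months, comparing seasons by the sum of averages is exactly equivalent to comparing them by the average of averages: the two values differ only by a constant factor, so they rank the seasons identically.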

Example

Let's say you compute the two indicators for summer:

strength_sum summer = n_june + n_july + n_august = 62
strength_avg summer = (n_june + n_july + n_august) / 3 = 20.67

And for winter:

strength_sum winter = n_december + n_january + n_february = 70
strength_avg winter = (n_december + n_january + n_february) / 3 = 23.33

Then strength_sum summer < strength_sum winter is equivalent to strength_avg summer < strength_avg winter (just multiply or divide both sides by 3). In both cases, summer had fewer animals than winter.
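The equivalence can be checked with a short Python sketch. It assumes the four values from the question's desired-output table are seasonal sums over three months each (the example above already treats 62 and 70 that way); the variable names are illustrative only.

```python
# Seasonal sums, assumed to be taken over 3 months each.
season_sum = {"autumn": 85, "spring": 40, "summer": 62, "winter": 70}
MONTHS_PER_SEASON = 3

# Dividing every season by the same positive constant cannot change the order.
season_avg = {s: v / MONTHS_PER_SEASON for s, v in season_sum.items()}

rank_by_sum = sorted(season_sum, key=season_sum.get, reverse=True)
rank_by_avg = sorted(season_avg, key=season_avg.get, reverse=True)

print(rank_by_sum)
print(rank_by_sum == rank_by_avg)
```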

But what if...

If your seasons have different numbers of months, or if you want to generalize to other time periods, I think that using the average of averages is more meaningful for your problem.

Imagine you are in the Alps, where summer and autumn are much shorter than winter. It makes sense to favor the average of averages in order to fairly compare a 2-month-long summer with a 5-month-long winter.

For example, suppose you use the sum-of-averages criterion, with strength_sum summer = 20 * 2 = 40 (n = 20 for each month of summer) and strength_sum winter = 10 * 5 = 50 (n = 10 for each month of winter). You don't want winter to "win" simply because it is more than twice as long as summer: strength_sum summer = 40 < 50 = strength_sum winter, but strength_avg summer = 20 > 10 = strength_avg winter. It seems to me that strength_avg makes more sense here.
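The same numbers can be run through a small Python sketch to show the two criteria disagreeing when seasons have different lengths. The dictionary layout and names are assumptions for illustration; the monthly values are the ones from the example above.

```python
from statistics import mean

# Monthly averages per season: a 2-month summer and a 5-month winter.
monthly = {
    "summer": [20, 20],
    "winter": [10, 10, 10, 10, 10],
}

strength_sum = {season: sum(values) for season, values in monthly.items()}
strength_avg = {season: mean(values) for season, values in monthly.items()}

# The sum rewards the longer season; the average compares typical months.
print(max(strength_sum, key=strength_sum.get))
print(max(strength_avg, key=strength_avg.get))
```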
