Solved – Is it better to take a mean of means or a grand mean when relaying rate data

arithmeticmean

I am working with a dataset that includes number of fish caught in a day, and number of hours spent fishing that day. The variable of interest (outside of general summary statistics) is the number of fish caught per hour of effort, aka catch-per-unit-effort (CPUE).

I have several hundred days' worth of data and have calculated a CPUE for each day of fishing. However, when I generate a global CPUE for the whole dataset (i.e. dividing the total number of fish caught by the total number of hours fished), the estimate differs from the mean of my daily CPUE estimates (which are themselves arithmetic means).
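As a small illustration of the two calculations, here is a sketch with made-up numbers (the `fish` and `hours` values are hypothetical, not taken from my dataset):

```python
# Hypothetical data: fish caught and hours fished on four days.
fish = [0, 2, 12, 3]
hours = [8.0, 4.0, 6.0, 2.0]

# Grand (pooled) CPUE: total fish divided by total hours.
grand_cpue = sum(fish) / sum(hours)

# Mean of daily CPUEs: each day counts equally, whatever its effort.
daily_cpue = [f / h for f, h in zip(fish, hours)]
mean_of_daily = sum(daily_cpue) / len(daily_cpue)

print(f"grand CPUE:         {grand_cpue:.3f}")
print(f"mean of daily CPUE: {mean_of_daily:.3f}")
```

With these numbers the grand CPUE is 17/20 = 0.85 fish per hour, while the mean of the daily CPUEs is 1.0.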

I understand that the grand mean and the mean-of-means will generate different answers. The mean-of-means yields higher numbers than the grand mean since days with no fish have a higher weight in the grand arithmetic mean than they do in the mean-of-means. However, I cannot think of a way to generate variance estimates of CPUE without doing the daily estimates, even if they over-estimate the mean.

So my question is as follows. Should I:

1) Report the grand arithmetic mean (grand total of fish divided by grand total of fishing hours, aka a grand CPUE), but then calculate sd and se based on daily CPUE estimates,

2) Report the grand arithmetic mean, standard deviation, and standard error based on a calculation I'm somehow not thinking about,

or

3) Report the CPUE, sd, and se as a "mean of means", aka the average of daily CPUE?

I am leaning towards option 3, as it at least uses concise methodology.

Best Answer

A ratio mean

Let $Y$ = the average of all $y_{i}$'s (the numerator: the daily fish caught)

Let $X$ = the average of all $x_{i}$'s (the denominator: the daily time fishing)

Then the estimated ratio mean is $R = Y/X$

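The variance of $R$ is estimated by:

$Var(R) = (Y/X)^2 \left( Var(Y)/Y^2 + Var(X)/X^2 - 2\,Cov(Y,X)/(YX) \right)$

Here $Var(Y)$, $Var(X)$, and $Cov(Y,X)$ are the variances and covariance of the means $Y$ and $X$; for $n$ independent days, each is the sample variance or covariance of the daily values divided by $n$.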

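Both $R$ and this variance estimate are biased, but the estimator $R$ is usually quite good compared with the others you are considering. The ratio $R$ is likely what you are calling the grand mean: the ratio of the two daily averages equals total fish divided by total hours, because the $1/n$ factors cancel.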
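The calculation above can be sketched in Python roughly as follows. This is a minimal sketch, not a definitive implementation: the function name `ratio_cpue` and the sample data are hypothetical, and it assumes days are independent observations, so that the variance of each mean is the sample variance divided by $n$ (no finite-population correction).

```python
import math
import statistics


def ratio_cpue(fish, hours):
    """Return (R, var_R, se_R) for the ratio-of-means CPUE estimate.

    Assumes independent days and nonzero mean catch and mean effort.
    """
    n = len(fish)
    y_bar = statistics.mean(fish)    # Y: mean daily catch
    x_bar = statistics.mean(hours)   # X: mean daily effort
    r = y_bar / x_bar                # R = Y / X

    # Variances and covariance of the *means* (sample value / n).
    var_y = statistics.variance(fish) / n
    var_x = statistics.variance(hours) / n
    s_yx = sum((y - y_bar) * (x - x_bar)
               for y, x in zip(fish, hours)) / (n - 1)
    cov_yx = s_yx / n

    # Var(R) = R^2 * (Var(Y)/Y^2 + Var(X)/X^2 - 2 Cov(Y,X)/(Y X))
    var_r = r ** 2 * (var_y / y_bar ** 2
                      + var_x / x_bar ** 2
                      - 2 * cov_yx / (y_bar * x_bar))
    se_r = math.sqrt(max(var_r, 0.0))  # guard against rounding below zero
    return r, var_r, se_r


# Hypothetical example data: four days of catch and effort.
fish = [0, 2, 12, 3]
hours = [8.0, 4.0, 6.0, 2.0]
r, var_r, se_r = ratio_cpue(fish, hours)
print(f"R = {r:.3f}, SE(R) = {se_r:.3f}")
```

Note that `r` here equals `sum(fish) / sum(hours)`, so the point estimate is the grand CPUE, while the standard error comes from the day-to-day variation in both catch and effort.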
