I have roughly 39 data points that are values for % error. I have been researching and found that it is not correct to simply take the average of the % errors. Is this correct? If so, what is the correct way to take the average for % error values? All of the data values are weighted the same. Thanks!
[Math] How to take the average for percent error
Related Solutions
Let's say there are $n$ days, for which $h_i$ is the number of hours worked and $r_i$ is the number of dollars per hour for that day for $i \in \Bbb{N}$. $h_i$ is the weight on each day because days with more hours affect the average rate of pay more. Therefore, to find the weighted sum, we simply need to sum up all of the $r_i$s with a weight of $h_i$, which can be expressed as: $$\sum_{i=1}^n r_ih_i$$ Then, to find the weighted average, we need to divide this weighted sum by the total number of hours. The total number of hours is the sum of all $h_i$, or: $$\sum_{i=1}^n h_i$$ Thus, we just need to divide the first part by the second part: $$\frac{\sum_{i=1}^n r_ih_i}{\sum_{i=1}^n h_i}$$ Notice that this is exactly the same as total money divided by total hours. However, we are just looking at this process differently by looking at the $r_i$s as our objects and the $h_i$s as our weights.
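The weighted average above can be computed directly. A minimal sketch with hypothetical hours and rates (the numbers are made up for illustration):

```python
# h_i: hours worked each day (the weights); r_i: dollars per hour each day
hours = [8, 4, 10]
rates = [20.0, 25.0, 18.0]

# Weighted sum of the rates, with the hours as weights
weighted_sum = sum(r * h for r, h in zip(rates, hours))
total_hours = sum(hours)
weighted_avg = weighted_sum / total_hours

# Note the weighted sum is literally the total money earned,
# so this is exactly total money divided by total hours.
print(weighted_avg)  # → 20.0
```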
Neither of these is correct, although they are both approximately correct if the errors are small. The exact answer is the following. If you're multiplying a quantity which you are uncertain about by a factor of $1 \pm p$ with a quantity which you are uncertain about by a factor of $1 \pm q$, then you are uncertain about their product by a factor of
$$(1 \pm p)(1 \pm q) = 1 \pm p \pm q + pq.$$
This is simple arithmetic. For example, if you have 10% uncertainty about the first quantity and 8% uncertainty about the second quantity, then you're uncertain about their product by a factor of
$$(1 \pm 0.1)(1 \pm 0.08) = 1 \pm 0.18 + 0.008.$$
Note that this is not symmetric about $1$; the upper bound is $1.1 \times 1.08 = 1.188$ and the lower bound is $0.9 \times 0.92 = 0.828$. You can say conservatively that the uncertainty in the product is $1 \pm 0.188$, or 18.8%, but this loses a little bit on the lower end.
If $p$ and $q$ are small then $pq$ is very small and $\pm p \pm q + pq$ is approximately $\pm p \pm q$, which is where "add the relative errors" comes from, but it's worth knowing that this is an approximation that breaks down if $p$ and $q$ are not small (or if you are multiplying many terms).
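A quick numerical check of the exact bounds against the "add the relative errors" approximation, using the 10%/8% numbers from the example above:

```python
p, q = 0.10, 0.08  # relative uncertainties of the two factors

# Exact worst-case bounds on the product's multiplicative factor
upper = (1 + p) * (1 + q)  # 1 + p + q + pq
lower = (1 - p) * (1 - q)  # 1 - p - q + pq

# First-order approximation: just add the relative errors
approx = p + q

# The approximation misses the pq = 0.008 term on both sides
print(upper - 1, 1 - lower, approx)
```

The asymmetry is visible here: the upper deviation (0.188) and lower deviation (0.172) straddle the approximate value 0.18 by exactly the pq term.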
If you read the second link more carefully, it tells you to compute the square root of the sum of squares for the absolute error of a sum (at least I think that's what it's doing; it's not very clear, since the term "SE" is never defined); it is not suggesting this method for the relative error of a product at all. The main reason you'd combine errors this way is if you have reason to believe that your errors are well modeled by independent normal / Gaussian distributions. This square-root-of-sum-of-squares behavior governs how independent Gaussians add: we have
$$N(0, \sigma_1) + N(0, \sigma_2) \sim N(0, \sqrt{\sigma_1^2 + \sigma_2^2}).$$
But this is a specific modeling assumption that may break down. The exact expression I give above gives worst-case bounds which are independent of the nature of the error as long as you actually know a bound on it.
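The square-root-of-sum-of-squares rule for independent Gaussians can be verified by simulation. A sketch with arbitrary example values $\sigma_1 = 3$, $\sigma_2 = 4$, so the predicted standard deviation of the sum is $\sqrt{3^2 + 4^2} = 5$:

```python
import math
import random

sigma1, sigma2 = 3.0, 4.0
predicted = math.sqrt(sigma1**2 + sigma2**2)  # 5.0

# Empirical standard deviation of N(0, sigma1) + N(0, sigma2)
random.seed(0)
samples = [random.gauss(0, sigma1) + random.gauss(0, sigma2)
           for _ in range(100_000)]
empirical = math.sqrt(sum(s * s for s in samples) / len(samples))

print(predicted, empirical)  # empirical is close to 5.0
```

Contrast this with the worst-case bound, which would simply add the two spreads; the Gaussian model gives a tighter combined error precisely because independent errors rarely line up in the same direction.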
Best Answer
You certainly can average the percent error values; that is a well-defined operation. As Dilbert says, you can multiply them, too. Whether it expresses what you want is more subtle. You are probably remembering that averaging the percentage errors does not give the same result as dividing the average error by the average true value. If you state precisely what you want to express, that will lead to the correct way to compute it.
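The distinction between the two summaries can be seen with two hypothetical measurements (the values are made up for illustration):

```python
# Hypothetical true and measured values
true_vals = [100.0, 10.0]
measured = [110.0, 12.0]

# Summary 1: average of the individual percent errors
pct_errors = [abs(m - t) / t for m, t in zip(measured, true_vals)]
mean_of_pct = sum(pct_errors) / len(pct_errors)  # (0.10 + 0.20) / 2 = 0.15

# Summary 2: average absolute error divided by average true value
avg_error = sum(abs(m - t) for m, t in zip(measured, true_vals)) / 2  # 6.0
avg_true = sum(true_vals) / 2                                        # 55.0
ratio_of_avgs = avg_error / avg_true                                 # ≈ 0.109

print(mean_of_pct, ratio_of_avgs)  # the two summaries disagree
```

Summary 1 weights every data point equally regardless of scale; Summary 2 lets the larger true values dominate. Neither is wrong, they just answer different questions.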