[Math] continuously averaging a stream of numbers

algebra-precalculusarithmeticaverage

Ok, so I'd like a little help with the mathematical theory behind something I'm doing in a program. I've got a stream of numeric inputs, and I need to find the average of all the numbers in this stream of data. I'm going to output it once the stream ends.

In my code, my program can remember two numbers, and is aware of the next number in the stream.

Let's say I have x, y, and z.

x is the current average of the stream
y is the count of numbers that have been processed from the stream
z is the next number in the stream.

How could I do a fair-weighted average of a stream like this? If I just add z to x and then divide x by 2, then the most recent numbers in the stream heavily weight the average. Or is this even possible?

Best Answer

Your running sum is $xy$, so when a new number $z$ comes in the running sum becomes $xy+z$ and the count becomes $y+1$. The new average is $\frac {xy+z}{y+1}$. It is easier to store the running sum than the average, and update the sum and count each time a new number comes in.

Added: three things you could consider: 1) Keep the running sum and count. When the running sum gets close to overflow, divide both by 2. Now start accumulating again. This will weight the new values twice as much as the old ones, but maybe that is OK. 2) If you know about what the average is, subtract a guess from all the data. You are storing the difference from your guess. If your guess is close, the running sum will stay near zero, or at least not overflow nearly as quickly. 3) Maybe overweighting the new data isn't so bad. If the average changes with time, maybe you want the recent data overweighted. One approach is to have the current average stored in $x$, then when $z$ comes in, make the new average $x=\frac {a*x+z}{a+1}$. The larger $a$ is, the slower the average will change. You are weighting the new data as $\frac 1a$ of the old average. This should be safe from overflow.

Related Question