Calculate average on set of data points where recent data has more weight

average, data analysis

Take for example this dataset:

[1,1,1,1,1,1,1,1,1,200,1,1,1,1,....,1]

I want to calculate a running average on this with a certain window size, let's say the 10 most recent data points.

A regular average with a window of 10 (or whatever) would look something like this:

[Figure: regular average with window]

The bump is when the 200 enters the calculation and the cliff is when it leaves the window.

I want it to look like this:

[Figure: what I want it to look like]

I believe this means I need to give more weight to recent data to reduce the impact of the 200 point bump over time.

What is a good way to do this? I'm looking for pointers in the right direction.

Thanks!

Best Answer

Given your graph, one option is to use a three-part piecewise function as the weight: $$ z=\begin{cases} 0 & x<0\\ 1-0.08x & 0\leq x\leq10 \\ 0 & x>10 \end{cases} $$ where $x$ denotes the number of days past and $z$ denotes the weight. The first part gives zero weight to anything that has not happened yet, the second makes the weight decay linearly with age, and the third creates the cutoff at the end of the graph. After finding the weights, you calculate the value at a given point as a weighted arithmetic mean: multiply each data point by its weight, add the products, and divide by the sum of the weights. For example, $2$ days past the $200$ value, the weight $z$ for the $200$ point is $1-0.08\cdot2=0.84$, and the weights for the $1$s are $1, 0.92, 0.76, 0.68$, etc.
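As a rough illustration, here is a minimal Python sketch of the weighting described above. The names `weight`, `weighted_running_average`, `slope` and `cutoff` are made up for this example, the sample data simply puts the $200$ at the tenth position, and the handling of the first few points (averaging only over the points that exist so far) is an assumption, not something the answer specifies.

```python
def weight(x, slope=0.08, cutoff=10):
    """Weight z for a point that is x steps old (hypothetical helper)."""
    if x < 0 or x > cutoff:
        return 0.0  # future points and points past the cutoff do not count
    return 1.0 - slope * x  # linear decay with age


def weighted_running_average(values, slope=0.08, cutoff=10):
    """Weighted arithmetic mean at each position, using the weights above."""
    averages = []
    for t in range(len(values)):
        # Ages 0..cutoff, limited to points that exist near the start
        # of the series (an assumption made for this sketch).
        ages = range(min(t, cutoff) + 1)
        weights = [weight(x, slope, cutoff) for x in ages]
        total = sum(w * values[t - x] for x, w in zip(ages, weights))
        averages.append(total / sum(weights))
    return averages


# Sample data: all 1s with a single 200 at the tenth position.
data = [1] * 9 + [200] + [1] * 20
smoothed = weighted_running_average(data)

for t, value in enumerate(smoothed):
    print(t, round(value, 2))
```

With these numbers the average jumps when the $200$ arrives and then shrinks at every step, instead of staying flat until the point leaves the window. Because the weight is still $0.2$ at $x=10$, a small drop remains when the $200$ finally falls out; a steeper slope such as $0.1$ would bring the weight to $0$ at the cutoff if that matters.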
