[Math] How to recursively approximate a moving average and standard deviation

approximation, recursive algorithms, statistics

Consider a sequence of measurements $(x_1, x_2, \ldots)$. Let $\mu_n$ be the $p$-period moving average defined by $$\mu_n = \frac{1}{p}\sum_{i=n-p+1}^nx_i$$ and $\sigma_n$ be the $p$-period moving standard deviation defined by $$\sigma_n=\sqrt{\frac{1}{p}\sum_{i=n-p+1}^n (x_i - \mu_n)^2}$$

I would like to find the best approximation of $\mu_{n+1}$ and $\sigma_{n+1}$, given only $\mu_n$, $\sigma_n$, and $x_{n+1}$.

I believe that the (a?) best approximation for $\mu_{n+1}$ would be $$\mu_{n+1}\approx \frac{(p-1)\mu_{n} + x_{n+1}}{p}$$

Call this approximation (or a better one, if one exists) $\tilde{\mu}_{n+1}$. How well could you approximate $\sigma_{n+1}$? Would

$$\sigma_{n+1}\approx \sqrt{\frac{(p-1)\sigma_n^2 + (x_{n+1}-\mu_n)^2}{p}}$$

work?
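To see how these proposed updates behave in practice, here is a small Python sketch (the function names and the Gaussian test data are mine, not from the question) that seeds the recursion with the exact statistics of the first window and then applies only the approximate updates:

```python
import random

def exact_rolling(window):
    """Exact mean and (population) standard deviation of one window."""
    p = len(window)
    mu = sum(window) / p
    sigma = (sum((x - mu) ** 2 for x in window) / p) ** 0.5
    return mu, sigma

def approx_update(mu, sigma, x_new, p):
    """The question's proposed recursive updates for mu and sigma."""
    mu_new = ((p - 1) * mu + x_new) / p
    sigma_new = (((p - 1) * sigma ** 2 + (x_new - mu) ** 2) / p) ** 0.5
    return mu_new, sigma_new

random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(500)]
p = 20

# Seed the recursion with the exact statistics of the first window,
# then apply only the approximate update from there on.
mu, sigma = exact_rolling(xs[:p])
for x in xs[p:]:
    mu, sigma = approx_update(mu, sigma, x, p)

mu_exact, sigma_exact = exact_rolling(xs[-p:])
print(f"approx: mu={mu:.3f}, sigma={sigma:.3f}")
print(f"exact:  mu={mu_exact:.3f}, sigma={sigma_exact:.3f}")
```

Comparing the two printed lines for different window sizes $p$ gives a feel for how far the recursion drifts from the exact rolling statistics.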

Also, is it possible to quantify how well these approximations would perform recursively? Suppose I begin with good estimates of the moving average and standard deviation and set those as $\tilde{\mu}_0,\tilde{\sigma}_0$. If I then apply the approximations recursively, will they stay good?

Best Answer

Consider the dataset $(x_{n-p+1}, x_{n-p+2}, \ldots, x_n)$, whose mean and variance are $\mu_n$ and $\sigma_n^2$. To go from $n$ to $n+1$ you remove the first element $x_{n-p+1}$ and append $x_{n+1}$. Something like your approximations can be obtained by assuming that, instead of $x_{n-p+1}$, you randomly choose one of the $p$ elements with equal probability and remove it.

Let $s_n = \sum_{i=n-p+1}^n x_i = p\mu_n$ and $t_n = \sum_{i=n-p+1}^n x_i^2 = p(\mu_n^2 + \sigma_n^2)$ be the sum and the sum of squares of the original dataset. Denote the removed element by $\hat x$. Then the new totals after removing $\hat x$ and adding $x_{n+1}$ are $$\hat s = s_n - \hat x + x_{n+1}, \\ \hat t = t_n - \hat x^2 + x_{n+1}^2. $$ Using the fact that $\mathbb E[\hat x]=\mu_n$ and $\mathbb E[\hat x^2]=\mu_n^2+\sigma_n^2$, we can show that $$\mathbb E[\hat s] = (p-1)\mu_n + x_{n+1}, \\ \mathbb E[\hat s^2] = \big((p-1)\mu_n + x_{n+1}\big)^2 + \sigma_n^2, \\ \mathbb E[\hat t] = (p-1)(\mu_n^2 + \sigma_n^2) + x_{n+1}^2.$$

Finally, the expected mean and variance of the new dataset are $$\mathbb E[\hat\mu] = \mathbb E\left[\frac{\hat s}p\right] = \frac{(p-1)\mu_n + x_{n+1}}p, \\ \mathbb E[\hat\sigma^2] = \mathbb E\left[\frac{\hat t}p - \frac{\hat s^2}{p^2}\right] = \frac{(p^2-p-1)\sigma_n^2 + (p-1)(x_{n+1} - \mu_n)^2}{p^2}.$$

The expression for the mean is the same as your formula, but the expression for the variance is slightly different. It's also worth noticing that the update rule for the mean is simply exponential smoothing with $\alpha=1/p$.
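The closed-form expectations above can be checked numerically. Here is a sketch (the function name and the tiny test window are mine) that averages the mean and variance over every possible removal and compares against the formulas:

```python
def expected_update(mu, var, x_new, p):
    """Expected mean and variance after removing a uniformly random
    element of the window and appending x_new (the random-removal model)."""
    mu_new = ((p - 1) * mu + x_new) / p
    var_new = ((p * p - p - 1) * var + (p - 1) * (x_new - mu) ** 2) / p ** 2
    return mu_new, var_new

# Brute-force check on a tiny window: enumerate all p possible removals,
# average the resulting means/variances, and compare with the formulas.
window, x_new = [1.0, 2.0, 3.0], 4.0
p = len(window)
mu = sum(window) / p
var = sum((x - mu) ** 2 for x in window) / p

means, variances = [], []
for i in range(p):
    new = window[:i] + window[i + 1:] + [x_new]
    m = sum(new) / p
    means.append(m)
    variances.append(sum((x - m) ** 2 for x in new) / p)

mu_new, var_new = expected_update(mu, var, x_new, p)
assert abs(mu_new - sum(means) / p) < 1e-12
assert abs(var_new - sum(variances) / p) < 1e-12
```

The assertions pass, confirming that the two update formulas are exactly the expectations under the random-removal model.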

Alternatively, instead of assuming a random-selection model and computing expected values, you could derive explicit bounds on $x_{n-p+1}$ using Chebyshev's inequality, and obtain bounds on $\mu_{n+1}$ and $\sigma_{n+1}^2$ from there. However, if you do this recursively, the bounds will blow up very quickly.

You're aware that it is quite easy to compute rolling means and variances exactly, yes?
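For reference, here is one way to do the exact computation in $O(1)$ per update: keep the window in a queue and maintain the running sum $s$ and sum of squares $t$. A minimal sketch (the class name is mine; note the naive $t/n - (s/n)^2$ form can lose precision on long streams with large values, so recompute from the window occasionally if that matters):

```python
from collections import deque

class RollingStats:
    """Exact p-period rolling mean and population variance,
    maintained via running sum and sum of squares."""

    def __init__(self, p):
        self.p = p
        self.window = deque()
        self.s = 0.0  # running sum of the window
        self.t = 0.0  # running sum of squares of the window

    def push(self, x):
        self.window.append(x)
        self.s += x
        self.t += x * x
        if len(self.window) > self.p:
            old = self.window.popleft()
            self.s -= old
            self.t -= old * old

    def mean(self):
        return self.s / len(self.window)

    def variance(self):
        n = len(self.window)
        return self.t / n - (self.s / n) ** 2
```

Usage: call `push(x)` for each new measurement; once $p$ values have been seen, `mean()` and `variance()` return the exact $\mu_n$ and $\sigma_n^2$, with no approximation error to accumulate.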
