Sequentially updating the variance (or higher order moments) of a sample (online learning)

probabilitystatistics

In online learning, new data become available to us one after another. Computationally, it is desirable to have a method of learning about the distribution of the data without redoing all the calculations from scratch. So, I have been thinking that if we could sequentially learn the moments of the underlying distribution, we would get a lot of information about the data itself. It is easy to show that $\mu_{n+1}=\frac{n}{n+1}\mu_n+\frac{1}{n+1}x_{n+1}$. I'm trying to find a similar recursive relation for higher moments such as variance, skewness, kurtosis, etc.

I have found a not-very-satisfactory expression so far:

$$\sigma_{n+1}^2 =\frac{n(n-1)}{(n+1)^2}\sigma_{n}^2+\frac{2}{(n+1)^2}\left[ (x_1 – \mu_n)(x_1-x_{n+1})+\cdots+(x_n-\mu_{n})(x_n-x_{n+1})\right]$$

I wonder if this expression can be simplified further. Also, a generalization of this expression for higher moments seems not trivial at all. I know that it's possible to do this, even though I cannot really find anything about it online. One particular case that is interesting is when we're dealing with weighted means, weighted variances, etc. And usually weights are decaying exponentially (newer data have more importance for us and the importance of old data decreases fast).

Just to be clear, an accepted recursive relation must be of the form:

$$\mu_{k,n} = f(\mu_{k,n-1},x_1,\cdots,x_{n},\mu_1,\cdots,\mu_{k-1})$$
where $\mu_k$ is the k-th moment and $\mu_{k,n+1}$ denotes our estimate for the k-th moment after observing $x_{n}$. Any help is appreciated.

Best Answer

There are numerous well-known methods for one-pass, online computations of sample mean and variance, weighted and unweighted, of varying levels of stability; for instance, Welford's algorithm. The linked Wikipedia page also has formulas for 3rd and 4th order moments, as well as a reference to the paper Formulas for Robust, One-Pass Parallel Computation of Covariances and Arbitrary-Order Statistical Moments by Philippe Pebay, which derives generalizations of Welford's method to higher order moments.

Related Question