Solved – Calculating mean of continuous time series

Tags: mean, time series, weighted mean

I'm calculating the arithmetic mean of values in a continuous time series, weighted by duration. Points in the time series are not guaranteed to be evenly spaced.

Example data:

Time    Value
0       1
1000    2
2000    3
3000    4
5000    5

Where Time is the duration since the start of the time series.

My original approach was to take the mean of each pair of adjacent points, weight that value by the duration between those points relative to the total duration, and sum the results:

Mean    Weight    Normalized weight   Result
1.5     1000      0.2                 0.3
2.5     1000      0.2                 0.5
3.5     1000      0.2                 0.7
4.5     2000      0.4                 1.8
                                      ---
                                      3.3

Resulting in a mean of 3.3
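
For concreteness, here is a rough Python sketch of that approach (the function name and plain-list inputs are just for illustration; it is the pairwise-mean weighting described above, which amounts to trapezoidal integration divided by the total duration):

    def interval_mean(times, values):
        """Mean of each adjacent pair of values, weighted by the duration of
        the interval between them (trapezoidal integration / total duration)."""
        total_duration = times[-1] - times[0]
        weighted_sum = 0.0
        for i in range(len(times) - 1):
            interval = times[i + 1] - times[i]
            pair_mean = (values[i] + values[i + 1]) / 2.0
            weighted_sum += pair_mean * interval
        return weighted_sum / total_duration

    times = [0, 1000, 2000, 3000, 5000]
    values = [1, 2, 3, 4, 5]
    print(interval_mean(times, values))  # 3.3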

Thankfully I stumbled across the MATLAB documentation for calculating the mean of timeseries data, which suggests my method is probably incorrect.

Their method assigns weights as such:

  1. Assign a weight to each point's value: the first point gets the duration of the first time interval, t(2) - t(1); the last point gets the duration of the last time interval, t(end) - t(end-1); and every other point gets the duration between the midpoint of the previous time interval and the midpoint of the next time interval.

  2. Normalize the weighting for each time by dividing each weighting by the mean of all weightings.

  3. Multiply the values for each point by its normalized weighting.

This results in the following (here the weights are normalized by their total, 6500, and the products summed, which gives the same result as dividing by the mean weight and then averaging):

Value   Weight   Normalized weight   Result
1       1000     0.153846            0.153846
2       1000     0.153846            0.307692
3       1000     0.153846            0.461538
4       1500     0.230769            0.923076
5       2000     0.307692            1.53846
        ----     --------            --------
        6500     1                   3.384612

Resulting in a mean of 3.384612
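
Running the example through a quick Python translation of those steps (my own sketch, not MATLAB code; it simply computes sum(w * v) / sum(w) with the documented weights) reproduces that value:

    def matlab_style_mean(times, values):
        """Weighted mean using the weighting scheme described in the MATLAB
        timeseries documentation: end points get the full duration of their
        neighbouring interval, interior points get the span between the
        midpoints of the surrounding intervals."""
        n = len(times)
        weights = [0.0] * n
        weights[0] = times[1] - times[0]        # first interval
        weights[-1] = times[-1] - times[-2]     # last interval
        for i in range(1, n - 1):
            weights[i] = (times[i + 1] - times[i - 1]) / 2.0  # midpoint to midpoint
        weighted_sum = sum(w * v for w, v in zip(weights, values))
        return weighted_sum / sum(weights)

    times = [0, 1000, 2000, 3000, 5000]
    values = [1, 2, 3, 4, 5]
    print(matlab_style_mean(times, values))  # ~3.3846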

My question is: Why are the weights in the MATLAB documentation calculated in the way they are? Specifically, why is the first point assigned the full duration of the first interval, and the last point the full duration of the last interval?

If the first and last points were instead assigned only half of those intervals (i.e. up to the midpoint), the resulting mean would be the same as with my original approach. Is my original approach wrong?
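
(To verify that claim: halving the end weights gives 500, 1000, 1000, 1500 and 1000, which total 5000; the weighted sum is 500×1 + 1000×2 + 1000×3 + 1500×4 + 1000×5 = 16500, and 16500 / 5000 = 3.3, exactly what my original approach gives.)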

Best Answer

It's not a matter of one approach being right and the other wrong; MATLAB simply assumes that your first and last samples also extend to their left and right by equal amounts. So, according to MATLAB, your signal effectively starts at t = -500 and ends at t = 6000, half of the first and last intervals beyond the recorded endpoints. This is a zero-order (sample-and-hold) interpolation, in which every sample extends to its left and right by equal amounts.
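
A small sketch of my own (Python/NumPy, using the question's data) makes this concrete: averaging the piecewise-constant, zero-order-hold signal over its extended support reproduces the MATLAB-documented figure, while averaging the linearly interpolated signal over only the recorded span reproduces the original approach.

    import numpy as np

    times = np.array([0, 1000, 2000, 3000, 5000], dtype=float)
    values = np.array([1, 2, 3, 4, 5], dtype=float)

    # Zero-order hold: each sample "owns" the span from the midpoint of the
    # previous interval to the midpoint of the next; the end samples extend
    # outward by half of their neighbouring interval, so the signal covers
    # t = -500 .. 6000.
    edges = np.concatenate((
        [times[0] - (times[1] - times[0]) / 2],
        (times[:-1] + times[1:]) / 2,
        [times[-1] + (times[-1] - times[-2]) / 2],
    ))
    zoh_mean = np.sum(values * np.diff(edges)) / (edges[-1] - edges[0])
    print(zoh_mean)   # ~3.3846, the MATLAB-documented result

    # Linear interpolation between samples (trapezoidal) over the observed
    # span t = 0 .. 5000.
    trap_mean = (np.sum((values[:-1] + values[1:]) / 2 * np.diff(times))
                 / (times[-1] - times[0]))
    print(trap_mean)  # 3.3, the original approach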
