Standard Deviation – How to Calculate Standard Deviation of Multiple Measurements with Uncertainties in Time Series Data

mean, standard deviation, time series

I have 2 hours of GPS data with a sampling rate of 1 Hz (7200 measurements). The data are given in the form $(X, X_\sigma, Y, Y_\sigma, Z, Z_\sigma)$, where $N_\sigma$ is the measurement uncertainty of coordinate $N$.

When I take the mean of all measurements (e.g. the average Z value of those two hours), what is its standard deviation? I can of course calculate the standard deviation from the Z values, but then I neglect the fact that there are known measurement uncertainties…

Edit: The data are all from the same station, and all coordinates are remeasured every second. Due to satellite constellations etc., every measurement has a different uncertainty. The purpose of my analysis is to find the displacement due to an external event (i.e. an earthquake). I would like to take the mean of the 7200 measurements (2 h) before the earthquake and another mean for the 2 h after the earthquake, and then calculate the resulting difference (in height, for example). In order to specify the standard deviation of this difference, I need to know the standard deviations of the two means.

Best Answer

I suspect that the previous responses to this question may be a bit off the mark. It seems to me that what the original poster is really asking here could be rephrased as, "given a series of vector measurements:

$$\vec{\theta}_{i} = \left( \begin{array}{c} X_{i} \\ Y_{i} \\ Z_{i} \end{array}\right)$$

with $i=1, 2, 3,...,7200$, and measurement covariance:

$$C_{i} = \left( \begin{array}{ccc} X_{\sigma,i}^{2} & 0 & 0 \\ 0 & Y_{\sigma,i}^{2} & 0 \\ 0 & 0 & Z_{\sigma,i}^{2} \end{array} \right)$$

how would I correctly calculate the covariance-weighted mean for this series of vector measurements, and afterward, how would I correctly calculate its standard deviation?"

The answer to this question can be found in a lot of textbooks specializing in statistics for the physical sciences. One example that I like in particular is Frederick James, "Statistical Methods in Experimental Physics", 2nd edition, World Scientific, 2006, Section 11.5.2, "Combining independent estimates", pp. 323-324. Another very good, but more introductory-level text, which describes the variance-weighted mean calculation for scalar values (as opposed to full vector quantities as presented above) is Philip R. Bevington and D. Keith Robinson, "Data Reduction and Error Analysis for the Physical Sciences", 3rd edition, McGraw-Hill, 2003, Section 4.1.x, "Weighting the Data--Nonuniform Uncertainties".

Because the covariance matrix in the poster's question is diagonal (i.e., all of the off-diagonal elements are zero), the problem is actually separable into three individual (i.e., X, Y, Z) scalar weighted mean problems, so the Bevington and Robinson analysis applies equally well here too.

In general, when responding to stackexchange.com questions, I don't normally find it useful to repackage long derivations that have already been presented in numerous textbooks. If you want to truly understand the material, and understand why the answers look the way they do, then you really should just go and read the explanations which have already been published by the textbook authors. With that in mind, I'll simply jump directly to re-stating the answers that others have already provided.

From Frederick James, setting $N=7200$, the weighted mean is:

$$\vec{\theta}_{mean} = \left( \sum_{i=1}^{N} C_{i}^{-1} \right)^{-1} \left( \sum_{i=1}^{N} C_{i}^{-1} \vec{\theta}_{i} \right) $$

and the covariance of the weighted mean is:

$$ C_{mean} = \left( \sum_{i=1}^{N} C_{i}^{-1} \right)^{-1} $$

This answer is completely general for independent measurements, and will be valid no matter what the form of the $C_{i}$, even for non-diagonal measurement covariance matrices.
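For readers who prefer code to matrix notation, here is a minimal NumPy sketch of the two formulas above. The function name `covariance_weighted_mean`, the array layout, and the two sample measurements are my own illustrative assumptions, not something taken from James or from the poster's data; it also assumes the measurements are independent and every $C_{i}$ is invertible.

```python
import numpy as np

def covariance_weighted_mean(thetas, covs):
    """Combine N independent vector measurements (hypothetical helper).

    thetas: shape (N, d), one measurement vector per row.
    covs:   shape (N, d, d), one covariance matrix per measurement.
    Returns (mean, cov_mean) with shapes (d,) and (d, d).
    """
    thetas = np.asarray(thetas, dtype=float)
    covs = np.asarray(covs, dtype=float)
    weights = np.linalg.inv(covs)                 # C_i^{-1}, one inverse per measurement
    weight_sum = weights.sum(axis=0)              # sum_i C_i^{-1}
    weighted_sum = np.einsum("nij,nj->i", weights, thetas)  # sum_i C_i^{-1} theta_i
    cov_mean = np.linalg.inv(weight_sum)          # C_mean
    mean = cov_mean @ weighted_sum                # theta_mean
    return mean, cov_mean

# Made-up example: two (X, Y, Z) measurements with diagonal covariances.
thetas = [[1.0, 2.0, 3.0], [3.0, 2.0, 5.0]]
sigmas = [[1.0, 1.0, 1.0], [1.0, 2.0, 0.5]]
covs = np.array([np.diag(np.square(s)) for s in sigmas])

mean, cov_mean = covariance_weighted_mean(thetas, covs)
sigma_mean = np.sqrt(np.diag(cov_mean))           # per-coordinate standard deviations
```

The square roots of the diagonal of `cov_mean` are the standard deviations of the mean X, Y and Z. For 7200 measurements the same call works unchanged, with `thetas` of shape `(7200, 3)`.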

Since it so happens that the measurement covariances are diagonal in this particular case, the Bevington and Robinson analysis can also be used to calculate variance-weighted means for the individual $X_{i}$, $Y_{i}$, and $Z_{i}$. The form of the scalar answer is similar to the form of the vector answer:

$$ X_{mean} = \frac{\sum_{i=1}^{N} \frac{X_{i}}{X_{\sigma,i}^{2}}}{\sum_{i=1}^{N} \frac{1}{X_{\sigma,i}^{2}}} $$

and the variance is

$$ X_{\sigma,mean}^{2} = \frac{1}{\sum_{i=1}^{N} \frac{1}{X_{\sigma,i}^{2}}} $$

or equivalently,

$$ X_{\sigma,mean} = \sqrt{\frac{1}{\sum_{i=1}^{N} \frac{1}{X_{\sigma,i}^{2}}}} $$

and similarly for $Y_{mean}, Y_{\sigma, mean}$ and $Z_{mean}, Z_{\sigma, mean}$. A brief Wikipedia entry also arrives at this same answer for the scalar-valued case.
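As a rough illustration of the scalar formulas, and of how they might feed into the before/after comparison described in the question, here is a small plain-Python sketch. The helper names `weighted_mean` and `displacement` and all of the numbers are hypothetical. Combining the two standard deviations in quadrature is the usual rule for the difference of two independent quantities; treating the two 2-hour windows as independent is an assumption on my part.

```python
import math

def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its standard deviation."""
    weights = [1.0 / s**2 for s in sigmas]        # 1 / sigma_i^2
    weight_sum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / weight_sum
    sigma_mean = math.sqrt(1.0 / weight_sum)
    return mean, sigma_mean

def displacement(before, after):
    """Difference of two (mean, sigma) pairs, assuming they are independent."""
    (m_before, s_before), (m_after, s_after) = before, after
    # Independent uncertainties add in quadrature: sqrt(s_before^2 + s_after^2)
    return m_after - m_before, math.hypot(s_before, s_after)

# Made-up Z values and uncertainties standing in for the two 2-hour windows.
z_before = weighted_mean([10.0, 10.2], [0.1, 0.2])
z_after = weighted_mean([10.5, 10.3], [0.1, 0.1])
dz, dz_sigma = displacement(z_before, z_after)
```

With the real data, each list would simply hold the 7200 $Z_{i}$ and $Z_{\sigma,i}$ values of one window.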
