Solved – compute mean and standard deviation with units

mean, standard deviation

I apologize if the question is too simple, but I find it a bit difficult to see. I have some data, let's say in units of mass [kg], covering a range of a few years: I get a mass reading every 2 days over a period of 20 years. In theory the mass should be constant over time, but it varies slightly at each timestep, so a plot of mass vs. time is very noisy. To analyse the results better, I would like to take the average of the mass every 2 years, so my average will have units of [mass/year]. Basically, I want to spread the data over time. I would also like to associate a standard deviation with this average, but I'm not sure how to compute it, given the units I have. For this case, the equation for the mean is basically (if I'm correct):

$\frac{\sum m_i}{\text{Period} = 2\,\text{yr}}$

instead of the usual equation $\frac{\sum m_i}{N}$, where the $m_i$ are the masses. For example, using some data:

Time    Mass
[day]   [kg]
 2        3.5
 4        2.5
 6        3.7
 8        3.8
10        3.7
12        3.2
14        3.7
16        3.4
18        3.7
20        3.6

If I wanted to do the same, but with a period of 10 days instead of two years, then my first two points would be (using the first equation):

Period 1: 1.72 kg/day
Period 2: 1.76 kg/day

Now my question is: how can I associate a standard deviation with each of these two values?

I know that the standard deviation equation is:
$\sqrt{\frac{1}{N} \sum (m_i - \hat{m})^2}$

But with my units I don't see how to compute this. Or am I doing this the wrong way? I think I could also first compute the mean of the masses using the second equation ($\sum m_i/N$), i.e.

Mean:
Period 1: 3.44 kg
Period 2: 3.52 kg

and then also estimate the standard deviation

Std:
Period 1: 0.48 kg
Period 2: 0.19 kg

Finally, since each reading covers 2 days, a mean of 3.44 kg per 2 days for Period 1 would be half of that per day, i.e. 1.72 kg/day for Period 1 and 1.76 kg/day for Period 2, and I would scale the standard deviation in the same way. So my question is how to use the standard deviation equation in this case, where I'm averaging over time (with units) rather than over a number of counts.

Thanks!

Best Answer

The units of the average of a quantity are always the units of the quantity itself. Your mistake is in the way you're interpreting the average.

Suppose we want to take the time-average value of a quantity (say "mass"), which is a function of time, $M(t)$, between times $t_1$ and $t_2$. The time-average value is $$ \langle M\rangle = \frac{\int_{t_1}^{t_2} M(t)\, dt}{t_2 - t_1}. $$ The numerator has units of "mass*time" and the denominator has units of "time". Thus the overall units of the average are "mass".

What you're doing is making the approximation that the mass is constant over a period of two days. If $t_2 - t_1$ is a period of two days, with a constant value $m$, then the average becomes $m (t_2 - t_1) / (t_2 - t_1) = m.$ You're still doing the integral above, but as a piecewise integral of constant values, one per time window. That's the approximation you're making.
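
As a worked illustration, take the first five readings from the question (days 2 to 10) and assume each reading holds for the 2 days ending at its timestamp, so the window runs from day 0 to day 10. The piecewise integral then gives $$ \langle M\rangle \approx \frac{(3.5 + 2.5 + 3.7 + 3.8 + 3.7)\,\text{kg} \times 2\,\text{day}}{10\,\text{day}} = \frac{34.4\,\text{kg}\,\text{day}}{10\,\text{day}} = 3.44\,\text{kg}. $$ Dividing the bare sum of 17.2 kg by 10 days instead drops the 2-day weight carried by each reading, which is where the spurious "kg/day" comes from.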

If we have times $t_0, t_1, t_2, \dots, t_N$ in equally spaced increments with $t_i - t_{i-1} = \delta t$, and approximate $M(t) \approx m_i$ for $t_{i-1} < t \le t_i$ then $$ \begin{split} \langle M\rangle &= \frac{1}{t_N - t_0}\int_{t_0}^{t_N} M(t) dt \\ &\approx \frac{1}{N \delta t} \sum_{i=1}^N m_i \delta t \\ &= \frac{1}{N} \sum_i m_i, \end{split} $$ which is just the sample mean. Likewise,

$$ \begin{split} \operatorname{Var}(M) &= \frac{1}{t_N - t_0} \int_{t_0}^{t_N} (M(t) - \langle M \rangle )^2 dt \\ &\approx \frac{1}{N \delta t} \sum_{i=1}^N (m_i - \langle M \rangle)^2 \delta t \\ &= \frac{1}{N} \sum_{i=1}^N (m_i - \langle M \rangle )^2 \end{split} $$ which is just the sample variance (with the $1/N$ normalisation). Its square root is the standard deviation, which again has units of mass.
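
The two results above can be checked numerically on the ten readings from the question. The following is only a minimal sketch, assuming NumPy is available and that each reading holds for the 2 days ending at its timestamp. The variable names (`windows`, `time_avg_kg`, and so on) are made up for this illustration.

```python
import numpy as np

# Data from the question: one mass reading [kg] every 2 days.
time_days = np.arange(2, 22, 2)
mass_kg = np.array([3.5, 2.5, 3.7, 3.8, 3.7, 3.2, 3.7, 3.4, 3.7, 3.6])

# Sampling interval delta t (assumption: readings are equally spaced).
dt_days = time_days[1] - time_days[0]
samples_per_window = 5  # 5 readings x 2 days = one 10-day window

# One row per 10-day window.
windows = mass_kg.reshape(-1, samples_per_window)

# Time average: piecewise-constant integral of M(t) dt over the window length.
window_length_days = samples_per_window * dt_days
time_avg_kg = (windows * dt_days).sum(axis=1) / window_length_days

# The delta t factors cancel, so this is the plain sample mean, in kg.
mean_kg = windows.mean(axis=1)

# Standard deviation with the 1/N normalisation from the derivation (ddof=0).
std_kg = windows.std(axis=1, ddof=0)

for k, (m, s) in enumerate(zip(mean_kg, std_kg), start=1):
    print(f"Period {k}: mean = {m:.2f} kg, std = {s:.2f} kg")
```

Both `time_avg_kg` and `mean_kg` come out in kg and agree with each other, and the standard deviations round to the 0.48 kg and 0.19 kg quoted in the question. Passing `ddof=1` to `std` would instead give the $1/(N-1)$ estimator, which is a different normalisation from the one derived above.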