[Math] Standard deviation of a rate

standard deviation, statistics

I am trying to measure the throughput of a network. I collected sample transfer times for 1,000 trials, and then calculated the arithmetic mean and standard deviation for the data (e.g., mean = 0.2s, stdev = 0.01s for 1GB of data).

It is easy to compute the mean throughput (e.g., 1GB / 0.2s = 5 GB/s). Is it possible to compute the standard deviation of the throughput directly from the standard deviation of the transfer time? Or do I need to use the original 1,000 data points (which would unfortunately need to be re-measured)?

Best Answer

First we need to be clear about what these terms mean. When you say that the "mean throughput" is 1GB / 0.2s = 5 GB/s, this is not the mean of the throughputs in the individual measurements. For example, if you measured two transfer times for 1GB, one 0.1s and one 0.3s, the mean transfer time for 1GB is 0.2s, the throughputs in the individual measurements are 10 GB/s and about 3.3 GB/s, and the mean of those two throughputs is about 6.6 GB/s, not 5 GB/s.
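As a quick check of the two-measurement example, here is a minimal Python sketch. The variable names are purely illustrative, and the numbers are the ones from the example above.

```python
from statistics import mean

data_gb = 1.0            # size of each transfer, in GB
times_s = [0.1, 0.3]     # the two example transfer times, in seconds

# Throughput computed from the mean transfer time.
mean_time_s = mean(times_s)
throughput_of_mean_time = data_gb / mean_time_s

# Mean of the throughputs of the individual measurements.
throughputs = [data_gb / t for t in times_s]
mean_of_throughputs = mean(throughputs)

print(f"throughput of mean time: {throughput_of_mean_time:.2f} GB/s")
print(f"mean of throughputs:     {mean_of_throughputs:.2f} GB/s")
```

The two printed values differ because taking a reciprocal and taking a mean do not commute.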

What you quoted is an estimate for the expected throughput as the sample time and the data size go to infinity. That may well be the quantity you're actually interested in, but that quantity doesn't have a standard deviation. In the limit as the sample time and the data size go to infinity, the standard deviation of the throughput goes to zero, so there's no meaningful notion of standard deviation in this limit.
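To illustrate how the spread of the throughput shrinks as the data size grows, here is a small simulation sketch. It rests on an assumption that is not in the question: each 1GB chunk takes an independent, normally distributed time with mean 0.2s and standard deviation 0.01s. Real transfers need not behave this way, and the function name `throughput_sd` is hypothetical.

```python
import random
from statistics import stdev

random.seed(0)  # fixed seed so the run is repeatable

MEAN_TIME_S = 0.2   # assumed mean time per 1GB chunk
SD_TIME_S = 0.01    # assumed standard deviation per 1GB chunk
TRIALS = 2000       # simulated measurements per data size

def throughput_sd(n_gb):
    """Sample standard deviation of the throughput when transferring n_gb GB."""
    samples = []
    for _ in range(TRIALS):
        # Assumption: chunk times are independent and normally distributed.
        total_time = sum(random.gauss(MEAN_TIME_S, SD_TIME_S) for _ in range(n_gb))
        samples.append(n_gb / total_time)
    return stdev(samples)

sd_by_size = {n: throughput_sd(n) for n in (1, 10, 100)}
for n, sd in sd_by_size.items():
    print(f"{n:>3} GB: stdev of throughput = {sd:.4f} GB/s")
```

Under this assumption the standard deviation falls roughly like one over the square root of the data size, which is why it vanishes in the limit.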

Thus, "the standard deviation of the throughput" only makes sense for a specified sample time or data size. Presumably you mean the standard deviation of the throughputs for 1GB of data, i.e. simply the standard deviation of the throughputs in your individual measurements. But if this is what you want, it's important to understand that this is not the standard deviation corresponding to the quantity you call "the mean throughput"; it's the standard deviation corresponding to the mean of the individual throughputs, which in my example above came to about 6.6 GB/s.

If this is what you're after, you're out of luck and you'll have to remeasure. The mean and standard deviation of the transfer times do not determine the standard deviation of the individual throughputs, and they do not even determine the mean of the individual throughputs (the 6.6 GB/s in the example).
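A small counterexample makes this concrete. The two data sets below are made up for illustration: they have the same mean and the same sample standard deviation of transfer times, because their deviations from 0.2s are mirror images of each other, yet their throughput statistics differ.

```python
from statistics import mean, stdev

data_gb = 1.0

# Hypothetical transfer times in seconds. Deviations from 0.2s are
# (-0.09, +0.03, +0.06) for A and (-0.06, -0.03, +0.09) for B.
times_a = [0.11, 0.23, 0.26]
times_b = [0.14, 0.17, 0.29]

def throughput_stats(times_s):
    """Mean and sample standard deviation of the per-measurement throughputs."""
    throughputs = [data_gb / t for t in times_s]
    return mean(throughputs), stdev(throughputs)

mean_tp_a, sd_tp_a = throughput_stats(times_a)
mean_tp_b, sd_tp_b = throughput_stats(times_b)

print(f"A: time mean={mean(times_a):.3f}s stdev={stdev(times_a):.4f}s, "
      f"throughput mean={mean_tp_a:.2f} stdev={sd_tp_a:.2f} GB/s")
print(f"B: time mean={mean(times_b):.3f}s stdev={stdev(times_b):.4f}s, "
      f"throughput mean={mean_tp_b:.2f} stdev={sd_tp_b:.2f} GB/s")
```

Since both data sets share the same time statistics, no formula that takes only those two numbers as input can return the correct throughput statistics for both.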