There are, in fact, two different formulas for standard deviation here: The population standard deviation $\sigma$ and the sample standard deviation $s$.
If $x_1, x_2, \ldots, x_N$ denote all $N$ values from a population, then the (population) standard deviation is
$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^N (x_i - \mu)^2},$$
where $\mu$ is the mean of the population.
If $x_1, x_2, \ldots, x_N$ denote $N$ values from a sample, however, then the (sample) standard deviation is
$$s = \sqrt{\frac{1}{N-1} \sum_{i=1}^N (x_i - \bar{x})^2},$$
where $\bar{x}$ is the mean of the sample.
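To see the two divisors side by side, here is a minimal sketch using NumPy, whose `std` divides by $N - \text{ddof}$ ("delta degrees of freedom"); the data set is made up for illustration:

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # toy data, mean 5

sigma = np.std(x, ddof=0)  # divides by N: use when x is the whole population
s = np.std(x, ddof=1)      # divides by N-1: use when x is a sample

print(sigma)  # 2.0
print(s)      # about 2.14
```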
The reason for the change of formula in the sample case is this: when you calculate $s$, you are normally using $s^2$ (the sample variance) to estimate $\sigma^2$ (the population variance). The problem is that if you don't know $\sigma$, you generally don't know the population mean $\mu$ either, so you have to use $\bar{x}$ in place of $\mu$ in the formula. Doing so introduces a slight bias: since $\bar{x}$ is calculated from the sample, the values $x_i$ are on average closer to $\bar{x}$ than they are to $\mu$, and so the sum of squares $\sum_{i=1}^N (x_i - \bar{x})^2$ is on average smaller than $\sum_{i=1}^N (x_i - \mu)^2$. It just so happens that this bias is exactly corrected by dividing by $N-1$ instead of $N$. (Proving this is a standard exercise in an advanced undergraduate or beginning graduate course in statistical theory.) The technical term is that $s^2$ (because of the division by $N-1$) is an unbiased estimator of $\sigma^2$.
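This bias is easy to see in simulation. The following is a small Monte Carlo sketch of the claim; the population (normal with $\sigma^2 = 4$) and the sample size $N = 5$ are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5  # small samples make the bias easy to see

samples = rng.normal(loc=10.0, scale=2.0, size=(100_000, N))
xbar = samples.mean(axis=1, keepdims=True)
ss = ((samples - xbar) ** 2).sum(axis=1)  # sum of squared residuals per sample

print(ss.mean() / N)        # about 3.2: biased low, equals (N-1)/N * sigma^2
print(ss.mean() / (N - 1))  # about 4.0: matches the true variance sigma^2 = 4
```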
Another way to think about it is that with a sample you have $N$ independent pieces of information. But since $\bar{x}$ is the average of those $N$ values, the residuals $x_i - \bar{x}$ sum to zero, so if you know $x_1 - \bar{x}, x_2 - \bar{x}, \ldots, x_{N-1} - \bar{x}$, you can work out what $x_N - \bar{x}$ must be. So when you square and add up the residuals $x_i - \bar{x}$, there are really only $N-1$ independent pieces of information, and in that sense dividing by $N-1$ rather than $N$ makes sense. The technical term is that there are $N-1$ degrees of freedom in the residuals $x_i - \bar{x}$.
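A quick numeric check of this point (the data values are arbitrary):

```python
import numpy as np

x = np.array([3.0, 7.0, 8.0, 12.0])
r = x - x.mean()  # residuals x_i - xbar

print(r.sum())               # 0.0 (up to rounding): residuals sum to zero
print(r[-1], -r[:-1].sum())  # so the last residual is determined by the rest
```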
For more information, see Wikipedia's article on the sample standard deviation.
First we need to get clear about what you mean by these words. Note that when you say that the "mean throughput" is 1 GB / 0.2 s = 5 GB/s, this is not the mean of the throughputs in the individual measurements. For example, if you measured two transfer times for 1 GB, one 0.1 s and one 0.3 s, the mean transfer time for 1 GB is 0.2 s, the throughputs in the individual measurements are 10 GB/s and about 3.3 GB/s, and the mean of those two throughputs is about 6.7 GB/s, not 5 GB/s.
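In code, the two quantities look like this (plain Python, numbers taken from the example above):

```python
data_gb = 1.0
times_s = [0.1, 0.3]  # two measured transfer times for 1 GB

mean_time = sum(times_s) / len(times_s)          # 0.2 s
throughput_at_mean_time = data_gb / mean_time    # 5.0 GB/s: what you quoted

throughputs = [data_gb / t for t in times_s]     # 10.0 GB/s and about 3.3 GB/s
mean_throughput = sum(throughputs) / len(throughputs)  # about 6.7 GB/s

print(throughput_at_mean_time, mean_throughput)
```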
What you quoted is an estimate for the expected throughput as the sample time and the data size go to infinity. That may well be the quantity you're actually interested in, but that quantity doesn't have a standard deviation. In the limit as the sample time and the data size go to infinity, the standard deviation of the throughput goes to zero, so there's no meaningful notion of standard deviation in this limit.
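To illustrate the limit claim with a toy simulation (this assumes a simple model in which the time to transfer $n$ chunks is a sum of $n$ i.i.d. per-chunk times; real transfers need not behave this way):

```python
import numpy as np

rng = np.random.default_rng(1)

for n_chunks in (1, 10, 100, 1000):
    # total transfer time = sum of i.i.d. gamma-distributed per-chunk times
    times = rng.gamma(4.0, 0.05, size=(50_000, n_chunks)).sum(axis=1)
    throughput = n_chunks / times  # chunks per second
    print(n_chunks, throughput.std())  # the spread shrinks as n grows
```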
Thus, "the standard deviation of the throughput" only makes sense for a specified sample time or data size. Presumably you mean the standard deviation of the throughputs for 1GB of data, i.e. simply the standard deviation of the throughputs in your individual measurements. But if this is what you want, it's important to understand that this is not the standard deviation corresponding to the quantity you call "the mean throughput"; it's the standard deviation corresponding to the quantity which in my above example would have been about 6.6GB/s.
If this is what you're after, you're out of luck and you'll have to remeasure: not only can the standard deviation of the individual throughputs not be reconstructed from the mean and standard deviation of the sample times; even the mean of the individual throughputs (e.g. 6.7 GB/s) can't be reconstructed from those values.
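Here is a sketch of why the reconstruction fails: two sets of transfer times with identical mean and standard deviation but different mean throughputs (the numbers are constructed for illustration):

```python
import numpy as np

t1 = np.array([0.1, 0.2, 0.3])  # transfer times in seconds for 1 GB

# Build t2 = [a, b, c] with the same sum and sum of squares as t1, so that
# the means and standard deviations agree; a = 0.12 is an arbitrary choice.
a = 0.12
s = t1.sum() - a            # b + c
q = (t1 ** 2).sum() - a**2  # b^2 + c^2
disc = np.sqrt(2 * q - s**2)
t2 = np.array([a, (s - disc) / 2, (s + disc) / 2])

print(t1.mean(), t2.mean())              # identical means of the times
print(t1.std(), t2.std())                # identical standard deviations
print((1 / t1).mean(), (1 / t2).mean())  # different mean throughputs
```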
Note that the standard deviation is based on the squared differences from the mean. Let's write down the squared differences for your data sets:
Set 1: $40^2, 30^2, 0^2, 30^2, 40^2$,
Set 2: $40^2, 20^2, 0^2, 20^2, 40^2$.
Since the squared deviations in the first set are on average larger, that set has the larger mean squared deviation, and hence the larger standard deviation.
Or, without calculating: since the data in the first set are on average farther from the mean, that set has the larger standard deviation.
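To check this numerically, here are concrete data sets with exactly the deviations listed above (the raw values, both with mean 50, are an assumed reconstruction, since only the deviations are given):

```python
import numpy as np

set1 = np.array([10, 20, 50, 80, 90])  # deviations -40, -30, 0, 30, 40
set2 = np.array([10, 30, 50, 70, 90])  # deviations -40, -20, 0, 20, 40

print(np.std(set1))  # sqrt(5000/5) = sqrt(1000), about 31.6
print(np.std(set2))  # sqrt(4000/5) = sqrt(800), about 28.3
```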