[Math] The standard deviation is more stable than the mean

means, normal distribution, probability, standard deviation, statistics

In an introduction to hypothesis testing, a book on probability and statistics for engineering students asserts that "the standard deviation is more stable than the mean" (paraphrased). The context, also paraphrased, is as follows:

A machine packs packages of powdered sugar. If the machine works properly, then the weight of each package is normally distributed with mean 0.5 kg and standard deviation 0.015 kg. One day, a worker randomly selected 9 bags of powdered sugar packed by the machine, with the following weights:

0.497kg, 0.506kg, 0.518kg, 0.524kg, 0.498kg, 0.511kg, 0.520kg, 0.515kg, 0.512kg

Use this to decide whether the machine was working properly.

In order to derive a solution, the book does the following (still paraphrased):

Let the random variable $X$ denote the weight of a package of powdered sugar on this particular day, and let $\mu$ and $\sigma$ denote the mean and the standard deviation of $X$ respectively. Experience suggests that the standard deviation is more stable than the mean. So we may suppose that $\sigma=0.015$. Thus $X\sim \mathcal{N}(\mu, 0.015^2)$, with $\mu$ unknown. With this in mind, we propose two hypotheses:
$$H_0: \mu = 0.5, \qquad H_1: \mu\neq 0.5.$$
We want to use the data available to decide which one to accept.

The book then proceeds to introduce the statistic
$$ \frac{\bar{X}-\mu}{\sigma/\sqrt{n}}$$
to do the hypothesis testing.
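As a quick sketch of the test the book sets up (my own code, not the book's), the statistic can be evaluated under $H_0$ with $\mu = 0.5$, $\sigma = 0.015$, $n = 9$, and compared against the usual two-sided 5% critical value of $1.96$:

```python
import math

# Nine observed package weights (kg) from the example
weights = [0.497, 0.506, 0.518, 0.524, 0.498, 0.511, 0.520, 0.515, 0.512]

mu0 = 0.5      # hypothesized mean under H0
sigma = 0.015  # population SD, taken as known ("more stable than the mean")
n = len(weights)

xbar = sum(weights) / n
z = (xbar - mu0) / (sigma / math.sqrt(n))

print(f"sample mean = {xbar:.6f}, z = {z:.3f}")
# Here |z| exceeds 1.96, so at the 5% level H0 is rejected:
# the data suggest the machine was not working properly.
```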

The questionable assumption about the standard deviation being equal to 0.015 aside, I want to know whether it is true that 'experience suggests that the standard deviation is more stable than the mean'. And if this is true, do we have a theoretical explanation?

I was thinking that perhaps we could interpret it this way: denote the weight of a package produced by the machine when it is working properly by the random variable $Y$, so that $Y\sim \mathcal{N}(\lambda, \theta^2)$ for some $\lambda$ and $\theta>0$. Then, taking all possible factors into account, maybe we could view the weight of a package produced by the machine in each possible state (broken or not) as a new variable $Y_i\sim \mathcal{N}(\lambda, \theta^2)$, with all the $Y_i$ independent and identically distributed. The statement would then say that for $n$ large enough
$$ D(\bar{Y}) \geq D(S), \tag{1}$$
where $\bar{Y}$ is the sample mean $\bar{Y}=\frac{1}{n}\sum_{i=1}^n Y_i$, and $S$ is the sample standard deviation, with $S^2=\frac{1}{n-1}\sum_{i=1}^n (Y_i-\bar{Y})^2$. But it is not very difficult to see that Inequality $(1)$ is not always true. Failing to find any reference about this statement, I ask for your help!
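As a numerical illustration of the comparison in Inequality $(1)$ (my own check, using the $\lambda = 0.5$, $\theta = 0.015$, $n = 9$ values from the example), a small Monte Carlo simulation estimates $D(\bar{Y})$ and $D(S)$ for normal samples:

```python
import random
import statistics

random.seed(0)
n, trials = 9, 20000
lam, theta = 0.5, 0.015  # lambda and theta from the setup above

means, sds = [], []
for _ in range(trials):
    sample = [random.gauss(lam, theta) for _ in range(n)]
    means.append(statistics.mean(sample))
    sds.append(statistics.stdev(sample))  # sample SD with the n-1 divisor

var_mean = statistics.pvariance(means)  # estimates D(Ybar) = theta^2 / n
var_sd = statistics.pvariance(sds)      # estimates D(S)
print(f"D(Ybar) ~ {var_mean:.3e}, D(S) ~ {var_sd:.3e}")
# For normal data, even at n = 9 the sample SD varies less than the
# sample mean -- consistent with the asymptotics in the answer below,
# though this does not make (1) true for every distribution.
```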

Best Answer

You should interpret the statement "experience suggests that the standard deviation is more stable than the mean" as the assertion that even when the machine is not working properly, the standard deviation of a package weight remains unchanged. Judging from the context, it's a statement about the population $\sigma$, not the sample standard deviation. So the author is trying to justify the continued use of $\sigma = 0.015$ no matter what the value of $\mu$ is. It seems unnecessary to even make this remark (since the hypothesis test is conducted under the assumption that the machine is working properly) unless there is a later exercise in computing the power of the test, where you need to consider the distribution of the test statistic when $\mu\ne0.5$.

If the statement is interpreted as an assertion about the stability of the sample standard deviation compared to that of the sample mean, you can show that for the normal distribution, asymptotically the sample SD has half the variance of the sample mean. For an IID sample from any distribution the variance of the sample mean is $$ \frac{\sigma^2}n.\tag1$$ Less commonly known is the variance of the sample standard deviation. For the normal distribution it turns out to be $$ \sigma^2\left[ 1 - \frac2{n-1}\left(\frac{\Gamma(n/2)}{\Gamma(\frac{n-1}2)}\right)^2\right],\tag2 $$ which behaves like $\frac{\sigma^2}{2n}$ as $n\to\infty$ [see Section 3 of this paper].
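A small sketch (my own) evaluating formula (2) confirms both the exact value at $n=9$ and the $\frac{\sigma^2}{2n}$ asymptotics; `math.lgamma` is used because `math.gamma` overflows for large $n$:

```python
import math

def var_sample_sd_factor(n):
    """Exact Var(S)/sigma^2 for an IID normal sample of size n, per formula (2)."""
    log_r = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return 1 - (2 / (n - 1)) * math.exp(2 * log_r)

for n in (9, 50, 500, 5000):
    print(f"n={n}: exact factor = {var_sample_sd_factor(n):.6f}, "
          f"asymptotic 1/(2n) = {1 / (2 * n):.6f}")
# The exact factor approaches 1/(2n) as n grows; at n = 9 it is about 0.060,
# versus 1/n = 0.111 for the sample mean.
```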

When $n=9$ formula (1) evaluates to $0.11\sigma^2$ while (2) is approximately $0.06\sigma^2$. Nonetheless, this fact should not be interpreted as justification for using the sample SD in place of the population SD, and notice that the example does not do so.
