[Math] The standard deviation of the sampling distribution of the sample standard deviation

statistical-inference, statistics

I know that, for the sampling distribution of the sample mean given a large sample or a normally distributed underlying population, the mean of the sampling distribution is the mean of the underlying population, and the standard deviation of the sampling distribution is the standard deviation of the underlying population divided by the square root of the sample size.
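That $\sigma/\sqrt n$ fact can be checked with a quick simulation sketch in plain Python (the population mean, standard deviation, sample size, and replication count below are arbitrary illustration values):

```python
import random
import statistics

random.seed(0)

# Assumed setup: normal population with mean 10 and standard deviation 3,
# samples of size n = 25, many replications.
mu, sigma, n, reps = 10.0, 3.0, 25, 20000

# Sampling distribution of the sample mean: one sample mean per replication.
sample_means = [
    statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
    for _ in range(reps)
]

# The empirical standard deviation of the sample means should be close
# to sigma / sqrt(n) = 3 / 5 = 0.6.
print(statistics.stdev(sample_means))
```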

I also know that in general, the mean of a sample distribution for an unbiased estimator is the population parameter that is estimated.

Assuming that the sample size is large, what is the standard deviation of the sample distribution of a sample statistic, say, the sample standard deviation?

Best Answer

Your first paragraph is entirely correct. Your second paragraph as it stands is too vague to be correct or incorrect. It would be correct if it were re-phrased as follows:

“the mean of the sampling distribution of an unbiased estimator is, by the definition of unbiasedness, the population parameter being estimated.”

Your third paragraph starts out saying this:

Assuming that the sample size is large, what is the standard deviation of the sample distribution of a sample statistic [?]

That depends on the population distribution and on which statistic it is. (Maybe I'll post a bit on that later.)

Then you go on:

say, the sample standard deviation?

Just exactly what a sample standard deviation is might be a question to examine. If the sample is $X_1,\ldots,X_n$, and the population is normally distributed, is it the maximum-likelihood estimator $\sqrt{\frac 1 n \sum_{i=1}^n (X_i-\bar X)^2}$ of the population standard deviation, where $\bar X = (X_1+\cdots+X_n)/n$, or is it the square root of the unbiased estimator of the variance, $\sqrt{\frac 1 {n-1} \sum_{i=1}^n (X_i-\bar X)^2}$ (which is not an unbiased estimator of the population standard deviation), or something else? The topic of unbiased estimation of standard deviations is somewhat involved, and unbiasedness in statistical estimation is over-rated.
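To illustrate that the square root of the unbiased variance estimator is not itself unbiased for $\sigma$, here is a simulation sketch (the choices $\sigma = 1$, $n = 5$, and the replication count are arbitrary):

```python
import random
import statistics

random.seed(1)

sigma, n, reps = 1.0, 5, 50000  # arbitrary illustration values

# For each replication, draw a normal sample and compute
# S = square root of the unbiased variance estimator (denominator n - 1);
# statistics.stdev uses exactly that n - 1 denominator.
s_values = [
    statistics.stdev(random.gauss(0.0, sigma) for _ in range(n))
    for _ in range(reps)
]

# By Jensen's inequality, E[S] < sigma even though E[S^2] = sigma^2.
# For a normal sample of size 5, E[S] = c4 * sigma with c4 ≈ 0.94.
print(statistics.fmean(s_values))  # noticeably below sigma = 1
```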

But if you ask about the variance, then something can be said. The statistic $$ S^2 = \frac 1 {n-1}\sum_{i=1}^n (X_i - \bar X)^2 $$ is an unbiased estimator of the variance of a normally distributed population, and it can be shown that $$ \frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}. \tag 1 $$ A $\chi^2_{n-1}$ random variable has variance $2(n-1)$; since $S^2 = \frac{\sigma^2}{n-1}\cdot\frac{(n-1)S^2}{\sigma^2}$, it follows that $$ \operatorname{var}(S^2) = \left(\frac{\sigma^2}{n-1}\right)^2 \cdot 2(n-1) = \frac{2\sigma^4}{n-1}. \tag 2 $$ I've posted a proof of $(1)$ on stackexchange a few times. Proving $(2)$ from scratch takes some work, but it's elementary, involving just some integrations by parts and the like.
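Formula $(2)$ can also be checked empirically. The following sketch (arbitrary choices of $\sigma$, $n$, and replication count) compares the simulated variance of $S^2$ against $2\sigma^4/(n-1)$:

```python
import random
import statistics

random.seed(2)

sigma, n, reps = 2.0, 10, 40000  # arbitrary illustration values

# One unbiased sample variance S^2 per replication, each from a normal
# sample of size n; statistics.variance uses the n - 1 denominator.
s2_values = [
    statistics.variance(random.gauss(0.0, sigma) for _ in range(n))
    for _ in range(reps)
]

theoretical = 2 * sigma**4 / (n - 1)  # 2 * 16 / 9, about 3.556
print(statistics.variance(s2_values), theoretical)
```

The two printed numbers should agree to within Monte Carlo error.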