[Math] Standard Error of the Sampling Distribution

statistics

Let $P = \{2,5,7,10\}$ be a population of size $N = 4$, and let $n = 2$ be the size of the samples taken from $P$. There are $6$ samples of size $2$ from $P$: $S_1 = \{2,5\}, S_2 = \{2,7\}, S_3 = \{2,10\}, S_4 = \{5,7\}, S_5 = \{5,10\}, S_6 = \{7,10\}$. Call the sample means of those $6$ samples $\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{x}_4, \bar{x}_5, \bar{x}_6$. We have $\bar{x}_1 = 3.5, \bar{x}_2 = 4.5, \bar{x}_3 = 6, \bar{x}_4 = 6, \bar{x}_5 = 7.5, \bar{x}_6 = 8.5$. I then calculated the population standard deviation of the original population and obtained $\sigma = 2.915476$, and the standard error (the standard deviation of the six sample means) $\sigma_{\bar{x}} = 1.683251$. With $n = 2$, we have:
$\dfrac{\sigma}{\sqrt{n}} = \dfrac{2.915476}{\sqrt{2}} = 2.061553$. Clearly, in this example, $\dfrac{\sigma}{\sqrt{n}} = 2.061553 \neq 1.683251 = \sigma_{\bar{x}}$. This contradicts the formula in most statistics textbooks, which states that $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$. What went wrong? In my calculation, I used the formulas for the population standard deviation and not the sample standard deviation. I hope someone can clear this up.

Edit: I found my mistake. The formula holds when the samples are drawn with replacement, in which case there are a total of $16$ (ordered) samples rather than $6$.
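As a quick numerical check of both the original calculation and the edit, here is a minimal Python sketch. It enumerates the $6$ samples without replacement and the $16$ ordered samples with replacement, then compares the standard deviation of the sample means with $\sigma/\sqrt{n}$. The helper `pop_sd` and the variable names are illustrative choices, not part of the question. The finite population correction factor $\sqrt{(N-n)/(N-1)}$ is not mentioned in the question; it is included only as the standard adjustment for sampling without replacement.

```python
from itertools import combinations, product
from math import sqrt

P = [2, 5, 7, 10]
N, n = len(P), 2


def pop_sd(values):
    """Population standard deviation (divisor len(values), not len(values) - 1)."""
    m = sum(values) / len(values)
    return sqrt(sum((v - m) ** 2 for v in values) / len(values))


sigma = pop_sd(P)

# Without replacement: the 6 unordered samples S_1, ..., S_6 from the question.
means_without = [sum(s) / n for s in combinations(P, n)]
se_without = pop_sd(means_without)

# With replacement: all 4 * 4 = 16 ordered pairs, each equally likely.
means_with = [sum(s) / n for s in product(P, repeat=n)]
se_with = pop_sd(means_with)

# Assumption for illustration: the usual finite population correction factor.
fpc = sqrt((N - n) / (N - 1))

print("sigma / sqrt(n)        =", round(sigma / sqrt(n), 6))
print("SE with replacement    =", round(se_with, 6))
print("SE without replacement =", round(se_without, 6))
print("sigma / sqrt(n) * fpc  =", round(sigma / sqrt(n) * fpc, 6))
```

The first two printed values should agree (about $2.061553$), and so should the last two (about $1.683251$), which is the figure obtained in the question.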

Best Answer

The standard error of the mean, what you call $\sigma_{\bar x}$, is a function of a sample drawn from the population.

Suppose we model the population as a discrete random variable $X$ with probability mass function $$\Pr[X = 2] = \Pr[X = 5] = \Pr[X = 7] = \Pr[X = 10] = \frac{1}{4}.$$ Then $\mu = \operatorname{E}[X] = 6$ and $\sigma^2 = \operatorname{Var}[X] = \frac{17}{2}$ as we would expect. Then a sample is a set of IID random variables $$(X_1, X_2, \ldots, X_n)$$ drawn from this distribution, and the sample mean and the variance of the sample mean are $$\bar X = \frac{1}{n} \sum_{i=1}^n X_i, \quad \operatorname{Var}[\bar X] \overset{\text{iid}}{=} \frac{\operatorname{Var}[X]}{n} = \frac{\sigma^2}{n}.$$ Thus the standard error of the mean is $$SEM = \sqrt{\operatorname{Var}[\bar X]} = \frac{\sigma}{\sqrt{n}}.$$ None of these formulas relies on any particular distributional assumption, only that the population mean and variance are finite.
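The variance identity marked "iid" can be unpacked in one line. As a sketch, it uses only two facts: constants come out of a variance squared, and the variance of a sum of independent random variables is the sum of their variances, each of which equals $\sigma^2$ here: $$\operatorname{Var}[\bar X] = \operatorname{Var}\left[\frac{1}{n}\sum_{i=1}^n X_i\right] = \frac{1}{n^2}\sum_{i=1}^n \operatorname{Var}[X_i] = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}.$$ Independence is exactly what fails when sampling without replacement from a finite population, since the draws are then negatively correlated.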

To apply this to the above distribution, suppose we draw a sample of size $n = 2$, say $(X_1, X_2)$. This yields $$\operatorname{Var}\left[\frac{X_1 + X_2}{2}\right] = \frac{17}{4} \implies SEM = \frac{\sqrt{17}}{2}.$$ Such a sample is taken with replacement; the joint distribution is uniform over the $4 \times 4$ table of ordered pairs taken from $P$, namely $$\begin{array}{cccc} (2,2) & (2,5) & (2,7) & (2,10) \\ (5,2) & (5,5) & (5,7) & (5,10) \\ (7,2) & (7,5) & (7,7) & (7,10) \\ (10,2) & (10,5) & (10,7) & (10,10) \end{array}$$ so each such outcome has probability $1/16$. In general, a sample of size $n$ would have a joint distribution on the set of ordered $n$-tuples whose elements are drawn from $P$.
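The claim for ordered $n$-tuples can be checked exactly for small $n$ by brute-force enumeration. The following is a minimal sketch using exact rational arithmetic; the function names `exact_variance` and `var_of_sample_mean` are hypothetical helpers introduced here for illustration. It assumes, as in the model above, that every ordered $n$-tuple from $P$ is equally likely.

```python
from fractions import Fraction
from itertools import product

P = [2, 5, 7, 10]


def exact_variance(values):
    """Variance of a list of equally likely values, as an exact Fraction."""
    values = [Fraction(v) for v in values]
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def var_of_sample_mean(n):
    """Variance of the sample mean over all 4**n equally likely ordered n-tuples."""
    means = [Fraction(sum(t), n) for t in product(P, repeat=n)]
    return exact_variance(means)


sigma2 = exact_variance(P)  # population variance, 17/2

# Compare the enumerated variance of the sample mean with sigma^2 / n.
for n in (1, 2, 3, 4):
    print(n, var_of_sample_mean(n), sigma2 / n)
```

For $n = 2$ the enumeration runs over exactly the $16$ ordered pairs listed above and gives $17/4$, so the standard error is $\sqrt{17}/2$.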