I measure the weight of 100 people and pick 20 lots of 5 people from these 100. Therefore n = 5, repeated 20 times. The 100 people represent the population and groups of 5 people represent samples from the population.
The mean weights (kg) of my 20 samples of n = 5 are 76, 71, 81, 75, 77, 78, 91, 80, 70, 78, 72, 78, 75, 75, 79, 70, 80, 74, 75, 75.
In R, I can combine these mean weights into a vector called x using this code:
x <- c(76, 71, 81, 75, 77, 78, 91, 80, 70, 78, 72, 78, 75, 75, 79, 70, 80, 74, 75, 75)
If I wanted to calculate the standard error of the mean weights, would I simply do the code below?
sd(x)
When we repeatedly sample from a population and calculate the standard deviation of the sampling distribution of the mean, should the result still be referred to as the standard error, or just as the standard deviation of the means?
Best Answer
I think the confusion here is partly due to the looseness in the way we typically talk about these topics (myself included). There isn't really any such thing as "the standard error". There is a standard error of the mean, and a standard error of the variance, etc., but not a standard error per se.
The standard deviation of a set of means drawn from the same population is an estimate of the standard error of the mean. The standard deviation of a single sample, divided by $\sqrt{N}$ (where $N$ is the size of that sample), is also an estimate of the standard error of the mean. This was Fisher's original insight leading to the ANOVA test (see my answer here: How does the standard error work?).
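As a rough sketch, the two estimates described above could be computed in R as follows. The vector x is taken from the question. The raw weights of the individual samples are not given, so one_sample below is hypothetical data made up purely for illustration, as are the names se_from_means and se_from_sample.

```r
# Means of the 20 samples (n = 5 each), copied from the question
x <- c(76, 71, 81, 75, 77, 78, 91, 80, 70, 78, 72, 78, 75, 75, 79, 70, 80, 74, 75, 75)

# Estimate 1: the standard deviation of the sample means
se_from_means <- sd(x)

# Estimate 2: the standard deviation of one sample divided by sqrt(n).
# ASSUMPTION: `one_sample` is invented data; the question does not
# list the raw weights behind any of its 20 means.
one_sample <- c(68, 74, 81, 79, 78)
n <- length(one_sample)
se_from_sample <- sd(one_sample) / sqrt(n)

se_from_means
se_from_sample
```

The two numbers will generally not match exactly. Each is only an estimate of the same underlying quantity, and the second is based on just five observations.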
Two more notes: