Can I calculate the standard deviation of group averages knowing only the standard deviation of the population?

statistics

From a sample population of, let's say, 1000 numbers, I know the standard deviation SD.
I choose a group size of n=100 and fill the groups randomly with those 1000 numbers.
I compute the average of each group: a1,a2,…,a10

I want to find (or approximate) a new standard deviation, now taking as the sample the averages of those groups of 100 numbers: a1,a2,…,a10

The data I know at the start are the standard deviation of the whole population of 1000 numbers and the group size of 100. I need to find the new standard deviation of the sample a1,a2,…,a10 without knowing the ai values.

I have tried dividing the standard deviation of the full population by sqrt(n) and I get close. Is there a more precise way?

Best Answer

I think you are doing it correctly by dividing the sample standard deviation by $\sqrt{n}$. However, there is a reason why the number is not exactly what you expect.

The Central Limit Theorem says that, whatever the distribution of the population (it need not be normal, as long as its variance is finite), the average of a sufficiently large sample is approximately normally distributed. That means that if you draw an infinite number of groups of 100 and compute the mean of each group, these means will follow an approximately normal distribution. The standard deviation of the mean estimate (that is, the standard error) is indeed the population standard deviation divided by $\sqrt{n}$.
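To see the $\sqrt{n}$ rule on concrete numbers, here is a minimal sketch using only the Python standard library. The data are an assumption (1000 draws from a skewed exponential distribution standing in for your numbers), and the helper name `sd_of_group_means` is made up for this illustration.

```python
import math
import random
import statistics


def sd_of_group_means(values, group_size, rng):
    """Shuffle a copy of values, cut it into consecutive groups,
    and return the sample SD of the group means."""
    shuffled = values[:]
    rng.shuffle(shuffled)
    means = [
        statistics.fmean(shuffled[i:i + group_size])
        for i in range(0, len(shuffled), group_size)
    ]
    return statistics.stdev(means)


rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical data: 1000 draws from a non-normal distribution.
population = [rng.expovariate(1.0) for _ in range(1000)]

n = 100                                   # group size
sd = statistics.stdev(population)         # SD of all 1000 values
predicted = sd / math.sqrt(n)             # the sqrt(n) rule
observed = sd_of_group_means(population, n, rng)  # SD of the 10 averages

print(f"SD of all values:        {sd:.4f}")
print(f"predicted SD / sqrt(n):  {predicted:.4f}")
print(f"observed SD of averages: {observed:.4f}")
```

The two last numbers should be in the same ballpark but will rarely match, because `observed` is computed from only 10 averages.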

However, if you don't get exactly the expected number, it is probably a sampling issue. With only 10 groups, the standard deviation computed from their means is itself an estimate and is subject to error. Student's t-distribution models this extra uncertainty that comes from estimating a standard deviation from a small sample, so it gives you a way to narrow down your estimates with confidence intervals. Alternatively, increase the number of samples.
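A rough way to see how large this sampling error is: reshuffle the same 1000 numbers many times and look at how much the standard deviation of the 10 averages moves around. This is only a sketch under assumptions: the uniform data, the 2000 reshuffles and the variable names are all made up for illustration.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable
# Hypothetical data standing in for the 1000 numbers in the question.
population = [rng.uniform(0, 10) for _ in range(1000)]
n = 100  # group size
predicted = statistics.stdev(population) / math.sqrt(n)

observed = []
for _ in range(2000):
    rng.shuffle(population)  # a new random assignment to 10 groups
    means = [
        statistics.fmean(population[i:i + n])
        for i in range(0, len(population), n)
    ]
    observed.append(statistics.stdev(means))

observed.sort()
# Roughly the central 95% of the 2000 reshuffles.
low, high = observed[50], observed[-51]
# Root mean square of the observed SDs, i.e. sqrt of the average variance.
rms = math.sqrt(statistics.fmean([s * s for s in observed]))

print(f"predicted SD / sqrt(n):   {predicted:.4f}")
print(f"central 95% of observed:  {low:.4f} to {high:.4f}")
print(f"RMS of observed SDs:      {rms:.4f}")
```

Any single split can land noticeably above or below the prediction, while the root mean square over many reshuffles should sit close to it. That is consistent with dividing by $\sqrt{n}$ being the right rule and the leftover gap being sampling noise.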
