Solved – Finding Standard error of the sample mean by making multiple samples behave like a single one

mean, population, sample, standard deviation, standard error

I know that the standard error of the sample mean estimates how good my sample mean is, compared to the population mean.

OK, that's fine, but suppose I have 10 independent samples, each with 100 observations.

With this data I can compute the mean of my 10 sample means, which should be a good estimate of the population mean.

But how good is this new estimate when compared to the true population mean?

Should I treat my 10 samples as a single one and use this formula again to find out?

$\large SE = \frac{s}{\sqrt n}$ where $n = 10 \times 100$

and $\large s=\sqrt{\frac{\sum_{i=1}^{10} (\bar x_i - \mu_{\bar x})^2}{10}}$ where $\large \mu_{\bar x} = \frac{\sum_{i=1}^{10}\bar x_i}{10}$ (the mean of the 10 sample means)

and of course $\bar x_i$ is the mean of a single sample.

EDIT:

How are the samples collected?

Assume I collect one sample by asking 100 different people how much time they spend on social media websites per day. I repeat this process 10 days in a row, each time in a different location, so I have 10 samples, each with 100 observations.

Best Answer

Using the clarifications in the comments to the question above: each subsample comes from interviewing a random sample of people at one location, the subsamples are drawn independently, and the locations are themselves a sample from the total population. So this could be called cluster sampling.

A simple model could be: $X_{i1}, \dotsc, X_{in} \sim \text{N}(\mu_i, \sigma^2)$, $i=1,2, \dotsc, N$. Here the subpopulation means $\mu_i$ are themselves sampled from a normal distribution $\text{N}(\mu_G, \sigma^2_G)$. The normal distribution assumption is not essential here; it simplifies the analysis, but the main points carry over under other distributional assumptions.
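To make the two-level structure concrete, here is a minimal sketch that draws data from this model using only the Python standard library. The function name `simulate_clustered_sample` and all parameter values (a global mean of 120 minutes per day, and so on) are hypothetical, chosen only to resemble the social media example in the question.

```python
import random


def simulate_clustered_sample(N, n, mu_G, sigma_G, sigma, rng):
    """Draw N subsamples of n observations each from the two-level model.

    Each subpopulation mean mu_i is drawn from N(mu_G, sigma_G^2); given mu_i,
    the observations x_ij are drawn from N(mu_i, sigma^2).
    """
    data = []
    for _ in range(N):
        mu_i = rng.gauss(mu_G, sigma_G)  # mean of one subpopulation (location)
        data.append([rng.gauss(mu_i, sigma) for _ in range(n)])
    return data


rng = random.Random(0)  # fixed seed so the run is reproducible
# Hypothetical values: 10 locations, 100 people each, minutes per day.
samples = simulate_clustered_sample(
    N=10, n=100, mu_G=120.0, sigma_G=15.0, sigma=40.0, rng=rng
)
print(len(samples), len(samples[0]))
```

Setting `sigma_G=0.0` makes every subsample share the same mean, which is the special case where pooling all observations into one sample is harmless.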

Note that here $\sigma^2$ is the variance of the individual observations within a subsample (assumed here to be the same for all the subsamples, an assumption we could do away with). The other variance parameter, $\sigma^2_G$, is the variance of the subpopulation means $\mu_i$. If that is zero, all the subpopulations have the same distribution, and we could really "treat the 10 subsamples like one large sample"; otherwise we should not.

The structure is the same as in the famous schools example from the BUGS documentation, see http://www.openbugs.net/Examples/Schools.html

We will estimate the global population mean $\mu_G$ with the average of the subgroup means, that is, $$ \hat{\mu}_G = \frac1{N} \sum_{i=1}^N \frac1{n}\sum_{j=1}^n x_{ij} = \frac1{N} \sum_{i=1}^N \bar{x}_i $$ Under the above assumptions this is unbiased, and now I will calculate its variance. $$ \begin{aligned} \mathbb{V}\, \hat{\mu}_G &= \mathbb{V} \left( \frac1{nN}\sum_{i=1}^N \sum_{j=1}^n x_{ij} \right) \\ &= \left(\frac1{N}\right)^2 \left( \mathbb{E}\, \mathbb{V} \left( \sum_{i=1}^N \bar{x}_i \mid \mu_1, \dotsc, \mu_N \right) + \mathbb{V}\, \mathbb{E} \left( \sum_{i=1}^N \bar{x}_i \mid \mu_1, \dotsc, \mu_N \right) \right) \\ &= \left(\frac1{N}\right)^2 \left( N \frac{\sigma^2}{n} + N \sigma_G^2 \right) \\ &= \frac1{N}\left(\frac{\sigma^2}{n} + \sigma_G^2\right) \end{aligned} $$ where we have used the law of total variance (https://en.wikipedia.org/wiki/Law_of_total_variance), and the outer operators are taken over the distribution of the subgroup means $\mu_i$. In fact, we didn't use the normal distribution assumption at all, only the expectations and variances.
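The variance formula can be checked numerically. The following is a rough Monte Carlo sketch, not part of the original answer: it repeatedly simulates the whole two-level design, computes $\hat{\mu}_G$ each time, and compares the empirical variance of those estimates with $\frac1N(\sigma^2/n + \sigma_G^2)$. The helper name `grand_mean_once` and all parameter values are assumptions made for illustration.

```python
import random
import statistics


def grand_mean_once(N, n, mu_G, sigma_G, sigma, rng):
    """Simulate one full data set and return the mean of the N subsample means."""
    subsample_means = []
    for _ in range(N):
        mu_i = rng.gauss(mu_G, sigma_G)  # subpopulation mean
        xs = [rng.gauss(mu_i, sigma) for _ in range(n)]
        subsample_means.append(statistics.fmean(xs))
    return statistics.fmean(subsample_means)


# Hypothetical parameter values, kept small so the loop runs quickly.
N, n = 10, 20
mu_G, sigma_G, sigma = 0.0, 1.0, 2.0
rng = random.Random(1)  # fixed seed for reproducibility
reps = 4000

estimates = [grand_mean_once(N, n, mu_G, sigma_G, sigma, rng) for _ in range(reps)]
empirical_var = statistics.variance(estimates)

# Formula derived above: (sigma^2 / n + sigma_G^2) / N
theoretical_var = (sigma**2 / n + sigma_G**2) / N
# Roughly what s^2 / (n N) targets if all n*N observations are pooled as one sample.
naive_var = (sigma**2 + sigma_G**2) / (n * N)

print("empirical  :", round(empirical_var, 4))
print("theoretical:", round(theoretical_var, 4))
print("naive      :", round(naive_var, 4))
```

With a nonzero `sigma_G`, the pooled value is noticeably smaller than the theoretical one, which is why treating the subsamples as a single large sample understates the uncertainty.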

Now, to get an estimate of the standard error of the global mean, you can replace $\sigma^2$ and $\sigma^2_G$ in this formula with estimators and take the square root.

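To find those estimators, we can write the usual ANOVA decomposition (an older name for this model is Model II ANOVA, that is, random-effects ANOVA), with $\bar{x}$ denoting the grand mean: $$ \begin{aligned} \sum_i \sum_j (x_{ij} - \bar{x})^2 &= \sum_i \sum_j \left( ( x_{ij}-\bar{x}_i)+(\bar{x}_i - \bar{x})\right)^2 \\ &= \sum_i\sum_j\left( (x_{ij}-\bar{x}_i)^2 + (\bar{x}_i-\bar{x})^2 \right) \\ &= \sum_i \sum_j (x_{ij} -\bar{x}_i)^2 + n \sum_i (\bar{x}_i -\bar{x})^2 \end{aligned} $$ (where we have used that the cross terms sum to zero). From this we can read off the estimators $$ \hat{\sigma}^2 = \frac{\sum_i \sum_j (x_{ij} -\bar{x}_i)^2}{N(n-1)}, \qquad S^2_{\bar{x}} = \frac{\sum_i (\bar{x}_i -\bar{x})^2}{N-1} $$ Here $\hat{\sigma}^2$ is unbiased for $\sigma^2$. The sample variance of the subsample means, $S^2_{\bar{x}}$, is unbiased for $\sigma^2/n + \sigma^2_G$ (the variance of a single $\bar{x}_i$) rather than for $\sigma^2_G$ alone, so an unbiased estimator of the between-group variance is $$ \hat{\sigma}^2_G = S^2_{\bar{x}} - \frac{\hat{\sigma}^2}{n} $$ Substituting these into the variance formula for $\hat{\mu}_G$, the estimated standard error reduces to $\sqrt{S^2_{\bar{x}}/N}$. The modern way to calculate this, for instance in R, is to see it as a mixed effects model and use the package lme4.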
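As a rough illustration of how these quantities could be computed by hand for balanced data (every subsample has the same size $n$), here is a sketch using only the Python standard library. It relies on the fact that each $\bar{x}_i$ has variance $\sigma^2/n + \sigma^2_G$, so the sample variance of the subsample means estimates that sum directly and the standard error of $\hat{\mu}_G$ is its square root divided by $\sqrt{N}$. The function name `clustered_standard_error`, the dictionary keys, and the simulated data are all hypothetical; truncating the between-group estimate at zero is a convenience choice of this sketch, not something the answer prescribes.

```python
import math
import random
import statistics


def clustered_standard_error(samples):
    """Moment estimates for balanced clustered data (list of equal-length lists)."""
    N = len(samples)
    n = len(samples[0])
    means = [statistics.fmean(s) for s in samples]
    grand_mean = statistics.fmean(means)

    # Within-group variance: pooled squared deviations from each subsample mean.
    within_ss = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    sigma2_hat = within_ss / (N * (n - 1))

    # Sample variance of the subsample means; estimates sigma^2/n + sigma_G^2.
    var_of_means = sum((m - grand_mean) ** 2 for m in means) / (N - 1)

    # Between-group variance by subtraction, truncated at zero (sketch's choice).
    sigma2_G_hat = max(var_of_means - sigma2_hat / n, 0.0)

    pooled = [x for s in samples for x in s]
    return {
        "grand_mean": grand_mean,
        "sigma2_hat": sigma2_hat,
        "sigma2_G_hat": sigma2_G_hat,
        "se": math.sqrt(var_of_means / N),
        # For comparison only: SE if all observations were one simple random sample.
        "naive_se": statistics.stdev(pooled) / math.sqrt(len(pooled)),
    }


# Hypothetical data: 10 locations, 100 respondents each, minutes per day.
rng = random.Random(2)
data = []
for _ in range(10):
    mu_i = rng.gauss(120.0, 15.0)
    data.append([rng.gauss(mu_i, 40.0) for _ in range(100)])

for key, value in clustered_standard_error(data).items():
    print(f"{key}: {value:.3f}")
```

For unbalanced designs, or when confidence intervals for the variance components are wanted, a mixed effects fit as suggested in the answer is the more appropriate tool.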