So, using the clarifications in the comments to the question above: the different subsamples come from interviewing a random sample of people at each of several locations, and the locations are themselves a sample from the total population. This could be called cluster sampling.
A simple model could be: $X_{i1}, \dotsc, X_{in} \sim \text{N}(\mu_i, \sigma^2)$, $i=1,2, \dotsc, N$, where the subpopulation means $\mu_i$ are themselves sampled from a normal distribution $\text{N}(\mu_G, \sigma^2_G)$. The normal distribution assumptions are not essential here; they simplify the analysis, but the main points can be repeated under other distributional assumptions.
Note that here $\sigma^2$ is the variance of the individual observations within a subsample (assumed here to be the same value for all the subsamples, an assumption we could do away with). The other variance parameter, $\sigma^2_G$, is the variance of the subsample means. If that is zero, all the subpopulations have the same distribution, and we could really "treat the 10 subsamples like one large sample", otherwise we should not.
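To make the model concrete, here is a small simulation of one such dataset; the parameter values ($N=10$, $n=20$, $\mu_G=5$, $\sigma_G=1.5$, $\sigma=2$) are made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N, n = 10, 20             # number of subsamples, observations per subsample
mu_G, sigma_G = 5.0, 1.5  # mean and sd of the subpopulation means
sigma = 2.0               # within-subsample sd

mu = rng.normal(mu_G, sigma_G, size=N)           # subpopulation means mu_i
x = rng.normal(mu[:, None], sigma, size=(N, n))  # observations x_ij

print(x.shape)   # (10, 20): one row per subsample
print(x.mean())  # the estimate of mu_G discussed below
```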
The structure is the same as in the famous schools example from the BUGS documentation, see http://www.openbugs.net/Examples/Schools.html
We will estimate the global population mean $\mu_G$ with the average of the subgroup means, that is,
$$
\hat{\mu}_G = \frac1{N} \sum_{i=1}^N \frac1{n}\sum_{j=1}^n x_{ij}
$$
Under the above assumptions this will be unbiased, and now I will calculate its variance.
$$
\DeclareMathOperator{\E}{\mathbb{E}}\DeclareMathOperator{\V}{\mathbb{V}}
\V \hat{\mu}_G = \V \left( \frac1{nN}\sum_{i=1}^N \sum_{j=1}^n x_{ij} \right) \\
= \frac1{N^2} \left( \E \V \left( \sum_{i=1}^N \bar{x}_i \,\middle|\, \mu_1, \dotsc, \mu_N \right) +
\V \E \left( \sum_{i=1}^N \bar{x}_i \,\middle|\, \mu_1, \dotsc, \mu_N \right) \right) \\
= \frac1{N^2} \left( N \frac{\sigma^2}{n} + N \sigma_G^2 \right) = \frac1{N}\left(\frac{\sigma^2}{n} + \sigma_G^2\right)
$$
where we have used the law of total variance https://en.wikipedia.org/wiki/Law_of_total_variance and the outer operators are over the distribution of the subgroup means $\mu_i$. In fact, we didn't use the normal distribution assumption at all, only the expectations and variances.
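As a sanity check, the formula $\V \hat{\mu}_G = \frac1{N}(\sigma^2/n + \sigma_G^2)$ can be verified by Monte Carlo simulation; the parameter values below are arbitrary, chosen only for the sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
N, n = 10, 20
mu_G, sigma_G, sigma = 5.0, 1.5, 2.0
reps = 20000

# Draw `reps` independent datasets from the hierarchical model
mu = rng.normal(mu_G, sigma_G, size=(reps, N))           # subgroup means
x = rng.normal(mu[..., None], sigma, size=(reps, N, n))  # observations
grand_means = x.mean(axis=(1, 2))                        # one estimate per dataset

theoretical = (sigma**2 / n + sigma_G**2) / N
print(grand_means.var(), theoretical)  # both approximately 0.245
```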
Now, to get an estimate of the standard error of the global mean, plug in estimators of $\sigma^2$ and $\sigma^2_G$ and take the square root.
To find those estimators, we can write the usual ANOVA decomposition (another, older name for this model is Model II ANOVA):
$$
\sum_i \sum_j (x_{ij} - \bar{x})^2 = \sum_i \sum_j \left( ( x_{ij}-\bar{x}_i)+(\bar{x}_i - \bar{x})\right)^2 \\
= \sum_i\sum_j\left( (x_{ij}-\bar{x}_i)^2 + (\bar{x}_i-\bar{x})^2 \right) \\
= \sum_i \sum_j (x_{ij} -\bar{x}_i)^2 + n \sum_i (\bar{x}_i -\bar{x})^2
$$
(where we have used that the cross terms sum to zero). Dividing the within-group sum of squares by its degrees of freedom gives an unbiased estimator of $\sigma^2$:
$$
\hat{\sigma}^2 = \frac{\sum_i \sum_j (x_{ij} -\bar{x}_i)^2}{N(n-1)}
$$
For the between-group part, note that the subsample means $\bar{x}_i$ are independent with common variance $\sigma_G^2 + \sigma^2/n$, so the between-group mean square
$$
\frac{\sum_i (\bar{x}_i -\bar{x})^2}{N-1}
$$
is unbiased for $\sigma_G^2 + \sigma^2/n$, not for $\sigma^2_G$ alone; an unbiased estimator of $\sigma^2_G$ is
$$
\hat{\sigma}^2_G = \frac{\sum_i (\bar{x}_i -\bar{x})^2}{N-1} - \frac{\hat{\sigma}^2}{n}
$$
Conveniently, $\sigma_G^2 + \sigma^2/n$ is exactly the combination appearing in the variance formula above, so the estimated variance of $\hat{\mu}_G$ is simply the between-group mean square divided by $N$:
$$
\widehat{\V \hat{\mu}_G} = \frac1{N} \cdot \frac{\sum_i (\bar{x}_i -\bar{x})^2}{N-1}
$$
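A minimal numerical illustration of these estimators in the balanced case, with the between-group mean square corrected by subtracting $\hat{\sigma}^2/n$; the tiny dataset is made up so the answers can be checked by hand:

```python
import numpy as np

def variance_components(x):
    """x: N-by-n matrix of observations, one row per subsample (balanced case)."""
    N, n = x.shape
    group_means = x.mean(axis=1)
    grand_mean = x.mean()
    # within-group variance estimate, df = N(n-1)
    sigma2_hat = ((x - group_means[:, None])**2).sum() / (N * (n - 1))
    # between-group mean square, unbiased for sigma_G^2 + sigma^2/n
    ms_between = ((group_means - grand_mean)**2).sum() / (N - 1)
    sigma2_G_hat = ms_between - sigma2_hat / n  # unbiased for sigma_G^2
    var_grand_mean = ms_between / N             # estimated variance of the grand mean
    return sigma2_hat, sigma2_G_hat, var_grand_mean

x = np.array([[1.0, 3.0], [2.0, 4.0], [6.0, 8.0]])
print(variance_components(x))  # (2.0, 6.0, 2.333...)
```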
The modern way to calculate this, for instance in R, is to view it as a mixed effects model and use the package lme4.
Well, typically we only have a single realization of the sample mean to work with; we are not often in a situation where we have several datasets drawn from the same population. Even if we did have multiple samples, why not just combine them to get an even more precise estimate of the mean, given that we likely care more about estimating the quantity itself?
However, resampling techniques do use this approach to estimate standard errors. The most familiar is probably the bootstrap, which can be used to estimate the standard error of almost any statistic. If you are unfamiliar: bootstrapping recomputes the statistic on many resampled datasets, and the standard deviation of those resampled statistics estimates the standard error of the statistic.
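A sketch of the bootstrap standard error of the mean, on toy normal data made up for the example; for the mean it should come out close to the analytic $s/\sqrt{n}$:

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(loc=10.0, scale=3.0, size=200)  # toy sample

B = 5000
# Resample with replacement B times, recomputing the mean each time
boot_means = np.array([
    rng.choice(data, size=data.size, replace=True).mean()
    for _ in range(B)
])
se_boot = boot_means.std(ddof=1)                    # bootstrap SE of the mean
se_analytic = data.std(ddof=1) / np.sqrt(data.size) # s / sqrt(n)
print(se_boot, se_analytic)  # both approximately 0.21
```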
Best Answer
This is a problem very often encountered in biology, where one runs a number of independent experiments (100 in your case), each with its own unknown mean, and the observations within each experiment are sampled i.i.d. The only thing one can do is estimate those means, again by i.i.d. sampling. Typically the mean $X_i$ is estimated from a sample of size $n_i$, so the variance of that estimate is $\sigma_i^2 = \sigma^2/n_i$, where $\sigma^2$ is the sampling variance, not the variance of $X$. Because each individual observation can be written in the form $X_i + \varepsilon_{ij}$, the variance of the experiment mean $\bar{X}_i$ is $V + \sigma^2/n_i$, where $V$ is the variance of $X$.
You can compute the grand mean as $\bar{X} = \frac{1}{n}\sum_{i=1}^{100}n_i\bar{X}_i$ (where $n = \sum_{i=1}^{100}n_i$), which is a sum of the independent variables $\frac{n_i}{n}\bar{X}_i$. Each of these has variance $\frac{n_i^2}{n^2}V+\frac{n_i}{n^2}\sigma^2$, so summing gives $\operatorname{Var}\bar{X} = V\sum_{i=1}^{100}\frac{n_i^2}{n^2}+\frac{\sigma^2}{n}$.
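This variance formula can likewise be checked by Monte Carlo simulation; the sample sizes and parameter values below (three experiments instead of 100, $V=1$, $\sigma^2=4$) are made up for the sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
n_i = np.array([5, 10, 20])  # per-experiment sample sizes
V, sigma2 = 1.0, 4.0         # variance of X_i and sampling variance
n = n_i.sum()
reps = 20000

grand_means = np.empty(reps)
for r in range(reps):
    X = rng.normal(0.0, np.sqrt(V), size=n_i.size)  # true experiment means X_i
    xbars = np.array([rng.normal(Xi, np.sqrt(sigma2), size=ni).mean()
                      for Xi, ni in zip(X, n_i)])   # per-experiment sample means
    grand_means[r] = (n_i * xbars).sum() / n        # weighted grand mean

theoretical = V * (n_i**2).sum() / n**2 + sigma2 / n
print(grand_means.var(), theoretical)  # both approximately 0.54
```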