Solved – Computing standard error of mean derived from multiple samples

mean, meta-analysis, pooling, standard error

I've had trouble finding a clear answer elsewhere on the internet and thought I'd put it to the CV community.

Problem Description

Suppose I have $N$ samples, each on a different subject. Each sample consists of $n_i$ measurements on a single subject, with each measurement yielding a value $x_j$. Each sample has a mean $\mu_i$ and standard deviation $\sigma_i$. I wish to combine the means into a single mean $\mu_T$ and test whether $\mu_T$ is statistically different from a particular value $y$.

I have access to all $x_j$, but I don't want to simply compute the mean and standard error over all $x_j$ pooled together, as different subjects may be more or less reliable.

To compute the combined mean, I am using the equation $\mu_T = \frac{\sum_i^N n_i \mu_i}{\sum_i^N n_i}$.
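As a quick numerical check (with made-up data), the weighted combination can be computed directly; note that with weights $n_i$ it coincides with the grand mean of all the pooled measurements:

```python
import numpy as np

# Hypothetical data: N = 3 subjects with unequal numbers of measurements.
samples = [
    np.array([4.8, 5.1, 5.3]),        # subject 1, n_1 = 3
    np.array([5.6, 5.4, 5.9, 5.7]),   # subject 2, n_2 = 4
    np.array([4.9, 5.2]),             # subject 3, n_3 = 2
]

n = np.array([len(s) for s in samples])     # n_i
mu = np.array([s.mean() for s in samples])  # per-subject means mu_i

# Combined mean: mu_T = sum(n_i * mu_i) / sum(n_i)
mu_T = np.sum(n * mu) / np.sum(n)

# With these weights, mu_T equals the grand mean of all x_j pooled together.
grand = np.concatenate(samples).mean()
assert np.isclose(mu_T, grand)
print(mu_T)
```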

Question 1:

What is the correct formula for the standard error of $\mu_T$? I've thought about using the pooled standard error,
\begin{equation}
\sigma_{ErrT} = \sqrt{\frac{\sum_i^N (n_i-1)\sigma_i^2}{\sum_i^N (n_i-1)}\sum_i^N \frac{1}{n_i}}
\end{equation}
but is this an appropriate estimate of the standard error of $\mu_T$?
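For concreteness, the proposed pooled standard error can be evaluated numerically. This is a sketch with made-up per-subject summaries, taking the inner sum over all $N$ subjects; whether this is an appropriate standard error for $\mu_T$ is what the answer below addresses:

```python
import numpy as np

# Hypothetical per-subject summaries: sample sizes n_i and SDs sigma_i.
n = np.array([3, 4, 2])
sigma = np.array([0.25, 0.21, 0.30])

# Pooled variance: sum((n_i - 1) * sigma_i^2) / sum(n_i - 1)
pooled_var = np.sum((n - 1) * sigma**2) / np.sum(n - 1)

# The proposed standard error: sqrt(pooled variance * sum over i of 1/n_i)
se_pooled = np.sqrt(pooled_var * np.sum(1.0 / n))
print(se_pooled)
```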

Question 2:

Once the standard error of $\mu_T$ is found, is it appropriate to use a t-test to compare if $\mu_T$ and $y$ are statistically different? What would be the d.o.f. of the test statistic?

Best Answer

Possibly the simplest way to accomplish what you want is to set up the problem as a linear model, where each observation is the difference $(x_{ij} - y)$ and the individual subjects are taken as independent variables, perhaps better as a random effect. You then test whether the intercept of the linear model is different from zero, which is usually a direct output from standard statistical software. This essentially takes into account, in a reasonably standard way, all of your concerns about numbers of observations, differences among individuals, and so on, and it might also provide useful estimates of the differences among individuals, particularly if the variances can be assumed the same among individuals.
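One way to set this up (a sketch, not the only implementation) is with `statsmodels`' mixed linear model: model the per-measurement differences with an intercept-only formula and a random intercept per subject, then read off the test of the fixed intercept against zero. All data and names here are made up for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
y = 5.0  # reference value to test against

# Hypothetical long-format data: one row per measurement x_ij.
rows = []
for subject, (n_i, shift) in enumerate([(3, 0.1), (4, 0.6), (2, 0.0)]):
    for x in rng.normal(y + shift, 0.25, size=n_i):
        rows.append({"subject": subject, "d": x - y})  # difference x_ij - y
data = pd.DataFrame(rows)

# Intercept-only model with a random intercept per subject; the test of the
# fixed intercept against zero is the test of mu_T against y.
model = smf.mixedlm("d ~ 1", data, groups=data["subject"])
fit = model.fit()
print(fit.summary())  # the Intercept row gives estimate, SE, z, and p-value
```

With only a handful of subjects the random-effect variance is poorly estimated (expect convergence warnings on data this small); the point of the sketch is the model structure, not the numbers.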

To answer the question as you posed it, I've rewritten your formula with hats for the estimated values so that we don't confuse them with population values.

$$\hat\mu_T = \frac{\sum_i^N n_i \hat\mu_i}{\sum_i^N n_i}.$$

If the observations are independent, then the basic properties of variances give the variance of your estimate $\hat\mu_T$ as:

$$\frac{\sum_i^N n_i^2 Var(\hat\mu_i)}{(\sum_i^N n_i)^2} = \frac{\sum_i^N n_i^2 (\frac{s_i^2}{n_i})}{(\sum_i^N n_i)^2} = \frac{\sum_i^N n_i s_i^2}{(\sum_i^N n_i)^2}.$$

This is based on the formula for the square of the standard error of the mean $$Var(\hat\mu_i) = \frac{s_i^2}{n_i},$$

where $s_i^2$ is the unbiased sample estimate of the variance. The square root $$\sqrt{\frac{\sum_i^N n_i s_i^2}{(\sum_i^N n_i)^2}}=\frac{\sqrt{\sum_i^N n_i s_i^2}}{\sum_i^N n_i}$$

gives you an estimate of the standard error of $\hat\mu_T$. This seems a bit different from your formula. I think that, out of the total number of observations, you have used up one degree of freedom for each of the $\hat\mu_i$ when doing it this way, which is somewhat analogous to treating the individuals as fixed effects in a linear model. But I get confused sometimes when I try to reason through degrees of freedom, which is one of many reasons why I try to use well-vetted statistical routines whenever possible.
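Putting this standard error together with Question 2, here is a sketch on made-up data; the choice $\mathrm{df} = \sum_i n_i - N$ follows the fixed-effects reasoning above, and as noted it is the uncertain part:

```python
import numpy as np
from scipy import stats

# Hypothetical per-subject data (same structure as in the question).
samples = [
    np.array([4.8, 5.1, 5.3]),
    np.array([5.6, 5.4, 5.9, 5.7]),
    np.array([4.9, 5.2]),
]
y = 5.0  # value to compare against

n = np.array([len(s) for s in samples])
mu = np.array([s.mean() for s in samples])
s2 = np.array([s.var(ddof=1) for s in samples])  # unbiased s_i^2

mu_T = np.sum(n * mu) / np.sum(n)

# SE of mu_T from the answer: sqrt(sum(n_i * s_i^2)) / sum(n_i)
se_T = np.sqrt(np.sum(n * s2)) / np.sum(n)

# t statistic; df = sum(n_i) - N, spending one df per subject mean
# (this df choice is the part the answer flags as uncertain).
t = (mu_T - y) / se_T
df = np.sum(n) - len(samples)
p = 2 * stats.t.sf(abs(t), df)
print(t, df, p)
```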
