Why do we calculate pooled standard deviations by using variances?

pooling, standard deviation, t-test, variance

Why do we calculate the pooled standard deviation by averaging the variances and taking the square root, rather than averaging the standard deviations directly?

Edit: this came up in the context of creating an effect size for a paired samples t-test, but if the answer varies across contexts I am interested to learn about that as well.

Best Answer

We work with variances rather than standard deviations because variances have special properties.

In particular, variances of sums and differences of variables have a simple form, and if the variables are independent, the result is even simpler.

That is, if two variables are independent, the variance of the difference is the sum of the variances ("variances add" -- but standard deviations don't).
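As a quick illustration of "variances add", here is a minimal Python sketch. The function names and the choice of standard deviations 3 and 4 are made up for the example, and the simulation assumes independent normal draws purely for convenience.

```python
import math
import random
import statistics


def sd_of_difference(sd_x, sd_y):
    """SD of X - Y for independent X and Y: add the variances, then take the root."""
    return math.sqrt(sd_x ** 2 + sd_y ** 2)


def simulated_variance_of_difference(sd_x, sd_y, n=20000, seed=0):
    """Sample variance of X - Y from n independent normal draws (illustrative check)."""
    rng = random.Random(seed)
    diffs = [rng.gauss(0.0, sd_x) - rng.gauss(0.0, sd_y) for _ in range(n)]
    return statistics.variance(diffs)


# Variances 9 and 16 add to 25, so the SD of the difference is 5, not 3 + 4 = 7.
print(sd_of_difference(3.0, 4.0))
# The simulated variance should land near 25 (it will not be exact).
print(simulated_variance_of_difference(3.0, 4.0))
```

Adding the standard deviations directly would give 7, which overstates the spread of the difference.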

Specifically, in, say, a two-sample t-test, we're trying to find the standard deviation of the difference in sample means. We can use basic properties of variance to see that the variance of each individual sample mean is $\sigma^2/n$ (for independent observations with common variance $\sigma^2$), which we can estimate by $s^2/n$ for each sample.
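The $\sigma^2/n$ result can be worked out in one line, assuming the $n$ observations $X_1,\dots,X_n$ in a sample are independent with common variance $\sigma^2$ (the $X_i$ notation is introduced here only for illustration). Constants come out of a variance squared, and the variances of independent terms add:

$$\text{Var}(\bar{X})=\text{Var}\left(\frac{1}{n}\sum_{i=1}^{n}X_i\right)=\frac{1}{n^2}\sum_{i=1}^{n}\text{Var}(X_i)=\frac{n\sigma^2}{n^2}=\frac{\sigma^2}{n}$$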

Now that we have the variance of each of the means, we can use the "variances add" result to get that the variance of the difference of the means is the sum of the two variances of the sample means. So the standard deviation of the distribution of the difference in means (the standard error of the difference in means) is the square root of that sum.

This works quite directly for the Welch t-test, where we estimate $\text{Var}(\bar{X}-\bar{Y})$ by $s_x^2/n_x+s_y^2/n_y$.

The equal-variance version uses the same idea, but because the variances are assumed identical, we produce a single overall estimate of $\sigma^2$ from both samples. That is, we add together all the squared deviations from the corresponding group mean, then divide by the total d.f. from the two groups (each group loses 1 d.f. because we measure deviations from its own mean). This corresponds to a d.f.-weighted average of the individual variances, $s^2_p=w_xs^2_x+w_ys^2_y$, where $w_x=\text{df}_x/(\text{df}_x+\text{df}_y)$ and $w_y=\text{df}_y/(\text{df}_x+\text{df}_y)$, with $\text{df}_x=n_x-1$ and $\text{df}_y=n_y-1$. That single pooled estimate $s^2_p$ is then used to estimate the variance of the difference in means. Since $\text{Var}(\bar{X})=\sigma^2/n_x$ and $\text{Var}(\bar{Y})=\sigma^2/n_y$, and the variance of the difference is again the sum of the variances, $\text{Var}(\bar{X}-\bar{Y})=\sigma^2/n_x+\sigma^2/n_y$, which we estimate by replacing $\sigma^2$ with $s^2_p$.
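To make the two descriptions of the pooled variance concrete, here is a small Python sketch using only the standard library. The function names and the two tiny samples are invented for illustration. It checks that "sum the squared deviations and divide by the total d.f." agrees with the d.f.-weighted average of the sample variances, then forms both standard errors.

```python
import math
import statistics


def pooled_variance(x, y):
    """Sum squared deviations from each group's own mean, divide by total d.f."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    ss_x = sum((v - mean_x) ** 2 for v in x)
    ss_y = sum((v - mean_y) ** 2 for v in y)
    return (ss_x + ss_y) / ((len(x) - 1) + (len(y) - 1))


def pooled_variance_weighted(x, y):
    """Same quantity written as a d.f.-weighted average of the sample variances."""
    df_x, df_y = len(x) - 1, len(y) - 1
    w_x = df_x / (df_x + df_y)
    w_y = df_y / (df_x + df_y)
    return w_x * statistics.variance(x) + w_y * statistics.variance(y)


def se_pooled(x, y):
    """Standard error of the difference in means under the equal-variance assumption."""
    return math.sqrt(pooled_variance(x, y) * (1 / len(x) + 1 / len(y)))


def se_welch(x, y):
    """Standard error of the difference in means without assuming equal variances."""
    return math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))


# Made-up data: sample variances are 2.5 (4 d.f.) and 4.0 (2 d.f.).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0]

print(pooled_variance(x, y))           # (10 + 8) / 6
print(pooled_variance_weighted(x, y))  # (4/6) * 2.5 + (2/6) * 4.0
print(se_pooled(x, y), se_welch(x, y))
```

With these numbers, averaging the two standard deviations and squaring the result does not reproduce the pooled variance, which is the point of the original question.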

In either case, we can standardize our difference in means by dividing by the corresponding estimate of standard error. In both cases this is where the denominator of the $t$-statistic comes from.
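A minimal sketch of that standardization in Python follows. The function `t_statistic` and its `equal_var` flag are names chosen here for illustration, and the data are made up. The only difference between the two versions is which standard error goes in the denominator.

```python
import math
import statistics


def t_statistic(x, y, equal_var=True):
    """Difference in sample means divided by its estimated standard error."""
    n_x, n_y = len(x), len(y)
    var_x, var_y = statistics.variance(x), statistics.variance(y)
    if equal_var:
        # Pooled variance: d.f.-weighted average of the two sample variances.
        pooled = ((n_x - 1) * var_x + (n_y - 1) * var_y) / (n_x + n_y - 2)
        se = math.sqrt(pooled * (1 / n_x + 1 / n_y))
    else:
        # Welch: each sample contributes its own s^2 / n.
        se = math.sqrt(var_x / n_x + var_y / n_y)
    return (statistics.mean(x) - statistics.mean(y)) / se


# Made-up samples with means 3 and 4.
sample_x = [1.0, 2.0, 3.0, 4.0, 5.0]
sample_y = [2.0, 4.0, 6.0]

print(t_statistic(sample_x, sample_y, equal_var=True))
print(t_statistic(sample_x, sample_y, equal_var=False))
```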

Similar results come up in other cases.