Solved – Calculating standard deviations for aggregated means

Tags: mean, sample-size, standard deviation, variance

Example:
Say I have a normal distribution that contains responses from both male and female participants with an average score of 50 ± 2.

Let's then say I actually receive this information in the following way:
Men have an average score of x = 49 ± 3.
Women have an average score of y = 51 ± 4.
I can use this to calculate the combined mean of these values, z = 50 ± 5 (by summing the errors in quadrature: √(3² + 4²) = 5).

Question
How much information about the male and female distributions needs to be encoded in order to retain the original error? Put differently, how do I obtain a more realistic error on z, given that these are essentially parts of the same data? Mean, variance and sample size intuitively seem enough to accomplish this.

In the trivial case of two normal distributions, both with n entries and the same mean, the error would scale like:
\begin{align}
{\frac{1}{\sqrt{2}}}
\end{align}

I suppose I am looking for the general form of this scaling for the error.

Best Answer

I found the answer to this question here: How to 'sum' a standard deviation?

I've checked it by simulation and it works out fine. One can take the variance of the mixed distribution as the average of the variances of the two component distributions, and work out the standard error from there.
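For what it's worth, averaging the variances is the special case where both groups have the same size and the same mean. A rough sketch of the more general calculation from mean, standard deviation and sample size is below. It uses the standard identity that the combined variance is the size-weighted average of each group's variance plus the squared distance of its mean from the combined mean. Assumptions that are mine and not from the question: the ± values are read as population standard deviations (not standard errors), the function name `combine_groups` is made up for illustration, and the group sizes of 100 are hypothetical.

```python
import math

def combine_groups(groups):
    """Combine (mean, sd, n) summaries into one mean, SD and standard error.

    Assumes each sd is a population standard deviation (ddof=0).
    """
    total_n = sum(n for _, _, n in groups)
    # Size-weighted mean of the group means.
    mean = sum(m * n for m, _, n in groups) / total_n
    # Within-group variance plus the between-group spread of the means.
    var = sum(n * (sd ** 2 + (m - mean) ** 2) for m, sd, n in groups) / total_n
    sd = math.sqrt(var)
    # Standard error of the combined mean.
    sem = sd / math.sqrt(total_n)
    return mean, sd, sem

# Men: 49 ± 3, women: 51 ± 4, with hypothetical sample sizes of 100 each.
mean, sd, sem = combine_groups([(49.0, 3.0, 100), (51.0, 4.0, 100)])
print(mean, sd, sem)
```

With these made-up sizes the combined mean is 50 and the combined variance is 13.5, so the SD is about 3.67 rather than 5. When the two means are equal the extra term vanishes and this reduces to the average of the two variances.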