How to calculate the standard deviation and error for a difference between two different means

descriptive statistics, mean, standard deviation, standard error, variance

I have 40 people that I measure at baseline, getting their mean level of X at time zero. I also calculate the standard deviation and standard error of the mean of X.

Then after 100 days I measure their levels of X again, and again calculate a mean, standard deviation, and standard error. I lost five people to follow-up, so the N for this group is only 35.

     Time    N    MeanX    SD    SE
        0   40      6.9   5.2   0.8
      100   35      5.7   5.7   1.0

I am interested in the difference in mean levels of X between 100 days and baseline. So I can calculate this easily, as

5.7 - 6.9 = -1.2

My question: for this value, -1.2, I would also like to know its standard deviation and standard error. Could someone tell me how to do this? I've found a few possible formulas on the internet, including one from a similar question on this site. One example is squaring the standard deviations, dividing them by their n's, adding the results, and then taking the square root. But I am not one hundred percent sure this is what I want.

Best Answer

Go back to first principles: the variance of a difference is the sum of the variances minus twice the covariance.

Here the variances would be the squares of the standard errors of the means. The covariance would be between means at time zero and means at 100 days over repeated instances of the same exercise (40 cases at time 0 and fewer cases selected from the same individuals at 100 days). Without information on individual cases and the process that led to the loss of cases, I don't see a way to determine the covariance.
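Written out for the two sample means, the identity looks roughly like this. The notation is my own rather than the question's: $\bar X_0$ and $\bar X_{100}$ stand for the means at day 0 and day 100, and $\mathrm{SE}_0$, $\mathrm{SE}_{100}$ for their reported standard errors, whose squares serve as estimates of the variances of the means.

```latex
\operatorname{Var}(\bar X_{100} - \bar X_0)
  = \operatorname{Var}(\bar X_{100}) + \operatorname{Var}(\bar X_0)
    - 2\,\operatorname{Cov}(\bar X_{100}, \bar X_0)
  \approx \mathrm{SE}_{100}^2 + \mathrm{SE}_0^2
    - 2\,\operatorname{Cov}(\bar X_{100}, \bar X_0)
```

The standard error of the difference is the square root of this quantity; the covariance term is the piece that cannot be filled in from the summary table alone.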

If you assume zero covariance, the square root of the sum of the squares of the standard errors is sqrt(0.8^2 + 1.0^2) ≈ 1.28, versus a difference of -1.2 between the means. The difference is smaller in magnitude than its own standard error, so it would not be statistically significant. As @dsaxton and @Glen_b point out, you would typically expect a positive covariance in this situation, which would diminish the variance of the difference, but you don't have the data needed to estimate it. And as @MarkL.Stone points out, a negative covariance can't be ruled out a priori, which would instead increase the variance of the difference.
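As a minimal sketch of that arithmetic, the calculation can be reproduced in a few lines of Python using the rounded standard errors from the question's table. The function name `se_of_difference` and the covariance value 0.4 are purely illustrative assumptions; nothing in the question lets you estimate the real covariance.

```python
import math

# Summary statistics copied from the question's table (rounded as reported).
mean_0, se_0 = 6.9, 0.8        # baseline, N = 40
mean_100, se_100 = 5.7, 1.0    # day 100, N = 35


def se_of_difference(se_a, se_b, cov=0.0):
    """Standard error of (mean_a - mean_b) given the covariance of the two means.

    Hypothetical helper: Var(A - B) = Var(A) + Var(B) - 2 Cov(A, B),
    where each variance is the square of the corresponding standard error.
    """
    variance = se_a ** 2 + se_b ** 2 - 2.0 * cov
    if variance < 0:
        raise ValueError("covariance is too large for the given standard errors")
    return math.sqrt(variance)


diff = mean_100 - mean_0

# Zero-covariance assumption, as in the answer.
se_diff = se_of_difference(se_100, se_0)

# Made-up positive covariance, only to show the direction of the effect.
se_diff_pos_cov = se_of_difference(se_100, se_0, cov=0.4)

print(f"difference = {diff:.1f}")
print(f"SE (cov = 0)   = {se_diff:.2f}")
print(f"SE (cov = 0.4) = {se_diff_pos_cov:.2f}")
```

With zero covariance this gives the 1.28 quoted above; any positive covariance shrinks the standard error and any negative one inflates it, which is why the missing individual-level data matters.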