Z-Score – Understanding Why the Absolute Mean Z-Score Value of Two Groups is Always the Same

I have 50 data points each with 7 variables.

              v1  v2  v3  v4  v5  v6  v7   
data point 1
data point 2
...
data point 50

These 50 data points are divided into two groups, with label a or b. Each group has 25 data points. I need to compute a final score using the 7 variables for each data point.

Since these 7 variables are not on the same scale, I standardize each of them as a z-score. That is, for each variable separately I compute the mean and standard deviation over all 50 data points and then convert the values to z-scores. I then compute the final score in the following way:

score for data point 1 = v1 + v2 + (v3 - v4) - v5 - v6 - v7 (all v* are z-scores)
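For concreteness, the computation described above might look like the following minimal sketch. The data are synthetic and the names `X`, `Z`, `signs` and `score` are made up for illustration; the population standard deviation (NumPy's default) is an assumption, since the question does not say which one is used.

```python
import numpy as np

# Synthetic stand-in for the real data: 50 data points, 7 variables on different scales.
rng = np.random.default_rng(0)
X = rng.normal(loc=[10, 200, 3, 50, 0.5, 7, 1000],
               scale=[2, 30, 1, 5, 0.1, 3, 100],
               size=(50, 7))

# z-score each variable (column) using all 50 data points.
# Assumption: population standard deviation (ddof=0).
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# score = v1 + v2 + (v3 - v4) - v5 - v6 - v7, written as signed weights.
signs = np.array([1, 1, 1, -1, -1, -1, -1])
score = Z @ signs  # one score per data point
```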

So here is my problem. After computing the scores for all data points, I compute the mean score for each of the two groups (I want to know the difference between the two groups). The two means always have exactly the same magnitude but opposite signs. For example:

[group=a] mean of scores: 2.16
[group=b] mean of scores: -2.16

I'm pretty sure I'm missing something, since it seems unusual that the two group means always have the same absolute value with opposite signs. I don't know why this happens.

Best Answer

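Each z-score has mean zero over all 50 data points, so any sum or difference of z-scores, including your summary score, also has mean zero by construction. The issue is therefore what happens when you compare the two groups.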
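To spell out the "by construction" step: write $z_{ij}$ for the z-score of variable $j$ on data point $i$ and $c_j = \pm 1$ for the sign with which variable $j$ enters the score (this notation is introduced here for illustration and is not from the question). Then, with $n = 50$,

$$
\frac{1}{n}\sum_{i=1}^{n} \text{score}_i
= \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{7} c_j z_{ij}
= \sum_{j=1}^{7} c_j \left(\frac{1}{n}\sum_{i=1}^{n} z_{ij}\right)
= \sum_{j=1}^{7} c_j \cdot 0
= 0 .
$$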

The short answer is that this is not always true. It holds whenever the two subsamples contain the same number of observations, which in turn requires the total number of observations $n$ to be even.

Consider any standardized variable, which has mean zero. It follows that the sum of its values is also zero. Now consider any split whatsoever into two groups. If the sum of standardized values over group 1 is $S$, then the sum over group 2 must be $-S$. If the groups are of equal size, then the means are also equal in absolute value, as the means are $S / (n / 2)$ and $-S / (n / 2)$, that is, $2S/n$ and $-2S/n$.
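The argument above can be checked numerically. The following is a small sketch on made-up data: `score` is just a random variable centered to mean zero (standing in for the summed z-scores), and the labels are a random 25/25 split.

```python
import numpy as np

rng = np.random.default_rng(1)

# Any variable with mean zero will do; here, centered random numbers.
score = rng.normal(size=50)
score = score - score.mean()

# A random split into two groups of 25, labelled a and b.
labels = np.array(["a"] * 25 + ["b"] * 25)
rng.shuffle(labels)

sum_a = score[labels == "a"].sum()   # S
sum_b = score[labels == "b"].sum()   # -S, up to floating-point error
mean_a = score[labels == "a"].mean() # S / 25
mean_b = score[labels == "b"].mean() # -S / 25

print(sum_a, sum_b)
print(mean_a, mean_b)
```

Changing the seed changes $S$, but the two means should stay equal in magnitude and opposite in sign for every equal-sized split.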

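To see for yourself that this depends on equal group sizes, consider as a counter-example the values 1 2 3 4 5, for which no equal split is possible. After standardizing, a split into subsamples of sizes $n_1$ and $n_2$ still gives sums $S$ and $-S$, but the means are $S/n_1$ and $-S/n_2$, which differ in absolute value unless $S = 0$. For example, splitting into {1, 2} and {3, 4, 5} gives means of different magnitude.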
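A quick sketch of this counter-example, assuming the split {1, 2} versus {3, 4, 5} and the population standard deviation (both choices are only for illustration):

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
z = (x - x.mean()) / x.std()  # standardized values, mean zero

g1, g2 = z[:2], z[2:]         # unequal group sizes: 2 and 3

print(g1.sum(), g2.sum())     # sums still cancel: S and -S
print(g1.mean(), g2.mean())   # means are S/2 and -S/3, so magnitudes differ
```

Here the sums cancel, but the group means come out at roughly -1.06 and 0.71, so they are no longer mirror images of each other.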
