Solved – Is it appropriate to do a t-test with z-scores

t-test, z-score

I would like to test for significance of some sample data. However, I only have the z-scores of individual observations. Is it appropriate to make a claim about the sample as a whole by doing a t-test on the individual z-scores? Are there any additional assumptions involved in testing significance when z-scores are used instead of raw data?

Based on the comments, it seems that a z-test, not a t-test, is the test I need. That struck me as strange, because I had read that a z-test does not depend on the size of the sample, so how could I make a claim about the sample as a whole without considering the sample size (n)? After further reading, I think I can evaluate the null hypothesis that the sample data do not differ from the population data using the method below:

Z = mean(individual z-scores) * sqrt(n)

Then use this Z in the z-table to evaluate the corresponding p-value.
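As a rough illustration of the method proposed above, here is a minimal Python sketch that replaces the z-table lookup with the standard normal CDF. The function name `z_test_from_z_scores` and the sample values are made up for illustration, and the sketch assumes the individual z-scores are independent and standard normal under the null hypothesis.

```python
import math
from statistics import NormalDist, fmean

def z_test_from_z_scores(z_scores):
    """Return (Z, two-sided p-value) for Z = mean(z-scores) * sqrt(n).

    Assumes the z-scores are independent and N(0, 1) under the null.
    """
    n = len(z_scores)
    z_stat = fmean(z_scores) * math.sqrt(n)
    # Two-sided p-value from the standard normal distribution.
    p_two_sided = 2.0 * (1.0 - NormalDist().cdf(abs(z_stat)))
    return z_stat, p_two_sided

# Hypothetical z-scores, not real data.
z_scores = [0.5, 1.2, -0.3, 0.8, 1.5, 0.1, 0.9, -0.2, 1.1]
z_stat, p_value = z_test_from_z_scores(z_scores)
print(f"Z = {z_stat:.3f}, two-sided p = {p_value:.3f}")
```

The sample size enters through the `sqrt(n)` factor, which is how the test accounts for n even though each individual z-score does not.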

I realize the typical way would be to use the raw data, but I cannot evaluate the raw data, and must use the individual z-scores.

Best Answer

If you can assume independence and approximate normality of observed values around their own population curve, you could construct an asymptotic chi-squared statistic to test for any form of deviation from the WHO charts by summing the squared $Z$ values. Since the parameters are all determined outside the sample, the chi-squared statistic should have $n$ degrees of freedom. [However, rejection could as easily imply that the variation about the model is under-estimated as that the location of the curve is wrong.]
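The chi-squared test described above might be sketched as follows. This uses only the standard library; the helper `chi2_sf` is a hand-written upper-tail probability for integer degrees of freedom (an assumption of this sketch, not something from the answer), and the z-scores are made up.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for chi-squared with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    # Start from Q(0, h) = 0 (even df) or Q(1/2, h) = erfc(sqrt(h)) (odd df),
    # then apply the recurrence Q(a + 1, h) = Q(a, h) + h^a e^(-h) / Gamma(a + 1).
    if df % 2 == 0:
        total, a = 0.0, 0.0
    else:
        total, a = math.erfc(math.sqrt(h)), 0.5
    while a < df / 2.0:
        total += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return min(total, 1.0)

def chi_squared_from_z(z_scores):
    """Sum of squared z-scores, its degrees of freedom (n), and the p-value."""
    stat = sum(z * z for z in z_scores)
    df = len(z_scores)  # no parameters are estimated from the sample
    return stat, df, chi2_sf(stat, df)

# Hypothetical z-scores, not real data.
z_scores = [0.5, 1.2, -0.3, 0.8, 1.5, 0.1, 0.9, -0.2, 1.1]
stat, df, p_value = chi_squared_from_z(z_scores)
print(f"chi-squared = {stat:.2f} on {df} df, p = {p_value:.3f}")
```

If SciPy is available, `scipy.stats.chi2.sf(stat, df)` should give the same tail probability without the hand-written helper.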

If you want a test for a directional shift (an overall tendency to be larger or an overall tendency to be smaller), you could instead sum the $Z$ values and (under the same assumptions) compare the sum with a $N(0,n)$ distribution, that is, a normal with mean 0 and variance $n$.
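This sum-of-$Z$ test appears to be the same test as the statistic proposed in the question, `Z = mean(individual z-scores) * sqrt(n)`. A short derivation, assuming independent $Z_i$ that are each standard normal under the null, and writing $\bar{Z}$ for the sample mean of the z-scores (notation introduced here):

$$
\sum_{i=1}^{n} Z_i \sim N(0, n)
\quad\Longrightarrow\quad
\frac{\sum_{i=1}^{n} Z_i}{\sqrt{n}} = \sqrt{n}\,\bar{Z} \sim N(0, 1).
$$

Dividing the sum by its standard deviation $\sqrt{n}$ gives a statistic that can be referred directly to a standard normal table.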

[It would also be possible to test for general deviations by applying a goodness-of-fit test for a standard normal to the set of $Z$ values, but that would, naturally, tend to be much more directly sensitive to the assumption of normality. I wouldn't advise this approach.]

If you instead assume symmetry and independence, you could test directional shift with a sign test and consistency more broadly with a runs test.
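One way these two tests might look in code is sketched below. The sketch makes several assumptions not stated in the answer: the sign test is the exact two-sided binomial version with exact zeros dropped, the runs test is the Wald-Wolfowitz test on the signs of the z-scores with a normal approximation, the observations are in a meaningful order (for example by age), and both signs occur. The function names and data are made up.

```python
import math
from statistics import NormalDist

def sign_test(z_scores):
    """Exact two-sided sign test of H0: P(Z > 0) = 1/2 (exact zeros dropped)."""
    pos = sum(1 for z in z_scores if z > 0)
    neg = sum(1 for z in z_scores if z < 0)
    n = pos + neg
    k = min(pos, neg)
    # Binomial(n, 1/2) probability of k or fewer of the rarer sign, doubled.
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return pos, neg, min(1.0, 2.0 * tail)

def runs_stats(z_scores):
    """Observed number of sign runs with its null mean and variance."""
    signs = [z > 0 for z in z_scores if z != 0]
    n1 = sum(signs)
    n2 = len(signs) - n1
    n = n1 + n2
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    mean = 2.0 * n1 * n2 / n + 1.0
    var = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n ** 2 * (n - 1.0))
    return runs, mean, var

def runs_test(z_scores):
    """Normal-approximation runs test; returns (z, two-sided p-value)."""
    runs, mean, var = runs_stats(z_scores)
    z = (runs - mean) / math.sqrt(var)
    return z, 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# Hypothetical z-scores in observation order, not real data.
data = [0.5, 1.2, -0.3, 0.8, 1.5, 0.1, 0.9, -0.2, 1.1]
pos, neg, p_sign = sign_test(data)
z_runs, p_runs = runs_test(data)
print(f"sign test: {pos} positive, {neg} negative, p = {p_sign:.3f}")
print(f"runs test: z = {z_runs:.3f}, p = {p_runs:.3f}")
```

The series here is far too short for the normal approximation in the runs test to be trusted; it only shows the mechanics.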

Note that if your curves are obtained by following individuals through time, the assumption of independence might not be tenable; more suitable models would be needed for that.


"Combining males and females even though they would have different growth curves": this should be possible for the chi-squared test and the sum-of-Z test I mentioned. It should also work with the sign test I mentioned.

The runs test would not, however: you couldn't just jam the two series together; you'd need to combine the two test statistics. If the series are large enough to use the normal approximation, you could add the two run counts; under the null, the mean of the total number of runs should be the sum of the component means and its variance should be the sum of the component variances.
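A minimal sketch of that combination, under some assumptions of my own: each series is tested with a Wald-Wolfowitz runs test on the signs of its z-scores, the two series are independent of each other, and both signs occur in each series. The names `runs_moments` and `combined_runs_test` and the data are hypothetical.

```python
import math
from statistics import NormalDist

def runs_moments(z_scores):
    """Return (observed runs, null mean, null variance) for one series' signs."""
    signs = [z > 0 for z in z_scores if z != 0]  # exact zeros are dropped
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    n = n_pos + n_neg
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    mean = 2.0 * n_pos * n_neg / n + 1.0
    var = 2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n) / (n ** 2 * (n - 1.0))
    return runs, mean, var

def combined_runs_test(*series):
    """Add run counts, means and variances across series, then standardize."""
    parts = [runs_moments(s) for s in series]
    total_runs = sum(p[0] for p in parts)
    total_mean = sum(p[1] for p in parts)
    total_var = sum(p[2] for p in parts)  # valid if the series are independent
    z = (total_runs - total_mean) / math.sqrt(total_var)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical z-scores for each group, each in its own observation order.
males = [0.5, 1.2, -0.3, 0.8, 1.5, 0.1, 0.9, -0.2, 1.1]
females = [-0.4, 0.6, -1.0, 0.3, 0.7, -0.5]
z_combined, p_combined = combined_runs_test(males, females)
print(f"combined runs test: z = {z_combined:.3f}, p = {p_combined:.3f}")
```

Runs are counted within each series separately, so no artificial run boundary is created where the two series would otherwise be joined. These toy series are much too short for the normal approximation; real use would need larger samples.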
