Study Design: I showed participants some information about sea-level rise, focusing the information in different ways, both in terms of the time-scale and the magnitude of potential rise. Thus I had a 2 (Time: 2050 or 2100) by 2 (Magnitude: Medium or High) design. There were also two control groups who received no information, only answering the questions for my DVs.
Questions:
I've always checked for normality within cells; for the 2×2 portion of this design, that would mean looking for normality within 4 groups. However, reading some discussions here has made me second-guess my methods.
First, I've read that I should be looking at the normality of the residuals. How can I check for normality of residuals (in SPSS or elsewhere)? Do I have to do this for each of the 4 groups (6 including the controls)?
I also read that normality within groups implies normality of the residuals. Is this true? (Literature references?) Again, does this mean looking at each of the 4 cells separately?
In short, what steps would you take to determine whether your (2×2) data violate the assumption of normality?
References are always appreciated, even if just to point me in the right direction.
Best Answer
Most statistics packages have ways of saving residuals from your model. Using GLM - UNIVARIATE in SPSS you can save residuals. This will add a variable to your data file representing the residual for each observation.
Once you have your residuals you can examine them to see whether they are normally distributed, homoscedastic, and so on. For example, you could run a formal normality test on the residual variable or, perhaps more appropriately, plot the residuals to check for any major departures from normality. To examine homoscedasticity, you could plot the residuals by group and compare their spread.
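If you are working outside SPSS, the same steps can be sketched in a few lines. This is only a rough illustration under stated assumptions: the data are simulated, the names `cell_residuals`, `qq_points`, `spread_by_cell` and `assumed_means` are made up for the example, and the residuals are those of a full factorial model (both main effects plus the interaction), where the fitted value is simply the cell mean.

```python
import random
from collections import defaultdict
from statistics import NormalDist, mean, stdev


def cell_residuals(rows):
    """Return (cell, residual) pairs, where residual = score - cell mean.

    rows: iterable of (time, magnitude, score) tuples.
    Assumes a full factorial model, so the fitted value is the cell mean.
    """
    rows = list(rows)
    scores_by_cell = defaultdict(list)
    for time, magnitude, score in rows:
        scores_by_cell[(time, magnitude)].append(score)
    cell_means = {cell: mean(s) for cell, s in scores_by_cell.items()}
    return [((t, m), y - cell_means[(t, m)]) for t, m, y in rows]


def qq_points(residuals):
    """Pair each sorted residual with a standard normal quantile (Q-Q plot data)."""
    n = len(residuals)
    std_normal = NormalDist()
    # (i + 0.5) / n is one common choice of plotting position.
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, sorted(residuals)))


def spread_by_cell(labelled_residuals):
    """Standard deviation of the residuals in each cell (homoscedasticity check)."""
    by_cell = defaultdict(list)
    for cell, residual in labelled_residuals:
        by_cell[cell].append(residual)
    return {cell: stdev(rs) for cell, rs in by_cell.items()}


# Hypothetical data: 30 simulated scores per cell, not real study results.
random.seed(1)
assumed_means = {
    ("2050", "Medium"): 3.0,
    ("2050", "High"): 3.5,
    ("2100", "Medium"): 4.0,
    ("2100", "High"): 5.0,
}
rows = [
    (time, magnitude, random.gauss(mu, 1.0))
    for (time, magnitude), mu in assumed_means.items()
    for _ in range(30)
]

labelled = cell_residuals(rows)
residuals = [r for _, r in labelled]   # one pooled residual variable
points = qq_points(residuals)          # plot these: x = theoretical, y = observed
spreads = spread_by_cell(labelled)     # compare these across cells

for cell, sd in spreads.items():
    print(cell, round(sd, 2))
```

If the residuals are roughly normal, the `points` pairs should fall close to a straight line when plotted. For a formal test you could pass `residuals` to something like `scipy.stats.shapiro`, although with large samples such tests tend to flag departures that are too small to matter.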
For a basic between-subjects factorial ANOVA, where homogeneity of variance holds, normality within cells means normality of residuals, because the ANOVA model simply predicts the group means. Thus, each residual is just the difference between an observation and the mean of its group.
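To spell the argument out in symbols (the notation here is mine, chosen for illustration: $y_{ijk}$ is observation $k$ in the cell for Time level $i$ and Magnitude level $j$, and $\sigma^2$ is the common within-cell variance assumed by homogeneity of variance):

```latex
\begin{align*}
  y_{ijk} &= \mu_{ij} + \varepsilon_{ijk}
      && \text{(cell-means form of the full factorial model)} \\
  \hat{\mu}_{ij} &= \bar{y}_{ij\cdot}
      && \text{(the fitted value is the cell mean)} \\
  e_{ijk} &= y_{ijk} - \bar{y}_{ij\cdot}
      && \text{(residual: observation minus its cell mean)}
\end{align*}
% If y_{ijk} \sim N(\mu_{ij}, \sigma^2) in every cell with the same \sigma^2,
% then \varepsilon_{ijk} = y_{ijk} - \mu_{ij} \sim N(0, \sigma^2) in every cell,
% so the errors pooled across cells share a single normal distribution.
```

Subtracting the cell mean only shifts each cell to be centred on zero, so the shape within each cell is unchanged. This is why, with equal variances, the pooled residuals can be checked as a single variable rather than cell by cell.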
Response to comments below: