In a 2 x 2 design, it is fairly easy to run a bootstrap test of the interaction. Let us call the four cells a, b, c and d: $a$ and $b$ are the two levels of the question factor (Q1 and Q2) in the treatment condition, and $c$ and $d$ are the same two levels in the control condition. The mean interaction contrast (MIC), computed on the four cell means, is given by
$ (a + d) - (b + c).$
It quantifies the amount of non-additivity in the dataset. If MIC is zero, there may be a main effect of question (an increase or decrease from Q1 to Q2) and a main effect of condition (a constant difference between treatment and control), but there is no interaction. In that case, the mean in $b$ is a few points above the mean in $a$, and the mean in $d$ is the same number of points above the mean in $c$. The same holds across conditions: the control means differ from the treatment means by the same amount at Q1 and at Q2. Defining the first increment (Q1 to Q2) as $d_1$ and the second (treatment to control) as $d_2$, either of which may be negative, the means are thus
$
\left(\begin{matrix}M_a \;\;\; M_b \\ M_c\;\;\;M_d \end{matrix}\right) =
\left(\begin{matrix}M_a \;\;\;M_a+d_1 \\ M_c \;\;\; M_c+d_1 \end{matrix}\right) =
\left(\begin{matrix}M_a \;\;\; M_a+d_1 \\ M_a+d_2\;\;\;M_a+d_2+d_1 \end{matrix}\right)
$
so that
$MIC = (M_a+(M_a+d_2+d_1)) - ((M_a+d_1)+(M_a+d_2)) = 0$.
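As a quick numerical check of this cancellation, here is a minimal sketch in Python. The function name `mic` and all the scores are made up for illustration: the cells are built with $M_a = 10$, $d_1 = 3$ and $d_2 = 2$, so the contrast should come out as zero, and shifting one cell should break the additivity.

```python
from statistics import mean

def mic(a, b, c, d):
    """Mean interaction contrast (M_a + M_d) - (M_b + M_c) from four lists of scores."""
    return (mean(a) + mean(d)) - (mean(b) + mean(c))

# Purely additive cell means (made-up numbers): M_a = 10, d_1 = 3, d_2 = 2.
a = [9, 10, 11]    # mean 10 = M_a
b = [12, 13, 14]   # mean 13 = M_a + d_1
c = [11, 12, 13]   # mean 12 = M_a + d_2
d = [14, 15, 16]   # mean 15 = M_a + d_2 + d_1
additive_mic = mic(a, b, c, d)             # (10 + 15) - (13 + 12) = 0

# Adding 4 points to every score in cell d breaks additivity.
d_shifted = [x + 4 for x in d]
nonadditive_mic = mic(a, b, c, d_shifted)  # (10 + 19) - (13 + 12) = 4
```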
Thus, to get a bootstrap estimate, resample the scores within each of the four groups with replacement (keeping each group at its original size), and compute MIC on the resampled data. Repeat this a very large number of times (say 5,000). Finally, find the range in which 95% of the resulting MIC values are located. If this interval includes 0, then the interaction is not significantly different from zero at the 5% level.
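The procedure just described can be sketched as follows for a fully between-group design. This is only an illustration using the Python standard library: the function names, the simple percentile rule used for the interval, and the example scores are all assumptions, not part of the original description.

```python
import random
from statistics import mean

def mic(a, b, c, d):
    """Mean interaction contrast (M_a + M_d) - (M_b + M_c)."""
    return (mean(a) + mean(d)) - (mean(b) + mean(c))

def bootstrap_mic_ci(a, b, c, d, n_boot=5000, level=0.95, seed=None):
    """Percentile bootstrap interval for MIC, resampling each group independently."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        # Resample every group with replacement, keeping its original size.
        resampled = [rng.choices(g, k=len(g)) for g in (a, b, c, d)]
        stats.append(mic(*resampled))
    stats.sort()
    tail = (1 - level) / 2
    lo = stats[round(tail * (n_boot - 1))]
    hi = stats[round((1 - tail) * (n_boot - 1))]
    return lo, hi

# Made-up scores with additive cell means (10, 13, 12, 15).
a = [9, 10, 11, 9, 10, 11]
b = [12, 13, 14, 12, 13, 14]
c = [11, 12, 13, 11, 12, 13]
d = [14, 15, 16, 14, 15, 16]
ci_additive = bootstrap_mic_ci(a, b, c, d, seed=1)

# Same data with a large interaction: cell d is shifted up by 20 points.
d_big = [x + 20 for x in d]
ci_interaction = bootstrap_mic_ci(a, b, c, d_big, seed=1)
```

With the additive scores the interval should straddle zero, whereas with the shifted cell it lies entirely above zero, which is the situation in which the interaction would be declared significant.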
This reasoning works for a fully between-group design. In a mixed design, you have to resample subjects rather than individual scores: draw each subject's pair of scores together (preserving the subject's two measures) before computing MIC.
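For the mixed case, a minimal sketch could look like the following. It assumes that the question factor is within-subject and the condition factor is between-subject, so each treatment subject contributes an $(a, b)$ pair and each control subject a $(c, d)$ pair. The function names and scores are hypothetical.

```python
import random
from statistics import mean

def mic_from_pairs(treatment, control):
    """MIC from paired scores: treatment holds (a, b) pairs, control holds (c, d) pairs."""
    m_a = mean(p[0] for p in treatment)
    m_b = mean(p[1] for p in treatment)
    m_c = mean(p[0] for p in control)
    m_d = mean(p[1] for p in control)
    return (m_a + m_d) - (m_b + m_c)

def bootstrap_mic_ci_mixed(treatment, control, n_boot=5000, level=0.95, seed=None):
    """Percentile bootstrap interval for MIC, resampling whole subjects."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        # Each draw is a subject, so that subject's two measures stay together.
        t = rng.choices(treatment, k=len(treatment))
        c = rng.choices(control, k=len(control))
        stats.append(mic_from_pairs(t, c))
    stats.sort()
    tail = (1 - level) / 2
    lo = stats[round(tail * (n_boot - 1))]
    hi = stats[round((1 - tail) * (n_boot - 1))]
    return lo, hi

# Made-up paired scores (Q1, Q2) per subject, with a large interaction.
treatment = [(9, 12), (10, 13), (11, 14), (10, 12), (9, 13), (11, 14)]
control = [(11, 34), (12, 35), (13, 36), (12, 34), (11, 35), (13, 36)]
ci_mixed = bootstrap_mic_ci_mixed(treatment, control, seed=1)
```

The only change from the between-group version is the unit being resampled: whole subjects instead of single scores.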
You do not need to run 24 separate normality tests, since the assumption to be met in ANOVA is normality of the model's residuals, not normality of the dependent variable within every combination of factor levels.
This misunderstanding arises because normality of the residuals follows from normality of the dependent variable within every factor combination. The latter is, however, a stricter assumption: the converse holds only under the additional assumption of homogeneity of variance across all factor combinations. It is also worth noting that SPSS does not offer the option to save residuals in the simple one-way ANOVA (Analyze > Compare Means > One-Way ANOVA), probably on the assumption that with a single factor it is easy to test normality of the dependent variable at each level of the factor (with the Explore procedure, for example).
So, you should compute and save the residuals, and then check them for normality with a single test of your choice.
Beware that the residuals can only be computed once the model has been fitted, so you need to find out how to save them in a separate variable. (In SPSS, press the Save button in the main dialog of the ANOVA procedure you are using; in R, the residuals are stored in the object returned by every lm call and can be extracted with residuals().)
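To illustrate what "residuals of the model" means here, the sketch below computes them by hand in Python. It rests on a simplifying assumption: a fully between-subject design fitted with the full factorial model, where each fitted value is simply the mean of its cell. The function name, the cell labels and the scores are hypothetical, and designs with within-subject factors need a different residual computation.

```python
from statistics import mean

def cell_residuals(scores_by_cell):
    """Pooled residuals of a cell-means model.

    scores_by_cell maps each factor-level combination to its list of scores;
    every residual is a score minus the mean of its own cell.
    """
    residuals = []
    for scores in scores_by_cell.values():
        cell_mean = mean(scores)
        residuals.extend(x - cell_mean for x in scores)
    return residuals

# Made-up data for two of the cells of a factorial design.
data = {
    ("treatment", "Q1"): [1, 2, 3],
    ("treatment", "Q2"): [4, 6, 8],
}
residuals = cell_residuals(data)   # [-1, 0, 1, -2, 0, 2]

# The single pooled vector is what goes into one normality test,
# e.g. scipy.stats.shapiro(residuals) if SciPy is available (an assumption here).
```

However many cells the design has, this produces one vector, so a single test replaces the per-cell tests.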
Hope this helps you.
Best Answer
One of the standard techniques in such situations is due to Brunner and Langer [1]. These non-parametric mixed-effects models can deal with multiple within-subject factors and some between-subject factors. In some fields (e.g. dental medicine), they are very popular. In R, they are implemented in the nparLD package.

[1] Brunner E, Domhof S, Langer F (2002). Nonparametric Analysis of Longitudinal Data in Factorial Experiments. John Wiley & Sons, New York.