Is it better to remove higher-order interactions or the least significant terms first in model simplification?

interaction, mixed model, model selection

I have a mixed-effects model with 3 explanatory factors and a full set of interactions (including the 3-way interaction). This is the full model. Factor 1 is time, and I am interested in the change in the response variable over time. Therefore, only factors or interactions that involve time are biologically relevant.

My questions are:
1) Is it better to begin model simplification by removing the highest-order (3-way) interaction first, or by removing the least significant term first (which in this case is factor 2)? The 3-way interaction does actually have a relevant interpretation. I have seen both options proposed.

2) Is it okay to include interactions but remove their basic factors (i.e., include time x treatment when treatment alone is removed from the model)? Several of the basic factors make no sense when not interacting with time. For example, measuring the effect of treatment while excluding time doesn't make much sense, since starting conditions inside each treatment level are identical.

If my hypothesis is correct, the final model should include several factor interactions but should not include basic factor 2 or 3 (as they make no sense in the absence of the time factor).

Thank you for your help. I am happy to clarify if my question is ambiguous.

Best Answer

Except in very unusual cases, you should not remove an effect and leave in any effects that contain it. So if you find that the $AB$ interaction is significant, then don't throw out the main effects of $A$ or $B$. Similarly, don't keep $ABC$ and throw out $AB$, or $B$, or any other main effect or two-way interaction involving those three factors.

So, I recommend reading the ANOVA table from the bottom up. You can throw out stuff at the bottom to simplify the model, and work upward, keeping this effect hierarchy in mind.
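As a rough illustration of that bottom-up rule, here is a minimal Python sketch that lists which terms may be dropped at each step without breaking the effect hierarchy. The factor names (`time`, `treatment`, `site`), the colon notation for interactions, and the helper `removable_terms` are all hypothetical, chosen only for this example; the sketch does not fit any model or test significance.

```python
def removable_terms(model_terms):
    """Return the terms that are not contained in any other term of the model.

    A term such as "time:treatment" is treated as the set {"time", "treatment"}.
    A term may be dropped only if no other term in the model is a strict
    superset of it (i.e. no higher-order term contains it).
    """
    factor_sets = [frozenset(term.split(":")) for term in model_terms]
    return [
        term
        for term, factors in zip(model_terms, factor_sets)
        if not any(factors < other for other in factor_sets)
    ]


# Hypothetical full model: three factors with all interactions.
full_model = [
    "time", "treatment", "site",
    "time:treatment", "time:site", "treatment:site",
    "time:treatment:site",
]

# Step 1: only the 3-way interaction is a candidate for removal.
step1 = removable_terms(full_model)
print(step1)

# Step 2: once the 3-way term is gone, the 2-way terms become candidates,
# but the main effects are still protected by the 2-way terms containing them.
without_3way = [t for t in full_model if t != "time:treatment:site"]
step2 = removable_terms(without_3way)
print(step2)
```

Whether a candidate term is actually dropped would still depend on its test or on subject-matter reasoning; the sketch only encodes which removals respect the hierarchy.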

To see why, consider a simple illustration, where $A$ and $B$ both have two levels. Let $\mu_{ij}$ denote the expected response at the $i$th level of $A$ and the $j$th level of $B$. If you think that $A$ and $B$ interact, you are saying that the $A$ effects depend on $B$ and vice versa; and this is essentially the same as saying that the combinations of $A$ and $B$ should be regarded as levels of one factor (having 4 levels in this example). As such, the effects of this 4-level factor are quantified as contrasts $c_{11}\mu_{11}+c_{12}\mu_{12} + c_{21}\mu_{21} + c_{22}\mu_{22}$, where the $c_{ij}$ sum to zero. But if you exclude $A$ from the model, you are saying that the $A$ effect, $\{(\mu_{11}+\mu_{12}) - (\mu_{21}+\mu_{22})\}/2$, is equal to zero, which limits the contrasts you are willing to consider. Put another way, the four-level factor determined by the levels of $A$ and $B$ together has 3 degrees of freedom, but the interaction effect $AB$ has only 1 d.f.; the other two d.f. are covered by the $A$ and the $B$ main effects.
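The degrees-of-freedom argument can be checked numerically with a small Python sketch. The cell means below are made-up numbers, and the helper names (`dot`, `effect`, `decompose`) are hypothetical. The $A$ contrast uses the same coefficients as the formula above; the $B$ and $AB$ contrasts are scaled the same way as an assumption of this example.

```python
# Made-up cell means mu[(i, j)] for a 2x2 layout: i = level of A, j = level of B.
mu = {(1, 1): 10.0, (1, 2): 14.0, (2, 1): 11.0, (2, 2): 21.0}
cells = [(1, 1), (1, 2), (2, 1), (2, 2)]

# Three mutually orthogonal contrasts over the four cells (each sums to zero).
# Together they span the 3 d.f. of the combined four-level factor.
contrasts = {
    "A":  [0.5,  0.5, -0.5, -0.5],   # {(mu11 + mu12) - (mu21 + mu22)} / 2
    "B":  [0.5, -0.5,  0.5, -0.5],   # {(mu11 + mu21) - (mu12 + mu22)} / 2
    "AB": [0.5, -0.5, -0.5,  0.5],   # {(mu11 - mu12) - (mu21 - mu22)} / 2
}


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def effect(coefs):
    """Value of the contrast sum(c_ij * mu_ij) for the given coefficients."""
    return sum(c * mu[cell] for c, cell in zip(coefs, cells))


def decompose(coefs):
    """Weights expressing a sum-to-zero contrast in terms of A, B and AB."""
    return {name: dot(coefs, basis) / dot(basis, basis)
            for name, basis in contrasts.items()}


effects = {name: effect(coefs) for name, coefs in contrasts.items()}

# Simple effect of A at the first level of B: mu11 - mu21.
simple_A_at_B1 = [1.0, 0.0, -1.0, 0.0]
weights = decompose(simple_A_at_B1)

# The simple effect is rebuilt from the A and AB pieces; forcing the A main
# effect to zero (dropping A) would leave only the AB piece.
rebuilt = sum(weights[name] * effects[name] for name in contrasts)
print(effects, weights, rebuilt, effect(simple_A_at_B1))
```

In this example the simple effect of $A$ at the first level of $B$ is the sum of the $A$ and $AB$ contrasts, so a model that keeps $AB$ but omits $A$ cannot represent it freely.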