Solved – Constructing 3-way ANOVA when design is not fully factorial

Tags: anova, categorical data, missing data, r, regression

I have conducted an experiment measuring individual sizes as a function of two categorical variables (A and B), each with three levels (1, 2, 3). The combination A3:B3 is a control group. This experiment was conducted across two replicates on separate individuals (i.e. no repeated measures). However, design constraints meant that these replicates were performed across three "blocks", as shown in the table below.

[Figure: experimental design]

As you can see, the control group appears in all blocks, but the other treatments are split across them, so the design is not fully factorial. I would like to test whether there is an effect on size caused by each variable and their interaction, both relative to each other and to the control across all blocks. However, I would also like to test whether there is an effect of block within each group, e.g. are A1:B1 individuals different in size in block 1 compared to block 2? Visualising the data suggests that this could be the case.

Using ANOVA in R I have set up a 3-way linear model as follows:

library(car)  # Anova() with a type argument comes from the car package
model <- lm(Size ~ A*B*Block)
Anova(model, type="II")

However, for the interactions B:Block and A:B:Block this model returns zero degrees of freedom. I believe this is due to factor level combinations with zero observations, i.e. because not all combinations appear in all blocks. I'm also concerned that I do not explicitly want to test the effect of block as a stand-alone variable, again because not all treatment combinations appear in all blocks (and therefore I'd expect to see differences in sizes across the blocks).
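A quick way to check that suspicion is to cross-tabulate the three factors and count the empty cells. The sketch below is a minimal example on simulated data: the block layout in `design`, the five individuals per cell, and the simulated `Size` values are all assumptions made for illustration, not the real experiment.

```r
# Hypothetical layout: which treatment cells were run in which block.
# This is an assumption for illustration, not the real design table.
design <- data.frame(
  A     = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  B     = c(1, 1, 2, 2, 1, 1, 2, 2, 3, 3, 3),
  Block = c(1, 2, 2, 3, 1, 3, 1, 3, 1, 2, 3)
)

# Five individuals per cell (assumed); store all design variables as factors
dat <- design[rep(seq_len(nrow(design)), each = 5), ]
dat[] <- lapply(dat, factor)

set.seed(1)
dat$Size <- rnorm(nrow(dat), mean = 10)  # simulated sizes with no real effects

# Cross-tabulate: zeros are A:B:Block cells with no observations
cell_counts <- xtabs(~ A + B + Block, data = dat)
print(ftable(cell_counts))
n_empty <- sum(cell_counts == 0)

# The full three-way model can estimate at most one parameter per non-empty cell,
# so the remaining coefficients come back as NA
full_fit  <- lm(Size ~ A * B * Block, data = dat)
n_aliased <- sum(is.na(coef(full_fit)))
```

With a layout like this one, most of the 27 possible cells are empty, and `lm()` reports `NA` for every coefficient it cannot separate from the others. Those are the terms that end up with zero degrees of freedom.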

I would like help on how best to construct this model, and also how to run appropriate post-hoc tests. Running TukeyHSD(x=model, conf.level=0.95) results in a large number of NA contrasts due to missing factor combinations. How can I better set up these tests so that only comparisons I am interested in are performed?

Best Answer

You are correct that, since not all combinations appear in your data set, you will not be able to test certain interactions and contrasts. I am afraid there is no way around this given the constraints you had.

Do you really want to add the three-way interaction to your model? It is up to you, but higher-order interactions can be hard to interpret, especially when you have missing cells in the design. I would have included the two-way interactions only; (A+B+Block)^2 should do it. Note that if you fit a model with just the main effects (A + B + Block), there is still one effect which cannot be estimated, since A3 and B3 are aliased: they only ever occur together, so they cannot be separated. Moving on to the two-way interactions, we find more problems. There is only one estimable effect for the A:B interaction, because none of A2B3, A3B2 and A3B3 are separately estimable given the way the levels co-occur. There are also problems with the interactions with Block.
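The aliasing described above can be seen directly in the fitted coefficients. The following is a minimal sketch on simulated data, not the real experiment: the block layout in `design`, the five individuals per cell, and the simulated `Size` values are assumptions. The only feature taken from the question is that A3 and B3 occur together and nowhere else.

```r
# Hypothetical layout (an assumption): A3 and B3 only ever occur together
design <- data.frame(
  A     = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  B     = c(1, 1, 2, 2, 1, 1, 2, 2, 3, 3, 3),
  Block = c(1, 2, 2, 3, 1, 3, 1, 3, 1, 2, 3)
)
dat <- design[rep(seq_len(nrow(design)), each = 5), ]  # 5 per cell (assumed)
dat[] <- lapply(dat, factor)

set.seed(1)
dat$Size <- rnorm(nrow(dat), mean = 10)  # simulated sizes with no real effects

# Main effects only: the B3 column duplicates the A3 column, so one is dropped
main_fit     <- lm(Size ~ A + B + Block, data = dat)
main_aliased <- names(which(is.na(coef(main_fit))))
print(main_aliased)

# All two-way interactions: many more coefficients cannot be estimated
two_way_fit <- lm(Size ~ (A + B + Block)^2, data = dat)
cf          <- coef(two_way_fit)

# Pick out the A:B coefficients (names such as "A2:B2") and see which survive
ab_coefs     <- cf[grep("^A[0-9]+:B[0-9]+$", names(cf))]
ab_estimable <- names(ab_coefs)[!is.na(ab_coefs)]
print(ab_estimable)
```

In this layout only one A:B coefficient is estimated; the others are either aliased with the control or belong to combinations that were never observed. Which Block interactions survive depends on the actual block layout, so it is worth running the same check on the real data.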

You should not try to remove the main effect of Block. Search for the concept of marginality for explanations of why, or look at the Q&A "Including the interaction but not the main effects in a model" for some discussion.
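One facet of this can be checked in R itself: if the Block main effect is dropped from the formula but an interaction with Block is kept, R recodes the interaction so that it absorbs the Block effect. The fit is unchanged and only the meaning of the coefficients shifts. The sketch below illustrates this on simulated data; the layout in `design`, the cell size, and the simulated block shift are all assumptions.

```r
# Hypothetical layout and simulated data (assumptions for illustration only)
design <- data.frame(
  A     = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  B     = c(1, 1, 2, 2, 1, 1, 2, 2, 3, 3, 3),
  Block = c(1, 2, 2, 3, 1, 3, 1, 3, 1, 2, 3)
)
dat <- design[rep(seq_len(nrow(design)), each = 5), ]
dat[] <- lapply(dat, factor)

set.seed(1)
# Simulated sizes with an assumed shift of 0.5 per block
dat$Size <- 10 + 0.5 * as.integer(dat$Block) + rnorm(nrow(dat))

# Model respecting marginality: Block main effect plus A:Block interaction
with_block    <- lm(Size ~ A + Block + A:Block, data = dat)

# Same model with the Block main effect left out of the formula
without_block <- lm(Size ~ A + A:Block, data = dat)

# Both formulas span the same column space, so the fitted values agree
same_fit <- isTRUE(all.equal(fitted(with_block), fitted(without_block)))
print(same_fit)
print(names(coef(without_block)))  # the A:Block terms now carry the Block effect
```

So leaving Block out of the formula does not remove its effect from the model; it only makes the interaction coefficients harder to interpret.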
