ANOVA – Why Use Sum of Squares of Interaction as Sum of Squares Error in Randomized Complete Block Design

anova, blocking, experiment-design, mathematical-statistics, random allocation

Take for example the RCB design [Y = I + B + F + E], where…

  • Y is the response
  • I is the overall intercept
  • B is a blocking factor with two levels
  • F is a treatment factor with two levels
  • E represents the error residual

Here is an example data set:

Block Level / Factor Level / Y Data
B.1 / F.1 / 1
B.1 / F.2 / 2
B.2 / F.1 / 3
B.2 / F.2 / 5

==============

Since this is a 'block' design, there are no replicates in the Block x Factor level combinations. That means no Block x Factor interaction term can be included in the model: fitting it would use up every remaining degree of freedom, leaving no residuals for the SS-error (and thus no MS-error or F-test).

However, having no replicates also means that the interaction sum of squares is now taken as the SS-error residuals instead of the 'standard' within-group residuals.

In other words,

  • in a Factor x Factor (two-factor) CRD design, the SS-error is the sum over observations of '(Y minus the Factor1 x Factor2 cell average), squared', and the SS-interaction is the sum of '(F1xF2 cell average - F1 level average - F2 level average + overall average), squared'

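  • in a Block x Factor RCB design, the SS-error is taken as the sum of '(BxF cell average - B level average - F level average + overall average), squared', which is identical to the interaction SS formula from the two-factor design. With one observation per cell, the BxF cell average is just Y itself.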
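To make the comparison concrete, here is a minimal sketch that applies those formulas to the example data with NumPy. It assumes the fourth observation (Y = 5) belongs to the B.2 / F.2 cell, so that each Block x Factor combination appears exactly once; the variable names are illustrative only.

```python
import numpy as np

# Rows are blocks (B.1, B.2); columns are treatment levels (F.1, F.2).
# Assumption: the observation Y = 5 sits in cell B.2 / F.2.
y = np.array([[1.0, 2.0],
              [3.0, 5.0]])

n_blocks, n_treats = y.shape
grand = y.mean()                               # overall average
block_means = y.mean(axis=1, keepdims=True)    # B level averages
treat_means = y.mean(axis=0, keepdims=True)    # F level averages

ss_total = ((y - grand) ** 2).sum()
ss_block = n_treats * ((block_means - grand) ** 2).sum()
ss_treat = n_blocks * ((treat_means - grand) ** 2).sum()

# "Interaction" formula: cell - block average - treatment average + overall.
# With one observation per cell, the cell average is the observation itself.
ss_inter = ((y - block_means - treat_means + grand) ** 2).sum()

# Within-cell SS is identically zero: nothing is left once the cell
# averages (here, the observations themselves) are subtracted.
cell_means = y.copy()
ss_within = ((y - cell_means) ** 2).sum()

df_treat = n_treats - 1
df_error = (n_blocks - 1) * (n_treats - 1)
f_treat = (ss_treat / df_treat) / (ss_inter / df_error)

print(ss_total, ss_block, ss_treat, ss_inter, ss_within)  # 8.75 6.25 2.25 0.25 0.0
print(f_treat)                                            # 9.0
```

The three pieces add up to the total (6.25 + 2.25 + 0.25 = 8.75), so the interaction-formula term is exactly what remains after block and treatment effects are removed, while the within-cell SS has nothing to contribute.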

=====================

Outside of the obvious reason for doing this, namely that there are simply no within-group error residuals in the B x F level combinations due to no replicates, is there any theoretical/philosophical reason why it is acceptable to use an interaction SS as an error SS?

Is it always the case that the biggest interaction term can be used as an error sum of squares whenever there are no replicates in the factor level combinations?

Best Answer

Note that this is a Randomized Block Design, and the justification is in the word Randomized. If we have formally the same layout, but randomization was not done, then we cannot just assume that the interaction term represents the experimental error.

is there any theoretical/philosophical reason why it is acceptable to use an interaction SS as an error SS?

Randomization.
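One way to see what randomization buys is to enumerate every within-block assignment of the two treatment labels. The sketch below does this for the example data, under two assumptions that are not stated in the question: the sharp null hypothesis (the treatment changes no unit's response, so the four Y values stay fixed while only the labels move), and the fourth observation belonging to B.2 / F.2. The helper name is hypothetical.

```python
from itertools import product
import numpy as np

# Responses of the two units in each block, treated as fixed under the
# sharp null hypothesis of no treatment effect (an assumption).
blocks = [(1.0, 2.0), (3.0, 5.0)]

def ss_treatment_and_residual(y):
    """Return (SS_treatment, SS_residual) for a blocks-by-treatments table."""
    grand = y.mean()
    block_means = y.mean(axis=1, keepdims=True)
    treat_means = y.mean(axis=0, keepdims=True)
    ss_treat = y.shape[0] * ((treat_means - grand) ** 2).sum()
    ss_resid = ((y - block_means - treat_means + grand) ** 2).sum()
    return float(ss_treat), float(ss_resid)

results = []
# Each block independently keeps or swaps its two labels: 2 * 2 = 4 layouts,
# all equally likely when treatments are randomized within blocks.
for swaps in product([False, True], repeat=len(blocks)):
    y = np.array([pair[::-1] if swap else pair
                  for pair, swap in zip(blocks, swaps)])
    results.append(ss_treatment_and_residual(y))

mean_ss_treat = np.mean([r[0] for r in results])
mean_ss_resid = np.mean([r[1] for r in results])

# Both sums of squares have 1 degree of freedom here, so SS equals MS.
print(results)        # [(2.25, 0.25), (0.25, 2.25), (0.25, 2.25), (2.25, 0.25)]
print(mean_ss_treat)  # 1.25
print(mean_ss_resid)  # 1.25
```

Averaged over the randomization, the treatment mean square and the "interaction" mean square have the same expectation when there is no treatment effect. That equality is what makes the interaction term a legitimate yardstick for the treatment F-ratio; without randomization there is no such distribution to average over.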

Is it always the case that the biggest interaction term can be used as an error sum of squares whenever there are no replicates in the factor level combinations?

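No. Without randomization, that must be justified in some other way. But there are helpful ideas, like Tukey's one-degree-of-freedom test for interaction (non-additivity), which splits a single degree of freedom off the interaction SS to check whether the additive model is plausible. See "What's the difference between a randomized block design and two factor design?" for a comparison with a two-factor design.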
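As a rough sketch of Tukey's one-degree-of-freedom idea, the function below splits the interaction SS of an unreplicated two-way table into a single-df non-additivity component and a remainder. The function name and the 3 x 3 table are hypothetical and purely illustrative; the 2 x 2 example from the question cannot be used, because after removing one degree of freedom nothing is left for the remainder.

```python
import numpy as np

def tukey_one_df(y):
    """Split the interaction SS of an unreplicated two-way table.

    Returns (ss_nonadd, ss_remainder, df_remainder, f_stat), where
    ss_nonadd carries 1 degree of freedom.
    """
    y = np.asarray(y, dtype=float)
    a, b = y.shape
    df_rem = (a - 1) * (b - 1) - 1
    if df_rem < 1:
        raise ValueError("table too small: no df left for the remainder")

    grand = y.mean()
    row_eff = y.mean(axis=1) - grand   # estimated row (block) effects
    col_eff = y.mean(axis=0) - grand   # estimated column (treatment) effects

    # Interaction SS of the additive model (the RCB "error" SS).
    ss_resid = ((y - grand - row_eff[:, None] - col_eff[None, :]) ** 2).sum()

    # Tukey's component: regression of the table on row_eff * col_eff.
    numerator = (row_eff @ y @ col_eff) ** 2
    ss_nonadd = numerator / ((row_eff ** 2).sum() * (col_eff ** 2).sum())

    ss_rem = ss_resid - ss_nonadd
    f_stat = ss_nonadd / (ss_rem / df_rem)
    return ss_nonadd, ss_rem, df_rem, f_stat

# Hypothetical 3 x 3 table (rows = blocks, columns = treatments).
table = [[4, 6, 9],
         [5, 8, 13],
         [7, 11, 19]]

ss_nonadd, ss_rem, df_rem, f_stat = tukey_one_df(table)
print(ss_nonadd, ss_rem, df_rem, f_stat)
```

The F statistic would be compared with an F distribution on 1 and `df_rem` degrees of freedom; a large value suggests that a multiplicative block-by-treatment interaction is present and that the interaction SS is not pure error.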
