ANOVA Models Parametrizations – Different Parametrizations in ANOVA Models and Their Results

anova, group-differences, parameterization, r, regression

In class, our teacher explained that different parametrizations can be used to build the design matrix:

  • CornerPoint parametrization: the first coefficient represents the mean value of the response for the first (reference) subgroup, and each of the other coefficients represents the difference in mean between another subgroup and the reference subgroup;
  • GroupPoint parametrization: the coefficients represent the mean value of the response for each subgroup.

Then we were shown this example in R:

ctl <- c(417,558,518,611,450,461,517,453,533,514)
trt <- c(481,417,441,359,587,383,603,489,432,469)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)

lm.cornerPP <- lm(weight ~ group)
summary(lm.cornerPP)

lm.groupP <- lm(weight ~ group - 1) # omitting intercept
summary(lm.groupP)

Something strange happens when comparing the two summaries:

------------------------------------------------------
|Coefficients CornerP:                               |
|                                                    |
|            Estimate Std. Error t value Pr(>|t|)    |
|(Intercept)   503.20      22.02  22.850 9.55e-15 ***|
|groupTrt      -37.10      31.14  -1.191    0.249    |  
------------------------------------------------------
|Coefficients GroupP:                                |
|                                                    |
|            Estimate Std. Error t value Pr(>|t|)    |
|groupCtl      503.20      22.02   22.85 9.55e-15 ***|
|groupTrt      466.10      22.02   21.16 3.62e-14 ***| 
------------------------------------------------------

As you can see, the second coefficient of the CornerPoint parametrization is not significant, while in the GroupPoint parametrization both coefficients are significant! I would expect this to be reflected in the CornerPoint parametrization, making all coefficients significant there too, but that is not the case. Why is that?

Best Answer

You have to be careful about the interpretation of the coefficients, which changes with the parameterization.

We first specify the One-Way ANOVA model:

$$y_{ij} = \mu + \alpha_{j} + \epsilon_{ij}$$

where

  • $i=1, \dots, n$ indexes the observations
  • $j=1, 2$ indexes the groups (control vs treatment)
  • $\epsilon_{ij} \sim N(0, \sigma^2)$

Since this model is not full-rank (the columns of its design matrix are linearly dependent), we have to restrict it somehow to get unique solutions for the coefficients. The parameterizations you noted are two ways of doing this.
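The rank deficiency can be checked directly in R. The sketch below builds the over-parameterized design matrix (an intercept column plus one indicator column per group) and the two restricted versions. The names `X_full`, `X_corner`, and `X_group` are chosen here for illustration and are not part of the original example.

```r
# Same grouping factor as in the question: 10 controls, 10 treated.
group <- gl(2, 10, 20, labels = c("Ctl", "Trt"))

# Over-parameterized design: mu column plus one indicator per group.
# The two indicator columns sum to the mu column, so one column is redundant.
X_full <- cbind(mu = 1, model.matrix(~ group - 1))
ncol(X_full)       # 3 columns
qr(X_full)$rank    # but only rank 2

# Restriction alpha_1 = 0: drop the indicator of the first group.
X_corner <- model.matrix(~ group)      # columns: (Intercept), groupTrt

# Restriction mu = 0: drop the intercept column.
X_group <- model.matrix(~ group - 1)   # columns: groupCtl, groupTrt

qr(X_corner)$rank  # 2, full column rank
qr(X_group)$rank   # 2, full column rank
```

Both restricted matrices span the same column space, which is why the two fits give identical fitted values and residuals and differ only in what the coefficients mean.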

Using restriction $\alpha_{1}=0$

This is the default restriction for one-way ANOVA in R (treatment contrasts) and corresponds to the CornerPoint parametrization. Here $\mu$ (the intercept) is the mean of group 1, and $\alpha_{2}$ is the effect of the treatment ($j=2$) relative to that mean.

Then the t-test has the hypotheses $$H_0: \alpha_{2}=0 \text{ versus } H_1: \alpha_{2} \ne 0$$

It tests whether the treatment has any effect at all. In other words, is the difference in means between the two groups significantly different from zero?
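One way to see that this coefficient test is a comparison of the two groups is to set it next to a classical two-sample t-test with pooled variance. This is a minimal sketch using the data from the question; the object names `fit_corner` and `tt` are illustrative.

```r
ctl <- c(417,558,518,611,450,461,517,453,533,514)
trt <- c(481,417,441,359,587,383,603,489,432,469)
group <- gl(2, 10, 20, labels = c("Ctl", "Trt"))
weight <- c(ctl, trt)

fit_corner <- lm(weight ~ group)

# Two-sample t-test assuming equal variances, as the linear model does.
tt <- t.test(weight ~ group, var.equal = TRUE)

# t statistic and p-value of the groupTrt coefficient
coef(summary(fit_corner))["groupTrt", c("t value", "Pr(>|t|)")]

# t statistic and p-value of the two-sample test
c(t = unname(tt$statistic), p = tt$p.value)
```

The two t statistics have opposite signs because `t.test` reports Ctl minus Trt while the coefficient is Trt minus Ctl; the p-values agree.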

Using restriction $\mu=0$

This is what you call the GroupPoint parametrization. In this case $\alpha_1$ and $\alpha_2$ are means of each group.
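As a quick check of this interpretation, the coefficients of the no-intercept fit can be compared with the raw group means. The sketch below reuses the data from the question; `fit_group` and `group_means` are names introduced here for illustration.

```r
ctl <- c(417,558,518,611,450,461,517,453,533,514)
trt <- c(481,417,441,359,587,383,603,489,432,469)
group <- gl(2, 10, 20, labels = c("Ctl", "Trt"))
weight <- c(ctl, trt)

fit_group <- lm(weight ~ group - 1)

# Plain sample means per group
group_means <- tapply(weight, group, mean)

coef(fit_group)   # groupCtl, groupTrt
group_means       # Ctl, Trt: the same two numbers (503.2 and 466.1)
```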

Then the t-test for each coefficient is $$H_0: \alpha_{j}=0 \text{ versus } H_1: \alpha_{j} \ne 0$$

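This tests whether the mean of group $j$ is zero.

That is a very different question from the one asked by the first t-test. Since the weights are all in the hundreds, both group means are clearly different from zero, so both coefficients come out significant; this says nothing about whether the two groups differ from each other.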
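The comparison between groups is still available from the GroupPoint fit; it just has to be requested as a contrast of the two coefficients rather than read off a single row of the summary. The sketch below computes that contrast by hand from `coef()` and `vcov()`; the variable names are illustrative, and this is one possible way to do it rather than the only one.

```r
ctl <- c(417,558,518,611,450,461,517,453,533,514)
trt <- c(481,417,441,359,587,383,603,489,432,469)
group <- gl(2, 10, 20, labels = c("Ctl", "Trt"))
weight <- c(ctl, trt)

fit_corner <- lm(weight ~ group)
fit_group  <- lm(weight ~ group - 1)

m <- coef(fit_group)   # the two group means
V <- vcov(fit_group)   # their estimated covariance matrix

# Contrast: mean(Trt) - mean(Ctl), with its standard error
d    <- unname(m["groupTrt"] - m["groupCtl"])
se_d <- sqrt(V["groupTrt", "groupTrt"] + V["groupCtl", "groupCtl"] -
             2 * V["groupCtl", "groupTrt"])

t_d <- d / se_d
p_d <- 2 * pt(-abs(t_d), df = df.residual(fit_group))

c(estimate = d, se = se_d, t = t_d, p = p_d)

# Same numbers as the groupTrt row of the CornerPoint summary
coef(summary(fit_corner))["groupTrt", ]
```

Both parametrizations therefore describe the same fitted model; only the default hypotheses printed by `summary()` differ.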
