The short answer is "contrasts". I had heard of them before but had never learned what they are; in fact, they serve exactly the purpose I was looking for. They can be used to alter the coefficients: for a given model you always get the same number of coefficients, but with contrasts they become different linear combinations of the default ones. That is, the model is reparameterized - same model, different coefficient meanings and values.
Also, additional comparisons of other "virtual parameters" (linear combinations of the actual parameters) that are not reported in the model summary can be made after the model is fit, using the multcomp package, although this is not entirely straightforward. (The "contrasts" package may also do this, though I couldn't figure out how to use it.)
The default contrast in R is "contr.treatment" and it leads to the coefficients in the problem statement, with levels being compared to the base level within that factor. Another contrast, "contr.sum", generates coefficients that compare each level to the mean of all the levels of that factor.
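As a quick illustration (this snippet is not from the original session), the two coding matrices for a three-level factor can be printed directly:
> contr.treatment(3)
  2 3
1 0 0
2 1 0
3 0 1
> contr.sum(3)
  [,1] [,2]
1    1    0
2    0    1
3   -1   -1
With contr.treatment each column is an indicator for one non-base level; with contr.sum the last level is coded -1 in every column, so its implied coefficient is the negative of the sum of the others.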
In my case, I want the outer factor to keep the default "contr.treatment" contrast, because I do want the outer factor compared to the control level, j="A". For the inner factor, though, I want "contr.sum", so that the k levels are compared to the mean of all k levels within that j level. This is done after creating the k factor, with an assignment statement such as contrasts(k) = "contr.sum" (in the code below the optional how.many argument is also passed).
The full code now looks like this:
> j = gl(3, 30, labels=LETTERS[1:3])
> k = gl(3, 10, length=90, labels=LETTERS[13:15])
> k[1:30] = "M" # Reference j="A" has k="M" only
> contrasts(k, 3) = "contr.sum"
> x = 100*(2^as.numeric(k)) # 200,400,800 for k=M,N,O
> x[1:30] = mean(c(200,400,800)) # Reference j="A" has x=same mean as "B" and "C"
> x = x + rnorm(90, 0, 4) # A little randomness
> x = x + rep(c(0, 1, 1.2), each=30) # A slight non-significant deviation by j.
> df = data.frame(x=x, j=j, k=k)
> f = lm(x ~ j/k, data=df, drop.unused.levels=TRUE)
> summary(f)
Call:
lm(formula = x ~ j/k, data = df, drop.unused.levels = TRUE)
Residuals:
Min 1Q Median 3Q Max
-10.6628 -2.9090 -0.4446 2.9412 10.0984
Coefficients: (2 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 467.3444 0.7878 593.259 <2e-16 ***
jB 0.6631 1.1141 0.595 0.553
jC 0.4530 1.1141 0.407 0.685
jA:k1 NA NA NA NA
jB:k1 -266.9330 1.1141 -239.604 <2e-16 ***
jC:k1 -267.5542 1.1141 -240.162 <2e-16 ***
jA:k2 NA NA NA NA
jB:k2 -66.0202 1.1141 -59.261 <2e-16 ***
jC:k2 -65.7784 1.1141 -59.044 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 4.315 on 83 degrees of freedom
Multiple R-squared: 0.9996, Adjusted R-squared: 0.9996
F-statistic: 3.344e+04 on 6 and 83 DF, p-value: < 2.2e-16
The coefficients are labeled "k1" and "k2" instead of "kM" and "kN" because the contr.sum contrast matrix has unnamed, numbered columns, but the results are now what is expected, as verified here:
> mean(df$x[df$j == "A"])
[1] 467.3444
> mean(df$x[df$j == "B"])-mean(df$x[df$j == "A"])
[1] 0.6631321
> mean(df$x[df$j == "C"])-mean(df$x[df$j == "A"])
[1] 0.4530243
> mean(df$x[df$j == "B" & df$k == "M"])-mean(df$x[df$j == "B"])
[1] -266.933
> mean(df$x[df$j == "C" & df$k == "M"])-mean(df$x[df$j == "C"])
[1] -267.5542
> mean(df$x[df$j == "B" & df$k == "N"])-mean(df$x[df$j == "B"])
[1] -66.02023
> mean(df$x[df$j == "C" & df$k == "N"])-mean(df$x[df$j == "C"])
[1] -65.77839
The comparisons for jB:kO and jC:kO (i.e., jB:k3 and jC:k3) are still missing. They do not appear as coefficients because they are simple linear combinations of the existing coefficients. In fact, with the "contr.sum" contrast the k deviations within each j level sum to zero, so the values these coefficients would have are jB:k3 = -(jB:k1+jB:k2) and jC:k3 = -(jC:k1+jC:k2). This can be verified with a little algebra, remembering that the coefficient jB is the mean of level j="B" minus the intercept, and jB:k1 is the mean of the j="B", k="M" cell minus the mean of level j="B".
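As a quick data-level check (this snippet is illustrative and was not part of the original session), the omitted k="O" deviation within j="B" can be computed the same way as the verifications above, and it should equal the negative sum of the two reported coefficients:
> # Deviation of the j="B", k="O" cell mean from the j="B" mean (the missing jB:k3)
> mean(df$x[df$j == "B" & df$k == "O"]) - mean(df$x[df$j == "B"])
> # Should match the negative sum of the reported coefficients
> -(coef(f)["jB:k1"] + coef(f)["jB:k2"])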
To use that knowledge to run the other two tests I want, the glht() function in the multcomp package can be used. However, it chokes on the NA coefficient values that resulted from level j="A" containing only k="M", so we have to modify the model matrix to remove the columns corresponding to the NA coefficients, as well as the intercept column:
> # With contr.sum contrasts, the jB:k3 term is the negative of jB:k1+jB:k2
> summary(glht(f, linfct="0 - jB:k1 - jB:k2 = 0"))
Error in summary(glht(f, linfct = "0 - jB:k1 - jB:k2 = 0")) :
error in evaluating the argument 'object' in selecting a method for function 'summary': Error in modelparm.default(model, ...) :
dimensions of coefficients and covariance matrix don't match
>
> # It fails because of NA coefficients. They can be removed by removing their
> # columns in the model matrix and then updating the model.
> M = model.matrix(f)
> # Save the "assign" and "contrasts" attributes of the model matrix.
> attr.assign = attr(M, "assign")
> attr.contrasts = attr(M, "contrasts")
> rmvCols = c(1, which(summary(f)$aliased))
> M = M[,-rmvCols] # Remove intercept column (required) and NA columns
> # Reassign the "assign" attribute, removing the corresponding elements from it.
> attr(M, "assign") = attr.assign[-rmvCols]
> # Reassign the "contrasts" attribute to its original value
> attr(M, "contrasts") = attr.contrasts
Now we update the model with the new model matrix and try glht() again.
> f1 = update(f, ~M)
> f1
Call:
lm(formula = x ~ M, data = df, drop.unused.levels = TRUE)
Coefficients:
(Intercept) MjB MjC MjB:k1 MjC:k1 MjB:k2 MjC:k2
466.9725 0.2974 -0.4174 -266.9183 -267.7272 -65.2277 -66.5216
> # Now try glht() using the updated model fit. The coefficient names must be prefixed
> # with "M", which R added because the model matrix variable was named M.
> summary(glht(f1, linfct="0 - MjB:k1 - MjB:k2 = 0"))
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = x ~ M, data = df, drop.unused.levels = TRUE)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
0 - MjB:k1 - MjB:k2 == 0 332.953 1.114 298.9 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)
Excellent! This is one of the two missing comparisons I wanted.
To confirm that the estimate is correct:
> coef(f)
(Intercept) jB jC jA:k1 jB:k1 jC:k1 jA:k2 jB:k2 jC:k2
467.3444162 0.6631321 0.4530243 NA -266.9329982 -267.5542422 NA -66.0202315 -65.7783877
> -(coef(f)[5] + coef(f)[8])
jB:k1
332.9532
Yes, it is correct!
Now do both missing comparisons, jB:k3 and jC:k3, at the same time:
> summary(glht(f1, linfct=c("0 - MjB:k1 - MjB:k2 = 0", "0 - MjC:k1 - MjC:k2 = 0")))
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = x ~ M, data = df, drop.unused.levels = TRUE)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
0 - MjB:k1 - MjB:k2 == 0 332.953 1.114 298.9 <1e-10 ***
0 - MjC:k1 - MjC:k2 == 0 333.333 1.114 299.2 <1e-10 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)
Sweet!
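For reference, the same two hypotheses can also be given to glht() as a numeric contrast matrix rather than as strings. This is a sketch (not from the original session) that assumes the coefficient order shown by coef(f1) above: (Intercept), MjB, MjC, MjB:k1, MjC:k1, MjB:k2, MjC:k2.
> # One row per hypothesis, one column per coefficient of f1; each row estimates
> # -(k1 + k2) within that j level, i.e. the missing k3 term, and tests it against zero
> K = rbind("jB:k3" = c(0, 0, 0, -1, 0, -1, 0),
+           "jC:k3" = c(0, 0, 0, 0, -1, 0, -1))
> summary(glht(f1, linfct=K))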
Best Answer
You have to be careful about the interpretation of the coefficients, which changes with the parameterization.
We first specify the One-Way ANOVA model:
$$y_{ij} = \mu + \alpha_{j} + \epsilon_{ij}$$
where $y_{ij}$ is observation $i$ in group $j$, $\mu$ is the overall mean, $\alpha_{j}$ is the effect of group $j$, and $\epsilon_{ij}$ is the error term.
Since this model is not of full rank (the columns of the design matrix are linearly dependent), we have to restrict it somehow to get unique solutions for the coefficients. The parameterizations you noted are two ways of doing this.
Using the restriction $\alpha_{1}=0$
This is the default restriction for one-way ANOVA in R, and it is equivalent to the CornerPoint parameterization. In this case $\alpha_{2}$ is the effect of the treatment ($j=2$) relative to the mean of group 1 (the intercept).
Then the t-test is $$H_0: \alpha_{2}=0 \quad \text{versus} \quad H_1: \alpha_{2} \ne 0.$$
It tests whether there is any effect of the treatment at all; in other words, whether the difference in means between the two groups is significant.
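To make this concrete (a short derivation, consistent with the model above), with $\alpha_{1}=0$ the expected responses are
$$E[y_{i1}] = \mu, \qquad E[y_{i2}] = \mu + \alpha_{2},$$
so $\alpha_{2} = E[y_{i2}] - E[y_{i1}]$ is exactly the difference between the two group means.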
Using the restriction $\mu=0$
This is what you call the GroupPoint parameterization. In this case $\alpha_{1}$ and $\alpha_{2}$ are the means of the respective groups.
Then the t-test is $$H_0: \alpha_{j}=0 \quad \text{versus} \quad H_1: \alpha_{j} \ne 0.$$
This tests whether the mean of group $j$ is zero.
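Concretely (again a short derivation from the model above), with $\mu=0$ the expected response in group $j$ is
$$E[y_{ij}] = \alpha_{j},$$
so each $\alpha_{j}$ is simply the mean of group $j$, and $H_0: \alpha_{j}=0$ asks whether that group mean equals zero.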
This is testing something very different from the first t-test.