Justification for default contr.poly() polynomial contrasts in R

contrasts, matrix, r

In R, one can use contr.poly() to create a set of orthogonal contrast codes for testing linear and higher-order effects of ordered categorical factors:

> print(codes <- contr.poly(3))
             .L      .Q
[1,] -7.071e-01  0.4082
[2,] -7.850e-17 -0.8165
[3,]  7.071e-01  0.4082

Here, .L represents linear effects and .Q quadratic. However, these codes make the interpretation of parameter estimates tricky compared to a set of codes like:

> matrix(c(-.5, 0, .5, -.333, .666, -.333), ncol = 2)
     [,1]  [,2]
[1,] -0.5 -0.333
[2,]  0.0  0.666
[3,]  0.5 -0.333

What is the justification for the defaults of contr.poly()? I've noticed that if you premultiply the code matrix by its transpose you get the identity (up to floating-point error), but that's as far as I've gotten.

> t(codes) %*% codes
          .L        .Q
.L 1.000e+00 5.551e-17
.Q 5.551e-17 1.000e+00

Best Answer

The columns of the orthogonal contrast matrix are scaled so that each has a norm of 1. Such a matrix is said to be orthonormal. If this matrix is $X$ and you compute $X^TX$, the diagonal elements are the squared norms of the columns of $X$ (all equal to 1), and the off-diagonal elements are the inner products between different columns (all equal to 0, because the columns are orthogonal). That is why you get an identity matrix when you do the computation you mentioned.
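
To make the scaling concrete, here is a minimal sketch for the three-level case. It assumes the usual integer polynomial coefficients for equally spaced levels, (-1, 0, 1) and (1, -2, 1), and the names raw and unit are only illustrative. It is a check on the output of contr.poly(3), not a description of how the function is implemented internally.

```r
# Integer polynomial coefficients for 3 equally spaced levels
# (an assumed starting point, not taken from contr.poly's source)
raw <- cbind(.L = c(-1, 0, 1), .Q = c(1, -2, 1))

# Divide each column by its Euclidean norm so every column has length 1
unit <- sweep(raw, 2, sqrt(colSums(raw^2)), "/")

codes <- contr.poly(3)

# Compare with contr.poly(3), ignoring dimnames and floating-point noise
same_codes <- all.equal(unname(unit), unname(codes))
print(same_codes)

# crossprod(X) is t(X) %*% X: squared column norms on the diagonal,
# inner products between different columns off the diagonal
gram <- crossprod(codes)
print(round(gram, 10))
```

Dividing by sqrt(2) and sqrt(6) respectively is what turns the integer codes into the 0.7071 and 0.4082 / 0.8165 values shown in the question.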

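As for why the designer of contr.poly() decided to produce the codes in this way, my guess is simply that it is elegant, and that the scale of the contrasts ultimately doesn't matter for testing: rescaling a contrast column by a positive constant rescales the corresponding coefficient estimate and its standard error together, so the t statistic and p-value are unchanged. I don't think ease of interpretation was a consideration.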
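
The claim that the scale doesn't matter can be checked on simulated data. The sketch below is only an illustration: the factor dose, the response y, and the seed are all made up, and the second fit uses the alternative codes from the question (written as exact fractions). Note that the question's second column has the opposite sign to .Q, so that t statistic flips sign; the comparison therefore uses absolute values.

```r
set.seed(1)  # simulated data, purely illustrative

# Hypothetical ordered factor with 3 levels, 10 observations each
dose <- factor(rep(c("low", "mid", "high"), each = 10),
               levels = c("low", "mid", "high"), ordered = TRUE)
y <- rnorm(30, mean = as.integer(dose))

# Fit 1: orthonormal polynomial contrasts from contr.poly()
contrasts(dose) <- contr.poly(3)
fit_default <- lm(y ~ dose)

# Fit 2: the alternative codes from the question
dose2 <- dose
contrasts(dose2) <- cbind(.L = c(-1/2, 0, 1/2), .Q = c(-1/3, 2/3, -1/3))
fit_rescaled <- lm(y ~ dose2)

# Coefficients differ in scale (and sign for .Q) ...
print(coef(fit_default))
print(coef(fit_rescaled))

# ... but the t statistics agree in absolute value
t_default  <- summary(fit_default)$coefficients[, "t value"]
t_rescaled <- summary(fit_rescaled)$coefficients[, "t value"]
print(all.equal(abs(unname(t_default)), abs(unname(t_rescaled))))

# With the +/- 0.5 coding, the linear coefficient is mean(high) - mean(low)
group_means <- tapply(y, dose, mean)
lin_diff <- unname(group_means["high"] - group_means["low"])
print(all.equal(unname(coef(fit_rescaled)["dose2.L"]), lin_diff))
```

The last check shows the interpretational convenience the question is after: under the ±0.5 coding the linear coefficient is directly the difference between the highest and lowest group means, whereas under contr.poly() the same difference is divided by sqrt(2).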
