Solved – Results of lm() function with a dependent ordered categorical variable

I am trying to understand Ben Bolker's answer to this question.

First, we create a data frame:

set.seed(101)
d <- data.frame(x=sample(1:4,size=30,replace=TRUE))
d$y <- rnorm(30,1+2*d$x,sd=0.01)

Then Mr. Bolker says:

x as ordered factor

coef(lm(y~ordered(x),d))
##  (Intercept) ordered(x).L ordered(x).Q ordered(x).C 
##  5.998121421  4.472505514  0.006109021 -0.003125958

Now the intercept specifies the value of y at the mean factor level (halfway between 2 and 3); the L (linear) parameter gives a measure of the linear trend (not quite sure I can explain the particular value …), Q and C specify quadratic and cubic terms (which are close to zero in this case because the pattern is linear); if there were more levels the higher-order contrasts would be labelled ^4, ^5, …

My question is, what does the regression formula look like explicitly?

I thought lm() makes a model like this:

y = 5.9981 + 4.4725 (x_1) + 0.0061 (x_2) - 0.0031 (x_3)

where, since the x_i are categories, they can only be either 0 or 1.

I do not understand what quadratic and cubic terms have to do with a linear model. Even so, squaring or cubing any of the variables would not make a difference, since 0^3 = 0 and 1^3 = 1.

Best Answer

In this case, what lm() is doing is converting your "categorical" variable into a numeric sequence in order.

To make this clearer, I'll adapt Bolker's code a bit to make the X variable more obviously categorical:

set.seed(101)
d <- data.frame(x=sample(1:4,size=30,replace=TRUE))
d$y <- rnorm(30,1+2*d$x,sd=0.01)
d$x = factor(d$x, labels=c("none", "some", "more", "a lot"))
coef(lm(y~x, d))
#  (Intercept)       xsome       xmore      xa lot 
#     3.001627    1.991260    3.995619    5.999098

So here, the intercept is the mean of y for x = "none" (the baseline level), and each remaining coefficient is the difference between that level's mean and the baseline mean.
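As a quick check of that reading, here is a minimal sketch that rebuilds the treatment-coded coefficients from the group means. It assumes R's default `contr.treatment` coding for unordered factors; the names `group_means`, `fit_trt` and `from_means` are introduced here for illustration only.

```r
set.seed(101)
d <- data.frame(x = sample(1:4, size = 30, replace = TRUE))
d$y <- rnorm(30, 1 + 2 * d$x, sd = 0.01)
d$x <- factor(d$x, labels = c("none", "some", "more", "a lot"))

# Mean of y within each level of x
group_means <- tapply(d$y, d$x, mean)

fit_trt <- lm(y ~ x, d)

# Intercept = mean of the baseline level ("none");
# other coefficients = differences from that baseline mean
from_means <- c(group_means[1], group_means[-1] - group_means[1])

# Should be TRUE: the coefficients are just a re-expression of the group means
all.equal(unname(coef(fit_trt)), unname(from_means))
```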

coef(lm(y~ordered(x), d))
#  (Intercept) ordered(x).L ordered(x).Q ordered(x).C 
#  5.998121421  4.472505514  0.006109021 -0.003125958

Conceptually, what's happened here is that the ordered() function converted x into newx using (something similar to):

# Approximate, equally spaced scores for the four levels
scores <- c("none" = -0.67, "some" = -0.22, "more" = 0.22, "a lot" = 0.67)
newx <- unname(scores[as.character(d$x)])

and then it fitted (something like) the model:

$$y = a + b_0 \times \text{newx} + b_1 \times \text{newx}^2 + b_2 \times \text{newx}^3$$

where you have linear $\text{newx}$, quadratic $\text{newx}^2$, and cubic $\text{newx}^3$ components.

Note that I said "something like", because in the model described there $newx$, $newx^2$, and $newx^3$ are strongly correlated with one another. What lm() actually uses is the set of contrasts generated by contr.poly(4). These contrasts are orthogonal, so the linear, quadratic and cubic components are separated from one another. But the principle is the same: when fitting an ordered factor with k levels, lm() fits linear, quadratic, cubic, ... components up to degree k - 1.
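To make the contrast matrix concrete, the sketch below prints `contr.poly(4)` and checks the orthogonality claim. The row labels and the `scores` vector are added here only for readability. Note that the entries are not 0/1 indicators: each level gets a numeric score in every column.

```r
cp <- contr.poly(4)
rownames(cp) <- c("none", "some", "more", "a lot")

# One row per level, one column per polynomial component (.L, .Q, .C)
round(cp, 4)

# Each column sums to zero and the columns are orthonormal,
# so crossprod(cp) = t(cp) %*% cp is the 3 x 3 identity matrix
round(colSums(cp), 10)
round(crossprod(cp), 10)

# By contrast, raw powers of equally spaced scores are strongly correlated
scores <- 1:4
cor(cbind(scores, scores^2, scores^3))
```

The first column is the rounded -.67, -.22, .22, .67 sequence used in the conceptual version above.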

You can see this by comparing:

coef(lm(y~ordered(x), d))
#  (Intercept) ordered(x).L ordered(x).Q ordered(x).C 
#  5.998121421  4.472505514  0.006109021 -0.003125958

with

contrasts(d$x) <- contr.poly(4)
coef(lm(y~x, d))
#  (Intercept)          x.L          x.Q          x.C 
#  5.998121421  4.472505514  0.006109021 -0.003125958

The estimates are identical; only the coefficient names differ (x.L rather than ordered(x).L). So if you want a fuller understanding of what happened, take a closer look at contr.poly() and orthogonal polynomial contrasts in general.

One thing to note is that there is an implicit assumption hidden in here: the levels are assumed to be equally spaced. So "none" is as far from "some" as "some" is from "more", and "more" is from "a lot".
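To answer the original question directly, the fitted equation can be written as $y = 5.9981 + 4.4725\,L(x) + 0.0061\,Q(x) - 0.0031\,C(x)$, where $L(x)$, $Q(x)$ and $C(x)$ are the entries of the contr.poly(4) row belonging to the level of x. The following is a minimal sketch of that lookup; the names `fit_ord`, `level_fit` and `level_means` are introduced here for illustration. Because `sample()` changed its default algorithm in R 3.6.0, a current R version may draw different data than the output shown above, so the checks below compare quantities with each other rather than with the printed numbers.

```r
set.seed(101)
d <- data.frame(x = sample(1:4, size = 30, replace = TRUE))
d$y <- rnorm(30, 1 + 2 * d$x, sd = 0.01)

fit_ord <- lm(y ~ ordered(x), d)
b  <- coef(fit_ord)
cp <- contr.poly(4)   # rows = levels 1..4, columns = .L, .Q, .C

# Fitted mean for each level: intercept + (contrast row) %*% (L, Q, C coefficients)
level_fit <- drop(b[1] + cp %*% b[-1])

# The model has one parameter per level, so these equal the group means
level_means <- as.vector(tapply(d$y, d$x, mean))
all.equal(unname(level_fit), level_means)

# Going the other way: the intercept is the average of the group means, and
# each polynomial coefficient is a contrast-weighted sum of the group means
all.equal(unname(b[1]), mean(level_means))
all.equal(unname(b[-1]), unname(drop(crossprod(cp, level_means))))
```

This also accounts for the particular value of the linear coefficient. The linear column is (-3, -1, 1, 3)/(2√5), and the true level means are 3, 5, 7, 9 (from 1 + 2x), so the linear coefficient is about (-9 - 5 + 7 + 27)/(2√5) = 2√5 ≈ 4.472.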

Related Solutions

Solved – CLMM Output interpretation from R Ordinal

These aren't separate models for the linear and quadratic term for tsf; the results you show are for one single model containing both terms. What the results are suggesting is that you need to have the quadratic term rather than just represent/approximate tsf as a linear function only.

Solved – GLM interpretation of parameters of ordinal predictor variables

I'm using a slightly cooked example here.

data("kyphosis",package="HH")
kk <- subset(kyphosis,Number<=5 & Start>=13 & Start<=17)
kk <- transform(kk, Number=ordered(Number), Start=ordered(Start),
                kyph=as.numeric(Kyphosis)-1)

Unfortunately, since I've cut the data set down so much, there are only 2 instances of kyphosis present (out of 43 cases), so I have to use the brglm package to overcome complete separation:

library(brglm)
m1 <- brglm(Kyphosis~Number+Start,family=binomial,data=kk)
summary(m1)

## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -2.038967   0.581733  -3.505 0.000457 ***
## Number.L     0.221716   1.321268   0.168 0.866736    
## Number.Q     0.074500   1.286798   0.058 0.953832    
## Number.C     0.359261   1.227672   0.293 0.769801    
## Start.L     -0.813621   1.627716  -0.500 0.617178    
## Start.Q      0.295470   1.584274   0.187 0.852051    
## Start.C      1.231051   1.259250   0.978 0.328269    
## Start^4      0.007603   1.430817   0.005 0.995760

The test:

m2 <- update(m1, . ~ . - Number)
anova(m1,m2,test="Chisq")

This tests whether Number has a significant effect overall on the incidence of kyphosis. Determining whether Number significantly increases the incidence of kyphosis is tricky, because the higher-order (nonlinear) contrasts (quadratic/Q, cubic/C, quartic/^4, etc.) mean that the effect of increasing Number by 1 unit can depend on where you start. For example, if the quadratic term is large and (let's say, without much loss of generality) positive, then the effect of increasing Number will generally be to decrease the incidence when Number is small and to increase it when Number is large. I suppose that if the linear term is positive and significant and all the other terms are small, then you could say there is a positive effect of Number on kyphosis, but otherwise it will be difficult to say.
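A small sketch of why the step-to-step effect depends on where you start. It takes the three Number estimates from the coefficient table above and assumes Number has four equally spaced levels (which the .L/.Q/.C terms imply). The names `beta_number`, `eta` and `steps` are introduced here for illustration, and the contributions are on the logit scale.

```r
# Point estimates for Number.L, Number.Q, Number.C copied from the table above
beta_number <- c(L = 0.221716, Q = 0.074500, C = 0.359261)

cp <- contr.poly(4)

# Contribution of Number to the linear predictor at each of its four levels
eta <- drop(cp %*% beta_number)
names(eta) <- paste0("level", 1:4)
eta

# Change in the linear predictor when moving up one level at a time;
# the steps differ in size and sign, so there is no single "effect of +1"
steps <- diff(eta)
steps
```

Given the large standard errors in the table, these differences are illustrative only and say nothing about significance.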
