Make sense of contrast in general linear model (GLM)

generalized linear model

I understand how to make sense of the design matrix in a general linear model (GLM). Basically, each column of the design matrix describes one condition under which the data are observed.

For example, I wish to model 4 factors, i.e., diagnosis, age, gender, and weight. Then, I would have a design matrix as follows.

    /0 26 1 75\ - 1st subject: a 26 year-old and 75 kg female, no disease
    |2 13 0 60| - 2nd subject: a 13 year-old and 60 kg male, severe stage
    |1 12 1 77| - 3rd subject: a 12 year-old and 77 kg female, intermediate disease
    |   ...   |
    \         /

where

  1. diagnosis: 0 means normal, 1 means intermediate, and 2 means severe;
  2. age: 13 means 13 years old, and so on;
  3. gender: 0 means male, and 1 means female;
  4. weight: 75 means 75 kg, and so on.

Now comes the question: is there a way to make sense of a contrast? Or equivalently, how can I design a contrast based on what I want? I believe one can only design a contrast after making sense of it.

Best Answer

Each column does not necessarily represent a condition of observation. For example, if age is modeled as a quadratic effect, you would have one column for age and another for age^2. Note also that you need a column of ones for the intercept.

To your point: you need to specify the quantity or quantities of interest before you can write down the contrast that provides them. Once you have done that, the most general and simplest approach I've found is to think of a contrast as a difference between two predicted values, or a series of such differences. Once you specify all the conditions (covariate settings) that give the predicted values you want, you can easily get the differences in predicted values and the standard errors of those differences, which come from the difference between the two design matrices. This is the approach taken by the contrast.rms function in the R rms package. See for example http://www.inside-r.org/packages/cran/rms/docs/contrast.rms.
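To make the "difference in predicted values" idea concrete, here is a minimal sketch in Python with numpy. It is not the rms implementation. The data are synthetic, the coefficient values are made up for illustration, the model is an ordinary least-squares fit, and diagnosis is treated as a numeric 0/1/2 score as in the question (coding it as a categorical variable would change the columns). The contrast compares a severe and a normal subject who are otherwise identical: a 30 year-old, 70 kg female.

```python
import numpy as np

# Synthetic data, purely illustrative (all values below are assumptions).
rng = np.random.default_rng(0)
n = 200
diagnosis = rng.integers(0, 3, size=n)   # 0 normal, 1 intermediate, 2 severe
age = rng.uniform(10, 60, size=n)        # years
gender = rng.integers(0, 2, size=n)      # 0 male, 1 female
weight = rng.normal(70, 10, size=n)      # kg

# Design matrix: a column of ones for the intercept, then the covariates.
X = np.column_stack([np.ones(n), diagnosis, age, gender, weight])

# Hypothetical "true" coefficients used only to simulate a response.
true_beta = np.array([1.0, 2.0, 0.05, -0.5, 0.01])
y = X @ true_beta + rng.normal(0, 1, size=n)

# Ordinary least-squares fit and the estimated covariance of the coefficients.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
sigma2 = resid @ resid / (n - X.shape[1])
cov_beta = sigma2 * np.linalg.inv(X.T @ X)

# Two covariate settings that differ only in diagnosis.
x_severe = np.array([1.0, 2.0, 30.0, 1.0, 70.0])
x_normal = np.array([1.0, 0.0, 30.0, 1.0, 70.0])

# The contrast vector is the difference between the two design rows.
c = x_severe - x_normal

# Contrast estimate (difference in predicted values) and its standard error.
estimate = c @ beta_hat
std_err = np.sqrt(c @ cov_beta @ c)

print("contrast vector:", c)
print("estimate:", estimate, "standard error:", std_err)
```

Because the two settings share every covariate except diagnosis, the contrast vector is zero everywhere except the diagnosis column, so the intercept, age, gender, and weight terms cancel. Changing the two rows changes the question being asked, while the estimate and standard error formulas stay the same.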
