Should I treat these ordinal IVs as covariates or factors in a regression?

regression, spss

I have a survey problem where the dependent variable is ordinal, measured on a Likert-type scale (1 to 5, from most satisfied to most dissatisfied), and there are two sets of independent variables. One set has 7 IVs on a similar 1-5 scale and the other has 5 IVs on a 1-6 scale; all of them are ordinal. See “Which is applicable, ordinal or multinomial regression model?”

What is the best way to analyze this kind of problem? Do I need to treat the IVs as factors or covariates?

Best Answer

The distinction between a “factor” and a “covariate” is related to the nature of the predictor/independent variable.

A factor is a nominal variable that can take a number of values or levels, and each level is associated with a different mean response on the dependent variable. Even if the factor is coded using numbers, these numbers have no particular meaning. For example, it's perfectly possible for group ‘2’ to have a lower mean value on the dependent variable than both groups ‘1’ and ‘3’. Behind the scenes, in a regular ANOVA/linear model, the groups can be represented by a set of “dummy variables” with a different coefficient for each group.
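To make the dummy-variable idea concrete, here is a minimal sketch in Python with NumPy. The data are made up for illustration (three groups coded 1, 2, 3, with group 2 deliberately given the lowest mean), and level 1 is arbitrarily chosen as the reference level; none of this comes from the question's data.

```python
import numpy as np

# Hypothetical data: a factor with three levels coded 1, 2, 3.
# The codes are labels only; group 2 has the lowest mean on purpose.
group = np.array([1, 1, 2, 2, 3, 3])
y = np.array([4.0, 6.0, 1.0, 3.0, 7.0, 9.0])

levels = np.unique(group)

# Design matrix: an intercept plus one 0/1 dummy column for each
# non-reference level (level 1 is the reference here, by assumption).
dummies = [(group == lv).astype(float) for lv in levels[1:]]
X = np.column_stack([np.ones(len(y))] + dummies)

# Ordinary least squares fit
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# coef[0] is the mean of group 1; coef[1] and coef[2] are the
# differences of groups 2 and 3 from group 1.
group_means = {int(lv): float(y[group == lv].mean()) for lv in levels}
print(coef)         # approximately [ 5. -3.  3.]
print(group_means)  # {1: 5.0, 2: 2.0, 3: 8.0}
```

Each group gets its own coefficient, so nothing forces the fitted means to follow the order of the numeric codes.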

Ideally, a covariate should be a continuous, interval-level measure, but in any case its values have to be meaningful because the relationship between covariates and the outcome/dependent variable is quantitative. A simple linear model will have a single coefficient to capture this relationship. Other models (models with interactions, polynomial regression, splines, etc.) add some complications, but it should still be meaningful to think about the magnitude of the covariate.
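As a rough illustration of what “a single coefficient” means in practice, the sketch below fits the same 1-5 predictor twice: once as a covariate (one slope) and once as a factor (four dummies). The numbers are invented so that the level means happen to lie on a straight line; real survey data need not behave this way.

```python
import numpy as np

# Hypothetical data: a predictor on a 1-5 scale, two respondents per value.
x = np.array([1, 2, 3, 4, 5, 1, 2, 3, 4, 5], dtype=float)
y = np.array([1.4, 2.1, 2.4, 3.1, 3.4, 1.6, 1.9, 2.6, 2.9, 3.6])

# Treated as a covariate: intercept plus ONE slope for the whole scale.
X_cov = np.column_stack([np.ones_like(x), x])
coef_cov, *_ = np.linalg.lstsq(X_cov, y, rcond=None)

# Treated as a factor: intercept plus one dummy for each of levels 2..5.
X_fac = np.column_stack(
    [np.ones_like(x)] + [(x == lv).astype(float) for lv in (2, 3, 4, 5)]
)
coef_fac, *_ = np.linalg.lstsq(X_fac, y, rcond=None)

print(X_cov.shape[1], coef_cov)  # 2 parameters: intercept and slope
print(X_fac.shape[1], coef_fac)  # 5 parameters: intercept and 4 level effects
```

The covariate model spends one parameter on the predictor and assumes each one-point step has the same effect; the factor model spends four and makes no such assumption.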

The notion that “factors” are essential and “covariates” can be left out stems from common study designs in psychology and some other fields. Typically, the main variable of interest will be manipulated by experimentally setting it to a handful of levels whereas demographic variables (age, personality, etc.) are simply measured on a more-or-less continuous scale. Consequently, the “factor” must definitely figure in the analysis but the “covariates” could possibly be ignored. The experimental design can also ensure that different factors are not correlated and the groups are balanced, which is not necessarily the case if you are merely observing/measuring variables.

Mathematically, however, it does not make any difference whether you look at it as an ANCOVA, in which the continuous variables are called “covariates”, or as a multiple linear regression, in which continuous variables are simply predictors (see “When should one use multiple regression with dummy coding vs. ANCOVA?”).
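The equivalence can be checked numerically. The sketch below uses simulated data (two groups and one continuous variable; all values and effect sizes are assumptions chosen for illustration) and compares a dummy-coded regression with the textbook ANCOVA quantities, namely the pooled within-group slope and the covariate-adjusted difference in group means.

```python
import numpy as np

# Simulated, purely illustrative data: two groups and one continuous variable.
rng = np.random.default_rng(0)
n = 20
group = np.repeat([0, 1], n)
x = rng.normal(size=2 * n)
y = 1.0 + 2.0 * group + 0.5 * x + rng.normal(scale=0.3, size=2 * n)

# Regression view: intercept, group dummy, continuous predictor.
X = np.column_stack([np.ones(2 * n), group, x])
b0, b_group, b_x = np.linalg.lstsq(X, y, rcond=None)[0]

# ANCOVA view, step 1: pooled within-group slope of y on x
# (center x and y on their own group means, then regress).
x_means = np.array([x[group == g].mean() for g in (0, 1)])
y_means = np.array([y[group == g].mean() for g in (0, 1)])
xc = x - x_means[group]
yc = y - y_means[group]
pooled_slope = (xc * yc).sum() / (xc ** 2).sum()

# ANCOVA view, step 2: difference in group means adjusted for the covariate.
adj_diff = (y_means[1] - y_means[0]) - pooled_slope * (x_means[1] - x_means[0])

print(np.isclose(b_x, pooled_slope))  # slope agrees
print(np.isclose(b_group, adj_diff))  # adjusted group effect agrees
```

Both checks come out true because the two descriptions correspond to the same design matrix; only the vocabulary differs.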

You can just as well design a study where the main manipulation is quantitative (imagine something like manipulating the temperature of a room) but ancillary measures are binary (say, gender). You would probably not call the temperature a “covariate”, but it certainly should not be used as a “factor” in an ANOVA or left out of the model. Whether a variable is “essential” or was experimentally manipulated will change the interpretation but not necessarily the way it figures in the model.

In your case, whether it is reasonable to treat multi-item Likert scales as interval measures could be debated and will also depend on the specifics of the data, but doing so is certainly pretty standard. They are definitely not nominal.