Solved – How to handle ordinal categorical variable as independent variable

logistic, ordinal-data, predictor, regression

I am using a logit model. My dependent variable is binary. However, I have an independent variable that is categorical and contains the responses: 1. very good, 2. good, 3. average, 4. poor and 5. very poor. So it is ordinal ("quantitative categorical"). I am not sure how to handle this in the model. I am using gretl.

[Note from @ttnphns: Although the question says the model is logit (because the dependent variable is categorical), the crucial issue, ordinal independent variables, is basically the same whether the dependent variable is categorical or quantitative. Therefore the question is just as relevant to, say, linear regression as it is to logistic regression or other logit models.]

Best Answer

The problem with an ordinal independent variable is that, since by definition the true metric intervals between its levels are not known, no appropriate type of relationship, apart from the umbrella "monotonic", can be assumed a priori. We have to do something about it, for example "screen or combine variants" or "prefer what maximizes something".

If you insist on treating your Likert rating IV as ordinal (rather than interval or nominal), I've got a few alternatives for you.

  1. Use polynomial contrasts. That is, each such predictor enters the model not only linearly but also quadratically and cubically. This way, not only a linear but also a more general monotonic effect can be captured (the linear effect corresponds to the predictor treated as scale/interval, and the other two effects treat it as having unequal intervals). Additionally, dummies of each predictor could be entered as well, which will test for the nominal/factorial effect. In the end, you will know how much your predictor acts as a factor, how much as a linear covariate, and how much as a nonlinear covariate. This option is easy to do in almost any regression (linear, logistic, other generalized linear models). It will consume degrees of freedom, so the sample size should be large enough.
  2. Use optimal scaling regression. This approach monotonically transforms an ordinal predictor into an interval one so as to maximize the linear effect on the predictand. CATREG (categorical regression) is an implementation of this idea in SPSS. One problem in your specific case is that you want to do logistic, not linear, regression, but CATREG is not based on a logit model. I think this obstacle is relatively minor since your predictand has only 2 categories (binary): you might still run CATREG for the optimal scaling, then do the final logistic regression with the obtained transformed scale predictors.
  3. Note also that in the simple case of one scale or ordinal DV and one ordinal IV, the Jonckheere-Terpstra test might be a reasonable analysis instead of regression.
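To make option 1 a bit more concrete, here is a minimal sketch in plain Python of how linear, quadratic and cubic contrast columns could be built for a 5-level rating. It assumes the levels are coded 1..5 and treated as equally spaced for the purpose of constructing the contrasts. The function name `poly_contrasts` and the example ratings are hypothetical, made up for illustration, and the fitting step itself (logit in gretl or elsewhere) is not shown.

```python
# Sketch: orthogonal polynomial contrasts for an ordinal predictor.
# ASSUMPTION: levels are coded 1..n_levels and taken as equally spaced
# when building the contrasts. `poly_contrasts` is a hypothetical helper.

def poly_contrasts(n_levels, degree):
    """Return `degree` orthonormal contrast columns (linear, quadratic, ...)."""
    xs = [float(i) for i in range(1, n_levels + 1)]
    basis = [[1.0] * n_levels]  # constant column, used only for orthogonalisation
    for d in range(1, degree + 1):
        col = [x ** d for x in xs]
        # Gram-Schmidt step: remove the part explained by lower-order columns
        for b in basis:
            coef = sum(c * bi for c, bi in zip(col, b)) / sum(bi * bi for bi in b)
            col = [c - coef * bi for c, bi in zip(col, b)]
        norm = sum(c * c for c in col) ** 0.5
        basis.append([c / norm for c in col])
    return basis[1:]  # drop the constant column


contrasts = poly_contrasts(5, 3)
linear, quadratic, cubic = contrasts

# Hypothetical ratings (1 = very good ... 5 = very poor), one per respondent
ratings = [1, 3, 5, 2]

# Each respondent gets three regressors instead of the raw 1..5 code
design_rows = [[col[r - 1] for col in contrasts] for r in ratings]

for r, row in zip(ratings, design_rows):
    print(r, [round(v, 3) for v in row])
```

The three resulting columns would then enter the logistic regression as ordinary covariates. A clearly nonzero quadratic or cubic coefficient would suggest that the rating does not behave like an equal-interval scale.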

There could be other suggestions, too. The three above are what came to mind immediately on reading your question.

Let me also recommend that you visit these threads: Associating between nominal and scale or ordinal; Associating between ordinal and scale. They could be helpful even though they are not specifically about regression.

These threads, however, are about regression, particularly logistic, and are worth a look: one, two, three, four, five.