Ordinal Logistic Regression with Likert Scales – Best Practices

interpretation logistic regression

I'm currently having a bit of difficulty determining how to analyze this data with logistic regression.

 - Q18 = DV (satisfaction score ranging from 1-10)
 - Q10_1 = IV (customer service Likert score from 1-5)
 - Q10_2 = IV (sales Likert score from 1-5)
 - Q10_3 = IV (performance Likert score from 1-5)
 - Q10_4 = IV (price Likert score from 1-5)
 - Q10_5 = IV (proposal Likert score from 1-5)
 - Q10_6 = IV (collateral Likert score from 1-5)
 - Q10_7 = IV (reporting Likert score from 1-5)
 - Q10_8 = IV (manager Likert score from 1-5)

My guess is that I need an ordered logistic regression model, but I'm not sure what to wrap in factor() in my formula: just the DV, or everything? Which of these two calls is correct?

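library(MASS)  # provides polr()
# Option 1: only the DV is a factor; the Likert predictors enter as numeric scores
nps.olr <- polr(data = cs_aggmean, formula = factor(Q18) ~ Q10_1 + Q10_2 + Q10_3 + Q10_4 + Q10_5 + Q10_6 + Q10_7 + Q10_8)

# Option 2: the DV and every Likert predictor are factors
nps.olr <- polr(data = cs_aggmean, formula = factor(Q18) ~ factor(Q10_1) + factor(Q10_2) + factor(Q10_3) + factor(Q10_4) + factor(Q10_5) + factor(Q10_6) + factor(Q10_7) + factor(Q10_8))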

After that, I'm having trouble interpreting the results. For the first model, I believe these are the odds ratios after I exponentiate the coefficients:

exp(nps.olr$coefficients)
Q10_1 = 1.834354
Q10_2 = 1.354964
Q10_3 = 3.259454
Q10_4 = 1.269431
Q10_5 = 1.326062
Q10_6 = 1.432196
Q10_7 = 1.424732
Q10_8 = 1.010827

I appreciate any guidance here and of course just let me know if I need to supply more information! I should mention that I'm using R for software and that I'm less interested in making a predictive model and more in making recommendations on how to increase satisfaction from these variables.

Best Answer

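The exponentiated coefficients from an ordinal logistic regression model are called proportional odds ratios (the raw coefficients are on the log-odds scale). You interpret them much like odds ratios from a binary logistic regression model, except that the same odds ratio applies at every cut point of the ordered outcome.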

In your case, I assume that the data are taken from a customer survey. The exponentiated coefficient of 1.83 for Q10_1 means that a one-point increase in Q10_1 multiplies the odds of a customer giving a higher rather than a lower satisfaction rating (at any cut point of the DV) by 1.83, i.e., an 83% increase in those odds, with all other predictors held constant. The same interpretation goes for the other variables.
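To make that interpretation concrete, here is a sketch of the model in the parameterization that polr() uses. The symbols (J categories, thresholds \zeta_j, slopes \beta_k) are notation introduced here for illustration, not something taken from the question.

```latex
% Proportional odds model, polr() sign convention, for cut points j = 1, ..., J-1
\operatorname{logit} P(Y \le j \mid x) = \zeta_j - (\beta_1 x_1 + \dots + \beta_p x_p)

% Odds of landing above cut point j
\frac{P(Y > j \mid x)}{P(Y \le j \mid x)} = \exp(\beta_1 x_1 + \dots + \beta_p x_p - \zeta_j)

% Raising x_k by one point, all else fixed, multiplies these odds by the same factor for every j
\frac{\operatorname{odds}(Y > j \mid x_k + 1)}{\operatorname{odds}(Y > j \mid x_k)} = e^{\beta_k}

% Example with the reported value for Q10_1: a two-point increase
e^{2\beta_1} = 1.834354^2 \approx 3.36
```

The fact that the ratio does not depend on j is exactly the proportional odds assumption.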

It is hard to judge which variables are important from the coefficient values alone, although all eight predictors share the same 1-5 scale, so the odds ratios are at least roughly comparable. Just eyeballing them, Q10_3 appears to be the "most important" predictor, assuming it is also significant (i.e., the 95% confidence interval for its odds ratio does not include 1). That is, higher performance (as perceived by customers, I suppose) is associated with higher satisfaction. You could check whether this holds up using the step function or the varImp function in the caret package, although I'm not sure if the latter supports polr.
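As a rough sketch of how the confidence-interval check could look, the example below fits polr() to simulated stand-in data and builds Wald 95% intervals for the odds ratios. The data frame sim and its columns x1, x2 and y are hypothetical; with the real data you would use cs_aggmean and the Q10_* columns instead.

```r
library(MASS)  # provides polr()

set.seed(1)
n <- 500
# Hypothetical stand-in data: two 1-5 Likert predictors
sim <- data.frame(x1 = sample(1:5, n, replace = TRUE),
                  x2 = sample(1:5, n, replace = TRUE))
# Latent satisfaction driven mostly by x1, cut into 5 ordered categories
latent <- 0.8 * sim$x1 + 0.2 * sim$x2 + rlogis(n)
sim$y <- cut(latent, breaks = quantile(latent, 0:5 / 5),
             include.lowest = TRUE, ordered_result = TRUE)

# Hess = TRUE stores the Hessian so standard errors are available
fit <- polr(y ~ x1 + x2, data = sim, Hess = TRUE)

# Wald intervals on the log-odds scale, then exponentiate to odds ratios
est <- coef(fit)
se  <- sqrt(diag(vcov(fit)))[names(est)]
or_table <- exp(cbind(OR = est,
                      lower = est - 1.96 * se,
                      upper = est + 1.96 * se))
print(round(or_table, 2))
```

A predictor whose interval excludes 1 is significant at roughly the 5% level. Wald intervals are used here only to keep the sketch short; profile-likelihood intervals are a common alternative.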

You should also check the proportional odds assumption. See this webpage for detailed information.
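One informal way to look at the proportional odds assumption is to fit a separate binary logistic regression at each cut point of the outcome and compare the slopes. The sketch below uses simulated data (the names sim, x and y are hypothetical) and is an eyeball check, not a formal test.

```r
set.seed(2)
n <- 1000
# Hypothetical data generated so that proportional odds actually holds
sim <- data.frame(x = sample(1:5, n, replace = TRUE))
latent <- 0.7 * sim$x + rlogis(n)
sim$y <- cut(latent, breaks = c(-Inf, 1, 2, 3, Inf),
             labels = 1:4, ordered_result = TRUE)

# For each cut point j, fit a binary logit for P(y > j) and keep the slope of x
cutpoints <- 1:3
slopes <- sapply(cutpoints, function(j) {
  coef(glm(I(as.integer(y) > j) ~ x, data = sim, family = binomial))["x"]
})
names(slopes) <- paste0("y>", cutpoints)
print(round(slopes, 2))
```

If the slopes are similar across cut points, the assumption looks plausible for that predictor; large systematic differences suggest the effect depends on where the outcome is split.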

Related Solutions

Singular Information Matrix Error in LRM.fit in R – Solutions

Creating dummy variables should not be necessary. You should just use factors when modeling in R.

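library(rms)  # provides lrm()
# Convert the column inside the data frame so lrm(..., data = fsd) sees the factor
fsd$admityear <- factor(fsd$admityear)
m4 <- lrm(Outcome ~ relGPA + mcAvgGPA + Interview_Z + WorkHistory_years +
            GMAT + UGI_Gourman + admityear, data = fsd)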

If the singular condition still persists, then you have multicollinearity and need to try dropping other variables. (I would be suspicious of WorkHistory_years.) I also don't see anything ordinal about that model. Note that polr() comes from the MASS package; in the rms package (or the no longer actively supported Design package), lrm() itself fits an ordinal (proportional odds) model when the outcome has more than two levels. It would also be really helpful to see the results from str(fsd).
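A quick way to look for exact collinearity before fitting is to check the rank of the design matrix. This is a minimal sketch on made-up data (the data frame d and its columns a, b and c are hypothetical), using only base R.

```r
set.seed(3)
d <- data.frame(a = rnorm(20), b = rnorm(20))
d$c <- d$a + d$b  # exact linear combination of a and b, so the design is collinear

# Build the design matrix the model would use and inspect its rank
X <- model.matrix(~ a + b + c, data = d)
qr_X <- qr(X)
rank_deficient <- qr_X$rank < ncol(X)

# The pivoted QR moves redundant columns to the end of the pivot order
redundant <- colnames(X)[qr_X$pivot[-seq_len(qr_X$rank)]]
print(rank_deficient)
print(redundant)
```

With real data you would pass the right-hand side of your own model formula to model.matrix(). Columns reported as redundant are candidates to drop; near (rather than exact) collinearity may need other diagnostics.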

Solved – Ordinal regression: proportional odds assumption

First off, are your two independent variables entered as factors or as numerically coded responses, and is there an interaction term between the two? I ask because the test of proportional odds becomes very sensitive when cell counts are small. For this reason, I often find it justifiable to enter input variables as their ordinally coded values (1: poor, 2: fair-to-poor, etc.). Doing so allows information to be borrowed across groups. Proportionality is then assessed by asking whether the difference in the odds of a more favorable response, comparing units that differ by 1 in the predictor, is consistent with the corresponding difference in the odds of an even more favorable response (a rough and contrived interpretation of the test of proportional odds).
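If you want to see how much is lost by entering a predictor as a numeric score rather than as a factor, one option is a likelihood ratio comparison of the two codings, since the numeric model is nested in the factor model. The following is a sketch on simulated data (sim, x and y are hypothetical names), assuming the MASS package for polr().

```r
library(MASS)  # provides polr()

set.seed(5)
n <- 600
# Hypothetical data where the effect of x really is linear on the log-odds scale
sim <- data.frame(x = sample(1:5, n, replace = TRUE))
latent <- 0.6 * sim$x + rlogis(n)
sim$y <- cut(latent, breaks = c(-Inf, 1, 2, 3, Inf),
             labels = 1:4, ordered_result = TRUE)

fit_num <- polr(y ~ x, data = sim)          # one slope for the 1-5 score
fit_fac <- polr(y ~ factor(x), data = sim)  # four dummy coefficients

# Likelihood ratio statistic and its degrees of freedom
lr_stat <- as.numeric(2 * (logLik(fit_fac) - logLik(fit_num)))
lr_df   <- attr(logLik(fit_fac), "df") - attr(logLik(fit_num), "df")
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
print(c(lr_stat = lr_stat, df = lr_df, p_value = p_value))
```

A large p-value suggests the simpler numeric coding is adequate; a small one suggests the categories are not evenly spaced in their effect.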

If your numeric coding still fails to give valid proportionality, it is possible to get consistent cumulative odds ratio estimates by collapsing adjacent categories, such as the two bottom-box responses.
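Collapsing adjacent categories can be done directly on the factor levels in base R. This is a small sketch with a made-up response vector; the label "1-2" is just an illustrative name for the merged bottom two boxes.

```r
# Hypothetical 5-point ordered response
y <- factor(c(1, 2, 2, 3, 4, 5, 5, 1, 3, 4), levels = 1:5, ordered = TRUE)

# Merge the two bottom-box responses into a single category
y_collapsed <- y
levels(y_collapsed) <- list("1-2" = c("1", "2"), "3" = "3", "4" = "4", "5" = "5")

print(table(y_collapsed))
```

The collapsed variable stays an ordered factor, so it can be used as the response in the same ordinal model as before.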

Thirdly, another reasonably powered test of association between an ordinal response and two ordinal factors is a plain old linear regression model. Using robust standard errors, you get valid confidence intervals regardless of the distribution of the errors. This tends to be less powerful than categorical methods, but has fewer pitfalls due to zero cell counts.
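For illustration, here is a sketch of a linear regression with heteroskedasticity-robust standard errors computed by hand in base R. It uses the simplest sandwich form (HC0), which is my choice for brevity rather than anything specified above, and the data (sim, x1, x2, y) are simulated stand-ins.

```r
set.seed(4)
n <- 300
# Hypothetical data: two 1-5 Likert predictors and a 1-10 response
sim <- data.frame(x1 = sample(1:5, n, replace = TRUE),
                  x2 = sample(1:5, n, replace = TRUE))
raw <- 2 + 1.0 * sim$x1 + 0.3 * sim$x2 + rnorm(n, sd = 1.5)
sim$y <- pmin(10, pmax(1, round(raw)))

fit <- lm(y ~ x1 + x2, data = sim)

# HC0 sandwich estimator: (X'X)^-1 X' diag(e^2) X (X'X)^-1
X <- model.matrix(fit)
e <- residuals(fit)
bread <- solve(crossprod(X))
meat  <- crossprod(X * e)  # each row of X scaled by its residual
robust_se <- sqrt(diag(bread %*% meat %*% bread))

ci <- cbind(estimate = coef(fit),
            lower = coef(fit) - 1.96 * robust_se,
            upper = coef(fit) + 1.96 * robust_se)
print(round(ci, 3))
```

In practice a package such as sandwich provides this and the small-sample variants (HC1 to HC3); the manual version is shown only so the calculation is visible.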

Lastly, as a comment, robust standard errors allow consistent estimation of the mean model in most circumstances. I'm not sure if these are implemented in SPSS, but they are used frequently in R and SAS. As with the proportional hazards assumption in the Cox model, when this "model-based assumption check" fails, it does not mean the model results are entirely invalid; the effect estimates are simply "averaged" over their inconsistent proportionality. For instance, if a proportional odds model has an excessive number of respondents giving top-box responses, and a predictor shows a large association for the top-box response but a smaller association for the other cumulative measures, then you'll find that the cumulative odds ratio is a weighted combination of the several thresholded odds ratios, with a higher weight placed on the top-box OR.
