Solved – Running logistic regression on survey data

r regression

I have survey data with one dependent variable ("Overall experience") and several independent variables (quality of food, creativity of menu, etc.). Both the dependent and the independent variables are scored on the following scale:

Excellent 5
Very Good 4
Good 3
Fair 2
Poor 1

As I understand it, and from what I have read, simple linear regression is not appropriate for this kind of response, so I have opted for logistic regression.
Can anyone point me in the right direction for fitting a logistic regression in R?

Best Answer

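Logistic regression would be a fine start. Binary logistic regression needs a two-level response, so you would first collapse the five-point scale into two categories (for example, Very Good or Excellent versus everything else). The basic function for binary logistic regression in R is glm with family = binomial; see ?glm for how to use it.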
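A minimal sketch of the binary approach follows. The data are simulated, and the column names (food, menu, overall, satisfied), the effect sizes, and the cut-off of 4 for "satisfied" are all assumptions made for illustration, not part of the original survey.

```r
# Simulated survey data; column names and effect sizes are hypothetical.
set.seed(1)
n <- 300
survey <- data.frame(
  food = sample(1:5, n, replace = TRUE),  # "Quality of food" score
  menu = sample(1:5, n, replace = TRUE)   # "Creativity of menu" score
)

# A latent satisfaction score drives the observed 1-5 overall rating.
latent <- 0.8 * survey$food + 0.5 * survey$menu + rlogis(n)
survey$overall <- cut(latent, breaks = quantile(latent, 0:5 / 5),
                      labels = FALSE, include.lowest = TRUE)

# Collapse the 5-point response to binary: 1 if Very Good or Excellent.
survey$satisfied <- as.integer(survey$overall >= 4)

# Binary logistic regression.
fit <- glm(satisfied ~ food + menu, data = survey, family = binomial)
summary(fit)

# Predicted probability of a top-two rating for a hypothetical respondent.
p_new <- predict(fit, newdata = data.frame(food = 5, menu = 4),
                 type = "response")
p_new
```

Where to place the cut-off is a judgment call that depends on what "a good experience" means for the survey; different thresholds answer different questions.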

If performance is unacceptably poor, you can fit the logistic regression on functions of the original features to get a more flexible decision boundary (e.g. glm(formula = y ~ x + I(x^2), ...)). If you take this route, evaluate performance on a holdout set or via cross-validation to guard against overfitting.
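As a rough illustration of that workflow, the sketch below compares a linear and a quadratic specification on a holdout set. The data are simulated with an assumed non-linear relationship, and the 70/30 split and the log_loss helper are hypothetical choices, not something prescribed by glm.

```r
# Simulated data with a hypothetical non-linear effect of x on y.
set.seed(2)
n <- 400
x <- runif(n, 1, 5)
y <- rbinom(n, 1, plogis(-4 + 3 * x - 0.5 * x^2))
d <- data.frame(x = x, y = y)

# 70/30 train/holdout split.
train_idx <- sample(n, size = 0.7 * n)
train <- d[train_idx, ]
test  <- d[-train_idx, ]

# Linear and quadratic specifications, both fitted on the training set only.
fit_lin  <- glm(y ~ x,          data = train, family = binomial)
fit_quad <- glm(y ~ x + I(x^2), data = train, family = binomial)

# Mean log loss on new data (lower is better); hypothetical helper.
log_loss <- function(fit, newdata) {
  p <- predict(fit, newdata = newdata, type = "response")
  -mean(newdata$y * log(p) + (1 - newdata$y) * log(1 - p))
}

# Compare the two models on the holdout set.
c(linear    = log_loss(fit_lin,  test),
  quadratic = log_loss(fit_quad, test))
```

If the more flexible model does not improve the holdout score, the extra terms are probably fitting noise.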

If you are not comfortable collapsing the response to binary, you could try ordinal logistic regression, which keeps all five ordered categories. I believe the ordinal package in R would be helpful.
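A minimal sketch of an ordinal fit, assuming the ordinal package is installed and using its clm function (cumulative link model, logit link by default). The data are simulated and the column names and effect sizes are hypothetical.

```r
library(ordinal)  # install.packages("ordinal") if it is not installed

# Simulated survey data; column names and effect sizes are hypothetical.
set.seed(3)
n <- 300
survey <- data.frame(
  food = sample(1:5, n, replace = TRUE),  # "Quality of food" score
  menu = sample(1:5, n, replace = TRUE)   # "Creativity of menu" score
)
latent <- 0.8 * survey$food + 0.5 * survey$menu + rlogis(n)

# The response must be an ordered factor, lowest category first.
rating_levels <- c("Poor", "Fair", "Good", "Very Good", "Excellent")
survey$overall <- cut(latent, breaks = quantile(latent, 0:5 / 5),
                      labels = rating_levels, include.lowest = TRUE,
                      ordered_result = TRUE)

# Proportional-odds (ordinal logistic) model.
fit <- clm(overall ~ food + menu, data = survey)
summary(fit)
```

The summary reports one coefficient per predictor plus four threshold estimates separating the five categories. MASS::polr fits the same kind of model if you would rather avoid an extra package.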

As a final note, it is not impossible to use OLS linear regression to predict ordinal variables, although methods such as logistic regression will probably work far better.
