Solved – Categorical vs Continuous Variable

categorical data, correlation, logistic, regression

I have two variables, one continuous and one categorical, and I currently use both as predictors in a logistic regression model.

Their relationship is shown in the following plot, with the x axis showing the categories of the categorical variable and the y axis showing the values of the continuous predictor.

[Plot: continuous predictor (y axis) by category of the categorical variable (x axis)]

Being quite new to statistics, I wonder whether one can make general statements about when to use only the categorical variable, only the continuous variable, or both, even when the two are clearly correlated as in the plot above.

Best Answer

It became clear in the comments that the categorical variable is just a binning of the continuous one, so the answer is clear: use only the continuous variable as a predictor. The binned version is a function of the continuous variable and carries no information beyond it.
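
To illustrate what binning throws away, here is a minimal sketch. The cut points, labels and values are hypothetical and are not taken from the question's data.

```python
import bisect

# Hypothetical bin edges and category labels (assumptions for illustration).
cut_points = [10.0, 20.0, 30.0]
labels = ["A", "B", "C", "D"]


def to_bin(value):
    """Map a continuous value to the label of the bin it falls into."""
    return labels[bisect.bisect_right(cut_points, value)]


values = [10.5, 19.9, 20.1]
bins = [to_bin(v) for v in values]
print(list(zip(values, bins)))
```

Here 10.5 and 19.9 are far apart but share a category, while 19.9 and 20.1 are nearly equal but land in different categories. The category can always be recomputed from the continuous value, but not the other way round.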

But beware: assuming that a continuous variable acts linearly on the log-odds in logistic regression is a strong assumption, so model it with a regression spline; see "Logistic Regression with regression splines in R".
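
As a rough sketch of why the linearity assumption matters, the following compares a linear-in-x logistic model with a spline version on simulated data. Everything here is an assumption made for illustration: the data are simulated with a U-shaped log-odds, the knots are picked by hand, and the basis is a simple piecewise-linear regression spline, not the restricted cubic splines that are more common in practice. It is written in plain Python with a hand-rolled Newton-Raphson fit so that it is self-contained; it is not the method from the linked R answer.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def spline_basis(x, knots):
    """Linear regression-spline row: intercept, x, and one hinge max(x - k, 0) per knot."""
    return [1.0, x] + [max(x - k, 0.0) for k in knots]


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][j] * z[j] for j in range(r + 1, n))) / M[r][r]
    return z


def fit_logistic(X, y, n_iter=25):
    """Maximum-likelihood logistic fit by Newton-Raphson; returns the coefficients."""
    d = len(X[0])
    beta = [0.0] * d
    for _ in range(n_iter):
        grad = [0.0] * d
        # Tiny diagonal term keeps the linear solve numerically stable.
        hess = [[1e-8 if i == j else 0.0 for j in range(d)] for i in range(d)]
        for row, yi in zip(X, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, row)))
            w = p * (1.0 - p)
            for i in range(d):
                grad[i] += (yi - p) * row[i]
                for j in range(d):
                    hess[i][j] += w * row[i] * row[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-8:
            break
    return beta


def log_lik(X, y, beta):
    """Bernoulli log-likelihood of the fitted model."""
    total = 0.0
    for row, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, row))
        total += yi * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return total


# Simulated data (assumption): the true log-odds are U-shaped in x.
rng = random.Random(0)
xs = [rng.uniform(-2.0, 2.0) for _ in range(1000)]
ys = [1 if rng.random() < sigmoid(x * x - 1.5) else 0 for x in xs]

knots = [-1.0, 0.0, 1.0]  # hand-picked knots (assumption)
X_lin = [[1.0, x] for x in xs]
X_spl = [spline_basis(x, knots) for x in xs]

beta_lin = fit_logistic(X_lin, ys)
beta_spl = fit_logistic(X_spl, ys)
ll_lin = log_lik(X_lin, ys, beta_lin)
ll_spl = log_lik(X_spl, ys, beta_spl)
print("log-likelihood, linear:", round(ll_lin, 1))
print("log-likelihood, spline:", round(ll_spl, 1))
```

Because the spline basis contains the linear model as a special case, its maximised log-likelihood cannot be lower. On data like this, where the effect is far from linear, the gap is large. In R, the same idea is usually expressed by wrapping the predictor in a spline term such as `splines::ns` inside `glm(..., family = binomial)`.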