Solved – Binary logistic regression with multiple independent variables

binary-data, logistic, regression, separation

I have a group of 196 patients. I want to know if infection (the outcome, or dependent variable) depends on other variables. I am running a binary logistic regression with 8 independent variables, including age, gender, type of surgery (6 different types), type of fixation, and type of antibiotics. The categorical variables are automatically dummy-coded by SPSS.

Some of my categorical variables have low frequencies (<5).

Can I run a binary logistic regression? Are the results reliable?

I have no categories with 0 patients, only some with 1 or 2 patients. So I ran the regression, and SPSS gives me the output below. Can I say that TRTCD2 and QSORRES are statistically significant? And that the p-values of 1 or almost 1 are due to the small frequencies in these groups?

[Image: SPSS logistic regression output, not reproduced here]

Best Answer

At the heart of binary logistic regression is the estimation of the probability of an event. As detailed in RMS Notes 10.2.3, the minimum sample size needed just to estimate the intercept in a logistic model is 96, and even that gives a not-so-great margin of error of ±0.1 in the estimated (constant) probability of the event. With a single binary predictor, the minimum sample size is 96 for each of its levels. So your sample size is insufficient for the task at hand. Note that p-values do not help in this situation in any way.
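For a rough sense of where a figure like 96 can come from, here is a minimal Python sketch. It assumes the usual normal-approximation margin of error for a proportion, a 95% confidence level (z = 1.96), and the worst case of a true probability of 0.5. These assumptions and the function names are illustrative and are not quoted from the RMS notes.

```python
import math

Z_95 = 1.96  # assumption: 95% confidence level, normal approximation


def margin_of_error(n, p=0.5, z=Z_95):
    """Half-width of the approximate confidence interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)


def required_n(margin, p=0.5, z=Z_95):
    """Unrounded sample size needed to reach a given margin of error."""
    return (z / margin) ** 2 * p * (1 - p)


if __name__ == "__main__":
    # Worst case p = 0.5 and a target margin of +/- 0.1
    print(f"n for margin 0.1: {required_n(0.1):.2f}")
    # Margin when all 196 patients estimate one constant probability
    print(f"margin at n=196: {margin_of_error(196):.3f}")
```

Under these assumptions the required n comes out at roughly 96, and all 196 patients together would give a margin of about ±0.07 for an intercept-only model. Every additional predictor level splits that same sample further, which is why categories with only 1 or 2 patients cannot be estimated with any useful precision.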