Solved – Handling missing data in logistic regression

Tags: logistic, missing data, r, regression, regression-strategies

I'm trying to fit a logistic regression, but I can't seem to get the results I want. I have 6 columns of data (one dependent and 5 independent binary variables) and about 100 rows. The problem with my dataset is that it has a lot of missingness (NAs), which I think is why the regression fails. Is there any way to tackle this? I don't think removing the rows that contain NAs is a good idea, because I'd have very little data left.

Best Answer

Without knowing more about the data and why the values are missing, we can't give you definitive advice. The usual options are:

  1. You can remove rows that contain missing values. However, this will cause problems if the values are not missing at random. For instance, the fact that a value is missing may itself tell you something about that row (such as the customer not being engaged).
  2. You can impute values if you have a means to do so.
  3. You can remove columns of data with missing values.
  4. You can bin your data, treating "missing" as a category of its own. Example: Answer1, Answer2, MissingValue.
  5. Other.
  6. You can conclude that the sample is too small to adequately represent the population you are trying to describe, and go collect more data.
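
To make options 1 to 4 concrete, here is a minimal sketch on made-up data. It is written in plain Python purely for illustration (the same steps carry over to R); the toy rows, the column names `y`, `x1`, `x2`, and all helper functions are hypothetical and not taken from the question. The mode imputation shown for option 2 is deliberately crude: it ignores the uncertainty in the filled-in values, so treat it as a placeholder for a proper imputation method rather than a recommendation.

```python
from collections import Counter

# Hypothetical toy data: None marks a missing (NA) value.
rows = [
    {"y": 1, "x1": 1, "x2": None},
    {"y": 0, "x1": None, "x2": 0},
    {"y": 1, "x1": 0, "x2": 1},
    {"y": 0, "x1": 1, "x2": None},
    {"y": 1, "x1": 1, "x2": 1},
    {"y": 0, "x1": 0, "x2": 1},
]
predictors = ["x1", "x2"]


def complete_cases(data):
    """Option 1: keep only rows with no missing value in any column."""
    return [r for r in data if all(v is not None for v in r.values())]


def impute_mode(data, columns):
    """Option 2 (crude): fill each gap with the column's most common observed value."""
    modes = {}
    for c in columns:
        observed = [r[c] for r in data if r[c] is not None]
        modes[c] = Counter(observed).most_common(1)[0][0]
    return [
        {k: (modes[k] if k in columns and v is None else v) for k, v in r.items()}
        for r in data
    ]


def missing_counts(data, columns):
    """Option 3 helper: count missing values per column before deciding to drop one."""
    return {c: sum(r[c] is None for r in data) for c in columns}


def add_missing_category(data, columns):
    """Option 4: recode each binary predictor into three levels."""
    labels = {0: "no", 1: "yes", None: "missing"}
    return [
        {k: (labels[v] if k in columns else v) for k, v in r.items()}
        for r in data
    ]


kept = complete_cases(rows)
imputed = impute_mode(rows, predictors)
counts = missing_counts(rows, predictors)
binned = add_missing_category(rows, predictors)

print(f"rows: {len(rows)}, complete cases: {len(kept)}")
print("missing per column:", counts)
print("first row, imputed:", imputed[0])
print("first row, binned:", binned[0])
```

Comparing the number of complete cases with the total row count shows how much data option 1 would cost, and the per-column counts show whether the loss is concentrated in one predictor that could be dropped instead (option 3). With option 4 every row is kept, but each predictor then has three levels and needs to be dummy-coded before fitting.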