Solved – Whether to transform non-normal independent variables in logistic regression

data transformation, logistic, normal distribution

I recently received the following email, which I paraphrase below:

I want to do binomial logistic regression with the data and I have non-normally distributed IVs. I tried doing square root transformation on the non-normal IVs. This was successful in normalising the distribution; however, when I did the logistic regression on the data I got ridiculous results, i.e. OR: 14, CI 6 – 180. So I transformed the variables back by squaring them, and then ran the analysis again.

Questions

  • Why do the odds ratios look ridiculous after applying a square root transformation?
  • Should one apply the square root transformation to non-normal predictors when doing logistic regression?

Best Answer

Why odds ratios look strange on transformed variables

Transformations change the metric of the variable. An odds ratio is the predicted multiplicative change in the odds for a one-unit increase in the IV, holding all other IVs constant. The meaning of one unit will be very different after a square root transformation.

For example, if you had a 1 to 100 raw scale, then after transformation the scale runs only from 1 to 10, and a one-unit step from 4 to 5 on the square root scale corresponds to a step from 16 to 25 on the raw scale. One unit now covers a much larger share of the variable's range, so it's not surprising that your odds ratios became a lot larger after square root transformation.
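To make the change of metric concrete, here is a small sketch in Python. The odds ratio of 14 is borrowed from the email purely as an illustration; the calculation assumes a model that is linear in the square root of the IV, and the variable names are made up for this example.

```python
import math

# One unit on the square-root scale covers a different raw distance
# depending on where you are on the raw scale.
for root in (1, 4, 9):
    raw_low, raw_high = root ** 2, (root + 1) ** 2
    print(f"sqrt {root} -> {root + 1} spans raw {raw_low} -> {raw_high} "
          f"({raw_high - raw_low} raw units)")

# Illustrative assumption: an odds ratio of 14 per square-root unit.
or_per_sqrt_unit = 14.0
beta_sqrt = math.log(or_per_sqrt_unit)  # log-odds change per sqrt unit

# Odds ratios implied by that same model for two raw-scale steps:
# raw 16 -> 25 is exactly one sqrt unit; raw 16 -> 17 is one raw unit.
or_16_to_25 = math.exp(beta_sqrt * (math.sqrt(25) - math.sqrt(16)))
or_16_to_17 = math.exp(beta_sqrt * (math.sqrt(17) - math.sqrt(16)))

print(f"OR for raw 16 -> 25: {or_16_to_25:.2f}")
print(f"OR for raw 16 -> 17: {or_16_to_17:.2f}")
```

Under these assumptions, the same fitted model that reports a large odds ratio per square root unit implies a far more modest odds ratio for a single raw unit near 16. The effect is the same; only the size of "one unit" differs.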

If you want to examine the effect of the transformation in a scaling-neutral way, you could standardise your IVs (i.e., make them z-scores). You could then compare the odds ratio for a one standard deviation increase in the raw variable with the odds ratio for a one standard deviation increase in the transformed variable. This will allow you to isolate the effect of changing the relative distance between values.
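The standardised comparison could be sketched roughly as follows. The data are simulated and purely illustrative, and the single-predictor Newton-Raphson fit is hand-rolled only to keep the example dependency-free; in practice you would use your usual statistics package. All names (`fit_logistic`, `zscore`, the simulation coefficients) are assumptions of this sketch.

```python
import math
import random
import statistics


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def fit_logistic(x, y, iterations=25):
    """Fit logit(p) = a + b*x by Newton-Raphson and return (a, b)."""
    a = b = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(a + b * xi)
            w = p * (1.0 - p)
            g0 += yi - p          # score for the intercept
            g1 += (yi - p) * xi   # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def zscore(values):
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


# Simulated data: raw IV on a 1-100 scale, outcome generated from a
# model that is linear in sqrt(raw). These numbers are made up.
random.seed(0)
raw = [random.uniform(1, 100) for _ in range(2000)]
root = [math.sqrt(v) for v in raw]
y = [1 if random.random() < sigmoid(-3.0 + 0.6 * r) else 0 for r in root]

_, b_raw_z = fit_logistic(zscore(raw), y)
_, b_root_z = fit_logistic(zscore(root), y)

or_raw_z = math.exp(b_raw_z)    # odds ratio per 1 SD of the raw IV
or_root_z = math.exp(b_root_z)  # odds ratio per 1 SD of sqrt(IV)
print(f"OR per SD, raw:  {or_raw_z:.2f}")
print(f"OR per SD, sqrt: {or_root_z:.2f}")
```

Because both predictors are expressed in standard deviation units, any remaining difference between the two odds ratios reflects the change in shape introduced by the transformation rather than the change in units.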

Whether to transform non-normal predictors in logistic regression

Normality of predictors is not an assumption of logistic regression, or linear regression for that matter. See @whuber's answer here for more details.

That said, you may find one scaling of your IVs more predictive or interpretable. I'd use criteria like that to decide whether you want to transform a predictor variable.
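One possible way to judge which scaling is more predictive is to fit the model both ways and compare the fit. The sketch below uses simulated data and a hand-rolled single-predictor fit (both assumptions of this example, not part of the original question) and compares log-likelihoods; since both models have the same number of parameters, this is equivalent to comparing AIC.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def fit_logistic(x, y, iterations=25):
    """Fit logit(p) = a + b*x by Newton-Raphson and return (a, b)."""
    a = b = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(a + b * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def log_likelihood(x, y, a, b):
    """Bernoulli log-likelihood of the fitted model."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = sigmoid(a + b * xi)
        total += math.log(p) if yi == 1 else math.log(1.0 - p)
    return total


# Simulated, illustrative data: the outcome depends on sqrt(raw).
random.seed(0)
raw = [random.uniform(1, 100) for _ in range(2000)]
root = [math.sqrt(v) for v in raw]
y = [1 if random.random() < sigmoid(-3.0 + 0.6 * r) else 0 for r in root]

# The log-likelihood is unchanged by linear rescaling (e.g. z-scoring),
# so the predictors are used on their original scales here.
ll_raw = log_likelihood(raw, y, *fit_logistic(raw, y))
ll_root = log_likelihood(root, y, *fit_logistic(root, y))

print(f"log-likelihood, raw IV:  {ll_raw:.1f}")
print(f"log-likelihood, sqrt IV: {ll_root:.1f}")
```

A higher (less negative) log-likelihood suggests that scaling describes the data better, although with real data you would also want to weigh interpretability and check the result on held-out data.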
