Solved – Transforming data with positive, negative, and zero values

data-transformation, regression

I have a multiple linear regression model with several dependent variables that have positive, negative, and zero values, and are not normally distributed. I can't do a natural log transformation because of the 0 and negative values, can't square or cube it due to 0 values, and the Box-Cox transformation works only for positive and 0 values. Is there a transformation I can do that works for all of these? I've seen log(x+minimum value) as one option, but not so much here on this forum—is this a valid transformation?

Best Answer

Yes, you can add a constant and then take logs, provided the constant is large enough to make every value strictly positive.
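As a rough illustration of the shifted-log idea, here is a minimal sketch using only the Python standard library. The function name `shifted_log` and the default `offset` of 1.0 are assumptions made for this example, not part of the original answer; the choice of offset is arbitrary and affects the result.

```python
import math

def shifted_log(values, offset=1.0):
    """Shift values so the smallest one maps to `offset` (> 0), then take natural logs.

    `offset` is an arbitrary, hypothetical choice; different offsets give
    different transformed values.
    """
    if offset <= 0:
        raise ValueError("offset must be positive so that every shifted value is > 0")
    shift = offset - min(values)  # constant added to every observation
    return [math.log(v + shift) for v in values], shift

data = [-3.0, 0.0, 2.5, 10.0]  # made-up values with negative, zero, and positive entries
logged, shift = shifted_log(data)
print(shift, logged)
```

Returning the shift alongside the transformed values makes it possible to back-transform predictions later with `exp(value) - shift`.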

There are many ways to transform data.
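Two commonly used alternatives that are defined for negative, zero, and positive values are the inverse hyperbolic sine and the signed cube root. Neither is mentioned in the original answer; the sketch below is only an illustration, and the helper name `signed_cube_root` is made up for this example.

```python
import math

def signed_cube_root(v):
    """Cube root that keeps the sign of v (hypothetical helper for this example)."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)

data = [-8.0, -1.0, 0.0, 1.0, 27.0]  # made-up values

# asinh(v) = log(v + sqrt(v^2 + 1)); roughly linear near 0, log-like for large |v|
asinh_values = [math.asinh(v) for v in data]
cbrt_values = [signed_cube_root(v) for v in data]

print(asinh_values)
print(cbrt_values)
```

Both transformations are odd functions, so they preserve sign and map zero to zero, and neither requires choosing an additive constant.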

There is nothing inherently invalid about doing this, but very often such transformations are misguided. It is not necessary for the dependent variable to be normally distributed. The assumption about normality concerns the residuals, not the response variable itself. If the residuals are not plausibly normally distributed then of course some transformation may be warranted.
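To illustrate the distinction between the response and the residuals, here is a small simulation sketch using only the standard library. The data are simulated under assumed parameters (an exponentially distributed predictor, true intercept 2, true slope 3, standard normal errors), and skewness is used as a crude stand-in for a fuller normality check such as a Q-Q plot.

```python
import random
import statistics

def skewness(xs):
    """Sample skewness (population form); near 0 for symmetric data."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000
x = [rng.expovariate(1.0) for _ in range(n)]            # skewed predictor
y = [2.0 + 3.0 * xi + rng.gauss(0.0, 1.0) for xi in x]  # normal errors

# Ordinary least squares for one predictor, written out by hand
mx, my = statistics.fmean(x), statistics.fmean(y)
slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
intercept = my - slope * mx
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

print("skewness of y:        ", skewness(y))
print("skewness of residuals:", skewness(residuals))
```

In this setup the response is clearly skewed because the predictor is, while the residuals are close to symmetric, so no transformation of the response is needed to satisfy the normality assumption.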

One major downside of such transformations is that they make sensible model interpretation much more difficult.
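A short sketch of why interpretation suffers, assuming a model of the form log(y + c) = a + b·x with a hypothetical fitted coefficient b = 0.1: a one-unit increase in x multiplies y + c (not y) by exp(b), so the implied change in y depends on both the baseline value and the arbitrary constant c. The function name and all numbers below are illustrative assumptions.

```python
import math

def change_in_y(baseline_y, beta, shift):
    """Change in y for a one-unit predictor increase under log(y + shift) = ... + beta * x."""
    return (baseline_y + shift) * (math.exp(beta) - 1.0)

beta = 0.1  # hypothetical fitted coefficient
for shift in (4.0, 10.0):       # two arbitrary choices of the added constant
    for y0 in (0.0, 10.0):      # two baseline values of the response
        print(f"shift={shift}, baseline y={y0}: change in y = {change_in_y(y0, beta, shift):.3f}")
```

The same coefficient corresponds to different effects on the original scale for each combination, which is what makes a single-sentence interpretation of b hard to give.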
