Log-linear transformation

data transformation, regression, skewness, stata

I have transformed my variables using the ln function in Stata to address violations of the assumptions of the linear regression model. Most issues were resolved this way (the transformation helps significantly), but the data now seem to be negatively skewed, resulting in a significant IM test, as shown below.

Cameron & Trivedi's decomposition of IM-test

---------------------------------------------------
              Source |       chi2     df      p
---------------------+-----------------------------
  Heteroskedasticity |       8.42      7    0.2968
            Skewness |      17.92      3    0.0005
            Kurtosis |       0.51      1    0.4735
---------------------+-----------------------------
               Total |      26.86     11    0.0048

I have previously tried using mboxcox to find appropriate transformations (my data contain zeros, so I had to add 1), but I did not find any appropriate transformation apart from the square and cube roots of the variables, which are not desirable because they make interpretation difficult and introduce other complications.

Should I be bothered about this skewness issue? Skewness is approx -0.7.

Best Answer

1) OLS regression assumes that the residuals are normally distributed, not that the variables are.

2) A skewness of about -0.7 is pretty minor.

3) The idea of "adding 1" to take the log has problems. Why 1? Why not .1? Or .01? Or, for that matter, 100? These would give different results, perhaps substantially so.

4) Often, transformation isn't the right solution to such problems. You could try robust regression, for instance.
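
To make point 3 concrete, here is a minimal sketch of how the choice of constant changes the fitted slope. The data are made up for illustration (they are not the asker's data), and `ols_slope` is a hypothetical helper written for this example, not a Stata or library function.

```python
import math

def ols_slope(x, y):
    """Slope of the least-squares line of y on x (simple regression)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical data: an outcome that contains zeros, so ln(y) is undefined.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 1, 3, 2, 8, 15, 40]

# Fit ln(y + c) on x for several arbitrary choices of the constant c.
slopes = {}
for c in (0.01, 0.1, 1, 100):
    log_y = [math.log(yi + c) for yi in y]
    slopes[c] = ols_slope(x, log_y)
    print(f"c = {c:>6}: slope of ln(y + c) on x = {slopes[c]:.3f}")
```

On this toy data the slope shrinks steadily as c grows, so the "effect" you report depends on a constant that nothing in the data tells you how to choose.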

Finally, why don't you tell us more about the details of your problem so that we can help more? What are your DV and IVs? What is your N? What are your hypotheses? What are you trying to find out?
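
As an illustration of point 1, the sketch below simulates a strongly skewed predictor with symmetric errors and compares the skewness of the variables with the skewness of the residuals. Everything here is assumed for the example: the simulated data, the sample size, and the helper names `skewness` and `ols_residuals`. In practice you would run the same check on the residuals from your own model.

```python
import random

def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def ols_residuals(x, y):
    """Residuals from the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

random.seed(0)  # fixed seed so the simulation is repeatable

# Assumed setup: right-skewed predictor, normally distributed errors.
x = [random.expovariate(1.0) for _ in range(2000)]
y = [1.0 + 2.0 * xi + random.gauss(0.0, 1.0) for xi in x]

resid = ols_residuals(x, y)
skew_x = skewness(x)
skew_y = skewness(y)
skew_resid = skewness(resid)

print(f"skewness of x:         {skew_x:.2f}")
print(f"skewness of y:         {skew_y:.2f}")
print(f"skewness of residuals: {skew_resid:.2f}")
```

Both variables come out clearly skewed while the residuals are close to symmetric, which is why the shape of the residuals, not of the raw variables, is the thing to look at.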
