Solved – How to perform a regression on non-normal data which remain non-normal when transformed

distributions, nonparametric, regression

I've got some data (158 cases) derived from Likert-scale answers to 21 questionnaire items. I really want/need to perform a regression analysis to see which items on the questionnaire predict the response to an overall item (satisfaction). The responses are not normally distributed (according to K-S tests), and I've transformed them in every way I can think of (inverse, log, log10, sqrt, squared), but they stubbornly refuse to be normally distributed.
The residual plot looks all over the place so I believe it really isn't legitimate to do a linear regression and pretend it's behaving normally (it's also not a Poisson distribution). I think this is because the answers are very closely clustered (mean is 3.91, 95% CI 3.88 to 3.95).

So, I am thinking I either need a new way of transforming my data or need some sort of non-parametric regression but I don't know of any that I can do in SPSS.

Best Answer

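You don't need to assume normal distributions to do regression. Under the Gauss-Markov conditions (errors with zero mean, constant variance, and no correlation with one another), least squares is the BLUE (Best Linear Unbiased Estimator) whatever the shape of the error distribution. See the Gauss-Markov theorem (e.g. on Wikipedia). Normality of the errors is only needed to show that the estimator is also the maximum likelihood estimator, and to make the usual t and F tests exact in small samples. It is a common misunderstanding that OLS somehow assumes normally distributed data. It does not: neither the response nor the predictors need to be normal. It is far more general.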
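To make the point concrete, here is a minimal simulation sketch in Python with NumPy (not SPSS, and not the asker's data). Everything in it is an assumption chosen for illustration: the function name `simulate_ols_slopes`, a single 1-to-5 predictor, a true slope of 0.5, and strongly skewed (centered exponential) errors. It refits ordinary least squares on many simulated samples of 158 cases and checks that the slope estimates average out to the true value even though the errors are far from normal.

```python
import numpy as np


def simulate_ols_slopes(n_obs=158, n_sims=2000, true_intercept=1.0,
                        true_slope=0.5, seed=0):
    """Fit OLS on many simulated samples with skewed, non-normal errors.

    All parameter values are illustrative assumptions, not taken from
    the question's data set.
    """
    rng = np.random.default_rng(seed)
    slopes = np.empty(n_sims)
    for i in range(n_sims):
        # Likert-like predictor taking integer values 1..5
        x = rng.integers(1, 6, size=n_obs).astype(float)
        # Exponential errors shifted to mean zero: skewed, clearly non-normal
        errors = rng.exponential(scale=1.0, size=n_obs) - 1.0
        y = true_intercept + true_slope * x + errors
        # Design matrix with an intercept column, solved by least squares
        X = np.column_stack([np.ones(n_obs), x])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        slopes[i] = beta[1]
    return slopes


slopes = simulate_ols_slopes()
print(f"mean estimated slope: {slopes.mean():.3f} (true slope: 0.5)")
```

The errors here still have zero mean and constant variance and are independent, so the Gauss-Markov conditions hold. The sketch says nothing about what happens when those conditions fail, which is a separate issue from normality.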
