Solved – Interpret t-values when not assuming normal distribution of the error term

linear model, regression

Suppose you fit a regression with a whole set of variables and you know that the residuals are not normally distributed. You estimate the regression by OLS simply to find the best linear fit, and so you drop the assumption of normally distributed error terms. After the estimation, two coefficients come out as "significant". But how can anyone interpret these coefficients? Without the normality assumption, it seems there is no way to say "these coefficients are significant", even though the hypothesis $\beta=0$ can be rejected on the basis of a high t-statistic. What should one do in this case? How would you argue?

Best Answer

If the residuals are not normal (and note that the assumption concerns the theoretical residuals, i.e. the error terms, rather than the observed residuals), but are not overly skewed and do not contain outliers, then the Central Limit Theorem applies and the inference on the slopes (t-tests, confidence intervals) will be approximately correct. The quality of the approximation depends on the sample size and on the degree and type of non-normality in the residuals.
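One way to get a feel for the quality of the approximation is a small simulation. The sketch below is only an illustration under assumptions that are not in the question: a single predictor on a fixed grid, a sample size of 50, a true slope of 0.5, and shifted exponential errors as the skewed case. It counts how often a nominal 5% t-test rejects a true null about the slope, once with normal errors and once with skewed errors. The critical value for 48 degrees of freedom is hard-coded so that only NumPy is needed.

```python
import numpy as np

# Two-sided 5% critical value of Student's t with 48 degrees of freedom
# (hard-coded to avoid a SciPy dependency; matches n = 50 below).
T_CRIT_48 = 2.0106


def slope_t_stats(n, n_sim, draw_errors, rng, true_slope=0.5):
    """Simulate t-statistics for H0: slope == true_slope (H0 is true here)."""
    x = np.linspace(0.0, 10.0, n)      # fixed design, an assumption of this sketch
    xc = x - x.mean()
    sxx = xc @ xc
    t = np.empty(n_sim)
    for i in range(n_sim):
        y = 1.0 + true_slope * x + draw_errors(rng, n)
        b = xc @ y / sxx               # OLS slope
        a = y.mean() - b * x.mean()    # OLS intercept
        resid = y - (a + b * x)
        s2 = resid @ resid / (n - 2)   # residual variance estimate
        t[i] = (b - true_slope) / np.sqrt(s2 / sxx)
    return t


rng = np.random.default_rng(0)
n, n_sim = 50, 5000

# Normal errors versus centred exponential errors (skewed, mean zero).
t_normal = slope_t_stats(n, n_sim, lambda r, m: r.normal(0.0, 1.0, m), rng)
t_skewed = slope_t_stats(n, n_sim, lambda r, m: r.exponential(1.0, m) - 1.0, rng)

rej_normal = float(np.mean(np.abs(t_normal) > T_CRIT_48))
rej_skewed = float(np.mean(np.abs(t_skewed) > T_CRIT_48))
print(f"rejection rate, normal errors: {rej_normal:.3f}")
print(f"rejection rate, skewed errors: {rej_skewed:.3f}")
```

If both rejection rates land close to 0.05, the t-test is behaving roughly as advertised for this design. Shrinking `n` or using heavier-tailed errors is a quick way to see where the approximation starts to degrade.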

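The CLT works fine for the inference on the slopes, but it does not apply to prediction intervals for new data: a prediction interval depends on the distribution of the single new error term, and no averaging takes place to make that term approximately normal.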
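The difference can be seen in a simulation of the usual normal-theory prediction interval. The sketch below uses assumptions of its own, not taken from the question: one predictor on a fixed grid, 50 observations, centred exponential errors, and a new observation at `x_new = 5.0`. It records how often the new observation falls below or above the nominal 95% interval.

```python
import numpy as np

# Two-sided 5% critical value of Student's t with 48 degrees of freedom
# (hard-coded to avoid a SciPy dependency; matches n = 50 below).
T_CRIT_48 = 2.0106


def prediction_miss_rates(n_sim=5000, n=50, x_new=5.0, seed=1):
    """Share of new observations below / above a nominal 95% prediction interval."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 10.0, n)          # fixed design, an assumption of this sketch
    xc = x - x.mean()
    sxx = xc @ xc
    n_below = 0
    n_above = 0
    for _ in range(n_sim):
        # Skewed, mean-zero errors: exponential(1) shifted by -1.
        y = 1.0 + 0.5 * x + rng.exponential(1.0, n) - 1.0
        b = xc @ y / sxx
        a = y.mean() - b * x.mean()
        resid = y - (a + b * x)
        s = np.sqrt(resid @ resid / (n - 2))
        # Standard normal-theory half-width for a single new observation.
        half = T_CRIT_48 * s * np.sqrt(1.0 + 1.0 / n + (x_new - x.mean()) ** 2 / sxx)
        pred = a + b * x_new
        y_new = 1.0 + 0.5 * x_new + rng.exponential(1.0) - 1.0
        if y_new < pred - half:
            n_below += 1
        elif y_new > pred + half:
            n_above += 1
    return n_below / n_sim, n_above / n_sim


below, above = prediction_miss_rates()
print(f"missed below: {below:.3f}, missed above: {above:.3f} (nominal 0.025 each)")
```

With normal errors the misses would be split roughly evenly between the two tails. With skewed errors they tend to pile up on one side, and a larger sample does not repair this, because the shape of the new error term does not change with `n`.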

If you're not happy with the CLT argument (small sample sizes, skewness, just not sure, want a second opinion, want to convince a skeptic, etc.), then you can use the bootstrap or permutation methods, which do not depend on the normality assumption.
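As a rough illustration of both ideas, here is a minimal sketch for a regression with one predictor. The data are synthetic, and the helper names (`ols_slope`, `pairs_bootstrap_ci`, `permutation_pvalue`) are made up for this example. It uses a pairs (case-resampling) bootstrap with a percentile interval, and a permutation test that shuffles the response, which assumes the observations are exchangeable under the null of no relationship. With several predictors, a permutation test for a single coefficient needs more care than this.

```python
import numpy as np


def ols_slope(x, y):
    """Slope of the least-squares line y = a + b * x."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def pairs_bootstrap_ci(x, y, n_boot=2000, level=0.95, seed=0):
    """Percentile confidence interval for the slope, resampling (x, y) pairs."""
    rng = np.random.default_rng(seed)
    n = len(x)
    slopes = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # draw n cases with replacement
        slopes[b] = ols_slope(x[idx], y[idx])
    alpha = 1.0 - level
    lo, hi = np.quantile(slopes, [alpha / 2.0, 1.0 - alpha / 2.0])
    return float(lo), float(hi)


def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for H0: no relationship between x and y."""
    rng = np.random.default_rng(seed)
    observed = abs(ols_slope(x, y))
    count = 0
    for _ in range(n_perm):
        # Shuffling y breaks any link to x while keeping both marginals.
        if abs(ols_slope(x, rng.permutation(y))) >= observed:
            count += 1
    # The +1 terms count the observed arrangement as one of the permutations.
    return (count + 1) / (n_perm + 1)


# Synthetic example data with skewed, mean-zero errors (illustrative only).
rng = np.random.default_rng(42)
n = 60
x = rng.uniform(0.0, 10.0, n)
y = 1.0 + 0.5 * x + rng.exponential(1.0, n) - 1.0

slope = ols_slope(x, y)
ci = pairs_bootstrap_ci(x, y)
p = permutation_pvalue(x, y)
print(f"slope = {slope:.3f}, 95% bootstrap CI = ({ci[0]:.3f}, {ci[1]:.3f}), permutation p = {p:.4f}")
```

If the bootstrap interval excludes zero and the permutation p-value is small, the coefficient can be called significant without appealing to normal errors. If these results disagree noticeably with the classical t-test, that is a sign the CLT approximation should not be trusted for the data at hand.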
