Checking for normality with robust errors

normality-assumption, robust-standard-errors, stata

I am running a linear regression with a single independent variable (IV) and have selected the robust standard error option, vce(robust), in Stata because of heteroscedasticity (and because it is sometimes recommended to do so anyway). However, try as I might, I cannot find any advice on whether I should still test for normality after selecting this option, or whether using robust errors removes the need to do so. Any help on whether normality should be tested when this option is used would be greatly appreciated.

BASED ON ANSWERS: My main focus is on understanding the regression model, so I will be looking at the slope coefficient, its 95% CI, and its statistical significance (I have a continuous IV). In another linear regression I had also hoped to make predictions (with a CI and, hopefully, a PI). I was OK with checking the assumptions of a regression analysis until I reached the option to use robust standard errors. From the answers received, am I correct in saying that asymptotic normality is needed, but is not easily tested for (and is rarely tested in practice)? If so, I could run the regression with robust errors and not test for normality. I assume that the other checks (e.g., for unusual points) still apply. I checked Stata, and it does still produce predictions when robust errors are used. Is it correct to use these predictions?

Best Answer

To make things simple, suppose you have 3 observations. Robust standard errors allow the variance-covariance matrix of the errors to look like this: $$\Sigma = \begin{bmatrix} \sigma_{1}^2 & 0 & 0\\ 0 & \sigma_{2}^2 & 0 \\ 0 & 0 & \sigma_{3}^2 \end{bmatrix} $$

The diagonal terms are the error variances for each of the 3 observations, and they are allowed to differ. The off-diagonal covariance terms are all zero because we still assume that the errors are uncorrelated across observations. If you want to relax that assumption, you will need cluster-robust errors.

Ordinary, non-robust errors assume that $\sigma_1^2=\sigma_2^2=\sigma_3^2=\sigma^2$: all observations have the same (unknown) error variance. Neither the robust nor the non-robust VCE makes any assumption about the distribution of the errors, such as normality. They differ only in whether the variance is assumed to be the same (homoskedasticity) or allowed to differ across observations (heteroskedasticity).
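To make the distinction concrete, here is a minimal Python sketch (standard library only) for a regression with a single IV. It is an illustration under stated assumptions, not Stata's implementation: the function name `slope_with_ses` and the simulated data are hypothetical, and the robust variance uses the HC1 scaling n/(n-2), which is my understanding of what vce(robust) applies after regress. Note that neither formula refers to the shape of the error distribution, and the slope is the same either way.

```python
import math
import random


def slope_with_ses(x, y):
    """OLS of y on x with an intercept.

    Returns (slope, classical SE of slope, HC1 robust SE of slope).
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    dx = [xi - xbar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - ybar) for d, yi in zip(dx, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Classical: a single common error variance, estimated by SSR / (n - 2).
    s2 = sum(e * e for e in resid) / (n - 2)
    se_classical = math.sqrt(s2 / sxx)

    # Robust: each observation contributes its own squared residual
    # (sandwich estimator), with the HC1 small-sample factor n / (n - 2).
    hc0 = sum(d * d * e * e for d, e in zip(dx, resid)) / (sxx * sxx)
    se_robust = math.sqrt(hc0 * n / (n - 2))
    return slope, se_classical, se_robust


# Hypothetical data: errors are skewed (not normal) and their spread
# grows with x (heteroskedastic).
rng = random.Random(0)
x = [rng.uniform(1, 10) for _ in range(500)]
y = [1.0 + 2.0 * xi + xi * (rng.expovariate(1.0) - 1.0) for xi in x]

slope, se_c, se_r = slope_with_ses(x, y)
print(f"slope={slope:.3f}  classical SE={se_c:.3f}  robust SE={se_r:.3f}")
```

Only the standard error changes between the two options; the fitted line, and hence the point predictions, do not.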

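Thus, whether you still need to check normality of the errors depends on what you are doing. The coefficient estimates, and therefore the predicted values, are the same with or without the robust option; only the standard errors change. Exact small-sample t-based inference and conventional prediction intervals do rely on normally distributed errors. In reasonably large samples you can instead rely on asymptotics for inference on the coefficients, which is how robust standard errors are justified in the first place. Alternatively, you may be able to bootstrap.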
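As a rough illustration of the bootstrap route, the sketch below resamples (x, y) pairs with replacement and takes percentile limits for the slope. It assumes independent observations; the function names `ols_slope` and `pairs_bootstrap_ci`, the number of replications, and the simulated data are all hypothetical choices for illustration. In Stata itself, the analogous request would be something like the vce(bootstrap) option rather than hand-written code.

```python
import random


def ols_slope(x, y):
    """Slope from OLS of y on x with an intercept."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx


def pairs_bootstrap_ci(x, y, reps=2000, level=0.95, seed=0):
    """Percentile bootstrap CI for the slope, resampling (x, y) pairs."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # slope undefined if every resampled x is identical
        slopes.append(ols_slope(xs, ys))
    slopes.sort()
    alpha = 1.0 - level
    lo = slopes[int((alpha / 2) * (len(slopes) - 1))]
    hi = slopes[int((1 - alpha / 2) * (len(slopes) - 1))]
    return lo, hi


# Hypothetical data with skewed, heteroskedastic errors.
rng = random.Random(1)
x = [rng.uniform(1, 10) for _ in range(200)]
y = [1.0 + 2.0 * xi + xi * (rng.expovariate(1.0) - 1.0) for xi in x]

lo, hi = pairs_bootstrap_ci(x, y)
print(f"slope={ols_slope(x, y):.3f}  95% bootstrap CI=({lo:.3f}, {hi:.3f})")
```

Resampling whole pairs, rather than residuals, keeps each observation's error attached to its own x value, so the interval does not assume equal variances or normal errors.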
