Solved – Using model with heteroskedasticity for predictions

heteroscedasticity, r, regression, robust-standard-error

I am running a model with ordinary least squares regression, and am using robust standard errors (RSEs) because diagnostic tests indicated heteroskedasticity in the model. I'm a bit limited in the kinds of regression I can use because the data are non-normally distributed (I tried a few transformations but nothing really helped).

I will also be graphing the relationship between predicted values of y and values of different independent variables. I know the incorrect standard error issue affects interval estimates, but the coefficient values used in the regression equation remain the same after adjusting the standard errors, so I am not sure whether it would affect the predicted y values as well. My question is: since the model itself is still heteroskedastic, would it still be "valid" to use the model for predictions? If not, is there a way I could adjust the actual model in R so that I could use it with the predict() function?

If it helps to have an idea of what I'm working with, here is the model without adjusted standard errors:

Call:
lm(formula = ortho ~ forb + sfdist + year, data = insect.ortho)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.2720 -1.5416 -0.6649  0.7500 18.1564 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  3.140313   0.367324   8.549 3.76e-15 ***
forb         0.218135   0.051850   4.207 3.96e-05 ***
sfdist      -0.001098   0.000373  -2.943  0.00365 ** 
year        -0.762830   0.401225  -1.901  0.05876 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.603 on 193 degrees of freedom
Multiple R-squared: 0.149, Adjusted R-squared: 0.1358
F-statistic: 11.27 on 3 and 193 DF, p-value: 7.621e-07

And after using robust standard errors:

t test of coefficients:

             Estimate  Std. Error t value  Pr(>|t|)    
(Intercept)  3.14031323  0.37927174  8.2799 2.017e-14 ***
forb         0.21813498  0.06895898  3.1633  0.001813 ** 
sfdist      -0.00109765  0.00033704 -3.2567  0.001331 ** 
year        -0.76282962  0.39007561 -1.9556  0.051956 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Best Answer

The prediction will not be altered in any way by using het-robust standard errors. It remains the same and is still valid.

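It's the interval around that prediction (and any hypothesis tests about the coefficients or the predictions) that will change depending on whether you use the het-robust errors or not. If you have heteroskedasticity and use the non-het-robust errors, your intervals will be miscalibrated. Often they are too narrow, but they can also be too wide: in the output above, the robust standard error for forb is larger than the classical one, while the ones for sfdist and year are smaller.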
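To make this concrete, here is a minimal sketch in base R. It uses simulated data (the variables `x`, `y` and the data frame `dat` are hypothetical, not the asker's insect data) and computes an HC3 robust covariance matrix by hand so the example has no package dependencies. It shows that the point predictions from predict() depend only on the coefficients, while the confidence interval for the fitted mean changes when the robust covariance is swapped in.

```r
# Simulated, hypothetical data whose error variance grows with x
set.seed(1)
n <- 200
x <- runif(n, 0, 10)
y <- 1 + 0.5 * x + rnorm(n, sd = 0.3 + 0.3 * x)
dat <- data.frame(x = x, y = y)

fit <- lm(y ~ x, data = dat)

# HC3 heteroskedasticity-robust covariance of the coefficients, by hand:
# (X'X)^-1  X' diag(e^2 / (1 - h)^2) X  (X'X)^-1
X <- model.matrix(fit)
e <- residuals(fit)
h <- hatvalues(fit)
bread <- solve(crossprod(X))
meat <- crossprod(X * (e / (1 - h)))  # each row of X scaled by e_i / (1 - h_i)
V_robust <- bread %*% meat %*% bread

# Point predictions: these use only coef(fit), so they are the same
# whichever standard errors you report
newdat <- data.frame(x = seq(0, 10, by = 2.5))
pred <- predict(fit, newdata = newdat)

# Robust confidence interval for the fitted mean at each new x
X_new <- model.matrix(delete.response(terms(fit)), newdat)
se_robust <- sqrt(rowSums((X_new %*% V_robust) * X_new))  # diag(X_new V X_new')
tcrit <- qt(0.975, df.residual(fit))
ci_robust <- cbind(fit = pred,
                   lwr = pred - tcrit * se_robust,
                   upr = pred + tcrit * se_robust)

# Classical (homoskedastic) confidence interval for comparison
ci_classic <- predict(fit, newdata = newdat, interval = "confidence")

print(ci_classic)
print(ci_robust)
```

The `fit` column is identical in both tables; only `lwr` and `upr` differ. In practice the sandwich package's vcovHC() is the usual way to get a matrix like `V_robust`; the hand computation here is only to keep the sketch self-contained. Note that these are intervals for the mean response. An interval for a new observation would also need a model of how the error variance changes with the predictors, which robust standard errors alone do not provide.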