Solved – Best way to deal with heteroscedasticity

generalized linear model, heteroscedasticity, lm, r

I have a plot of the residuals of a linear model against the fitted values in which the heteroscedasticity is very clear. However, I'm not sure how to proceed, because as far as I understand this heteroscedasticity makes my linear model invalid. (Is that right?) I see two options:

  1. Fit a robust linear model with the rlm() function from the MASS package, because it is apparently robust to heteroscedasticity.

  2. Since the standard errors of my coefficients are wrong because of the heteroscedasticity, adjust them to be robust to heteroscedasticity, using the method posted on Stack Overflow: Regression with Heteroskedasticity Corrected Standard Errors.

Which method would be best for my problem? If I use solution 2, is the predictive capability of my model completely useless?
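For concreteness, the two options could be sketched roughly as follows. This is only an illustration under stated assumptions: the data are simulated (the variables x and y and the noise structure are made up and stand in for the real data), and the robust standard errors are computed by hand as White's HC0 sandwich estimate rather than with the method from the linked post. In practice a package such as sandwich (vcovHC()) is normally used for this step.

```r
library(MASS)  # provides rlm(); ships with standard R installations

# Simulated heteroscedastic data (illustrative only; not the asker's data).
set.seed(42)
n <- 300
x <- runif(n, 1, 10)
y <- 1 + 2 * x + rnorm(n, sd = 0.5 * x)  # noise sd grows with x

fit_ols <- lm(y ~ x)

# Option 1: M-estimation, which downweights observations with large residuals.
fit_rlm <- rlm(y ~ x)

# Option 2: keep the OLS coefficients and replace the standard errors with
# White's HC0 sandwich estimate: (X'X)^-1 X' diag(e^2) X (X'X)^-1.
X     <- model.matrix(fit_ols)
e     <- resid(fit_ols)
bread <- solve(crossprod(X))
meat  <- crossprod(X * e)  # row i of X scaled by e_i, so this is X' diag(e^2) X
vcov_hc0  <- bread %*% meat %*% bread
se_robust <- sqrt(diag(vcov_hc0))

# Side-by-side comparison of classical and robust standard errors.
cbind(estimate  = coef(fit_ols),
      se_ols    = sqrt(diag(vcov(fit_ols))),
      se_robust = se_robust)
```

Note that option 2 leaves the coefficient estimates untouched; only the standard errors differ from the plain lm() output.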

The Breusch-Pagan test confirmed that the variance is not constant.
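As a rough sketch of what such a test does, the studentized (Koenker) form of the Breusch-Pagan statistic can be computed by hand in base R. The data below are simulated and purely illustrative, and the variable names are assumptions; with real data one would more likely call bptest() from the lmtest package.

```r
# Simulated data (illustrative only; not the asker's data).
set.seed(7)
n <- 300
x <- runif(n, 1, 10)
y <- 1 + 2 * x + rnorm(n, sd = 0.5 * x)  # noise sd grows with x

fit <- lm(y ~ x)

# Auxiliary regression: squared residuals on the original regressor.
e2  <- resid(fit)^2
aux <- lm(e2 ~ x)

# Koenker's studentized statistic: n * R^2, compared with a chi-squared
# distribution whose df equals the number of regressors in aux (here 1).
bp_stat <- n * summary(aux)$r.squared
bp_pval <- pchisq(bp_stat, df = 1, lower.tail = FALSE)

c(statistic = bp_stat, p.value = bp_pval)
```

A small p-value indicates that the residual variance changes with x.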

My residuals plotted against the fitted values look like this:

https://i.gyazo.com/9407a829a168492b31dfa3d1dd33a21d.png


Best Answer

It's a good question, but I think it's the wrong question. Your figure makes it clear that you have a more fundamental problem than heteroscedasticity, i.e. your model has a nonlinearity that you haven't accounted for. Many of the potential problems that a model can have (nonlinearity, interactions, outliers, heteroscedasticity, non-Normality) can masquerade as each other. I don't think there's a hard and fast rule, but in general I would suggest dealing with problems in the order

outliers > nonlinearity > heteroscedasticity > non-normality

(e.g., don't worry about nonlinearity before checking whether there are weird observations that are skewing the fit; don't worry about normality before you worry about heteroscedasticity).

In this particular case, I would fit a quadratic model, y ~ poly(x,2) (or poly(x,2,raw=TRUE), or y ~ x + I(x^2)), and see if it makes the problem go away.
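A minimal sketch of that check, assuming simulated data with a genuinely quadratic trend (the variables x and y and the coefficients are invented for illustration; they are not the asker's data):

```r
# Simulated data (an assumption for illustration; not the asker's data).
set.seed(1)
x <- runif(200, 0, 10)
y <- 2 + 0.5 * x + 0.3 * x^2 + rnorm(200, sd = 2)

fit_lin  <- lm(y ~ x)                        # straight-line model
fit_quad <- lm(y ~ poly(x, 2))               # orthogonal polynomial terms
fit_raw  <- lm(y ~ poly(x, 2, raw = TRUE))   # raw polynomial terms
fit_I    <- lm(y ~ x + I(x^2))               # same model written out explicitly

# The three quadratic specifications give the same fitted values;
# only the parameterization of the coefficients differs.
same_fit <- isTRUE(all.equal(fitted(fit_quad), fitted(fit_I))) &&
  isTRUE(all.equal(fitted(fit_raw), fitted(fit_I)))

# F-test for whether the quadratic term improves on the linear model.
cmp <- anova(fit_lin, fit_quad)
print(cmp)

# Residuals vs. fitted values, before and after adding the quadratic term.
op <- par(mfrow = c(1, 2))
plot(fitted(fit_lin), resid(fit_lin), main = "Linear")
abline(h = 0, lty = 2)
plot(fitted(fit_quad), resid(fit_quad), main = "Quadratic")
abline(h = 0, lty = 2)
par(op)
```

If the curved pattern in the left-hand panel disappears in the right-hand one, the apparent problem was at least partly nonlinearity; any heteroscedasticity that remains can then be judged on the quadratic fit.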
