Solved – RESET Test in R Influenced by Heteroskedasticity in the Data

Tags: dataset, heteroscedasticity, negative-binomial-distribution, regression

I'm running a negative binomial model in R on 558 observations of count data, with robust standard errors added via a "vcov" (sandwich) estimator. I am using the Ramsey RESET test as a criterion for judging my model. I have always understood that this test looks at functional form, not at heteroskedasticity.
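For concreteness, a minimal sketch of this kind of setup, using simulated data in place of the real variables (which are not shown in the question); the formula and variable names here are stand-ins:

```r
library(MASS)      # glm.nb for negative binomial regression
library(sandwich)  # vcovHC for heteroskedasticity-robust covariance
library(lmtest)    # coeftest for inference with a supplied covariance matrix

# Simulated stand-in for the real 558-observation count dataset
set.seed(1)
n <- 558
x <- rnorm(n)
y <- rnbinom(n, size = 1.2, mu = exp(0.2 + 0.5 * x))
d <- data.frame(y, x)

nb_fit <- glm.nb(y ~ x, data = d)

# Coefficient table recomputed with robust (sandwich) standard errors.
# Note this only changes the reported SEs, not the fitted model itself.
coeftest(nb_fit, vcov. = vcovHC(nb_fit, type = "HC0"))
```

The key design point, relevant to the question below, is that the robust covariance is applied after estimation; the underlying model object is unchanged.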

However, after trying dozens of transformations of the existing variables in my model and doing just about everything to account for non-linearity, I continue to get p < 0.05 and thus rejection of the model. According to the literature, an omitted variable also cannot be inferred from the RESET test.

I suspect the heteroskedasticity is causing the problem with the RESET test. Because robust standard errors obtained via vcov are computed after the fact rather than changing the fitted model, I suspect the RESET test is being run on the original model, whose standard errors are uncorrected. I am especially suspicious because, just as an experiment, after I removed many observations at the extremes and thereby reduced some of the heteroskedasticity, the p-value started creeping up toward 0.10 and into the acceptable range.
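This suspicion can be checked directly. A sketch below contrasts the classical RESET test with a robust version on simulated heteroskedastic data; I believe `lmtest::resettest()` gained a `vcov.` argument in recent versions (around 0.9-38), which lets the test use a robust covariance matrix instead of the classical one:

```r
library(lmtest)    # resettest
library(sandwich)  # vcovHC

# Simulated linear model whose error variance grows with x
set.seed(42)
n <- 558
x <- runif(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5 + 2 * x)
d <- data.frame(y, x)

fit <- lm(y ~ x, data = d)

# Classical RESET: non-robust F test on powers of the fitted values
resettest(fit, power = 2:3, type = "fitted")

# Robust RESET: same test, but with an HC covariance estimator
# (vcov. argument assumed available in lmtest >= 0.9-38)
resettest(fit, power = 2:3, type = "fitted", vcov. = vcovHC)
```

Comparing the two p-values on your own data is a direct way to see whether heteroskedasticity, rather than functional form, is driving the rejection.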

Has anyone else had experience with the RESET test and whether it can be influenced by heteroskedasticity in the data? Also, how widely used is this test, how applicable is it to generalized linear models such as the negative binomial, and could it be less relevant because we are dealing with count data?

In case it may be helpful, the frequency distribution of the dependent variable is approximately:

| Value | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 12 | 15 |
|-------|-----|-----|----|----|----|----|----|---|---|---|----|----|----|
| Count | 350 | 100 | 60 | 25 | 15 | 10 | 13 | 4 | 2 | 1 | 2 | 1 | 1 |

I've looked into zero-inflated, hurdle, Poisson, and negative binomial models, and the negative binomial appears to fit best.

Best Answer

A late answer for those who have had the same doubt. The answer is yes, heteroskedasticity in the auxiliary equation of the test can bias the RESET test. The reason is that the RESET test works as follows [see Cameron and Trivedi (2005)]:

  1. Consider the regression $y = \mathbf{x}'\boldsymbol{\beta} + u$, where we assume that the regressors $\mathbf{x}$ enter linearly and are asymptotically uncorrelated with the error $u$;
  2. Fit the initial regression and generate $p$ new regressors that are functions of the fitted values $\hat{y} = \mathbf{x}'\hat{\boldsymbol{\beta}}$ (typically the powers $\hat{y}^2, \dots, \hat{y}^{p+1}$);
  3. Re-estimate the original model with the $p$ new regressors added (i.e. the auxiliary regression) and use an F-test (or Wald test) with $p$ restrictions, i.e. $H_0$: the parameters associated with the new regressors are all equal to zero.
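The three steps above can be carried out by hand, which also makes it easy to swap the classical F test for a robust Wald test. A sketch on simulated heteroskedastic data (variable names are illustrative):

```r
library(lmtest)    # waldtest
library(sandwich)  # vcovHC

# Simulated data with error variance growing in x
set.seed(7)
n <- 200
x <- runif(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5 + 2 * x)
d <- data.frame(y, x)

# Step 1-2: fit the initial regression and form powers of the fitted values
fit  <- lm(y ~ x, data = d)
yhat <- fitted(fit)

# Step 3: auxiliary regression adding p = 2 new regressors
aux <- update(fit, . ~ . + I(yhat^2) + I(yhat^3))

# F test that the coefficients on the added powers are zero:
waldtest(fit, aux)                 # classical F test (the usual RESET)
waldtest(fit, aux, vcov = vcovHC)  # same Wald test with robust covariance
```

If the two tests disagree noticeably, heteroskedasticity in the auxiliary regression is a likely culprit, which is exactly the mechanism the answer describes.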

It follows that if the auxiliary regression is heteroskedastic, the RESET test may be misleading: heteroskedasticity renders the conventional standard-error estimates inconsistent, which invalidates the F-test on which the RESET test relies, unless a heteroskedasticity-robust covariance estimator is used.
