Solved – Interpretation of Breusch-Pagan test bptest() in R

heteroscedasticity, hypothesis testing

I was a bit confused about the interpretation of bptest() in R (library(lmtest)). The null hypothesis of bptest() is that the residuals have constant variance, so a p-value less than 0.05 would mean that the homoscedasticity assumption has to be rejected. However, on this website:

http://rstatistics.net/how-to-test-a-regression-model-for-heteroscedasticity-and-if-present-how-to-correct-it/

I found the following results that confuse me:

data:  lmMod
BP = 3.2149, df = 1, p-value = 0.07297

A p-Value > 0.05 indicates that the null hypothesis (the variance is unchanging in the residual) can be rejected and therefore heteroscedasticity exists. This can be confirmed by running a global validation of linear model assumptions (gvlma) on the lm object.

gvlma(lmMod) # validate if assumptions of linear regression holds true.

# Call:
 gvlma(x = lmMod) 

                    Value  p-value                   Decision
Global Stat        15.801 0.003298 Assumptions NOT satisfied!
Skewness            6.528 0.010621 Assumptions NOT satisfied!
Kurtosis            1.661 0.197449    Assumptions acceptable.
Link Function       2.329 0.126998    Assumptions acceptable.
Heteroscedasticity  5.283 0.021530 Assumptions NOT satisfied!

So why does the website say that a p-value > 0.05 means you have to reject the null hypothesis, when in fact a p-value less than 0.05 is what indicates that the null hypothesis has to be rejected?

Best Answer

This appears to be a typo on rstatistics.net. You are correct that the null hypothesis of the Breusch-Pagan test is homoscedasticity (= the variance does not depend on auxiliary regressors). If the $p$-value is "small" (e.g., below 0.05), the null hypothesis is rejected.

I would recommend contacting the authors of rstatistics.net about this issue to see whether they agree and fix it.

Moreover, note that gvlma() by default employs a different auxiliary regressor than bptest() and switches off studentization. More precisely, you can see the differences if you replicate its results by setting the arguments of bptest() explicitly.

The model is given by:

data("cars", package = "datasets")
lmMod <- lm(dist ~ speed, data = cars)

The default employed by bptest() uses the same auxiliary regressors as the model itself, i.e., speed in this case. It also uses the studentized version of the test, which has improved finite-sample properties and yields a non-significant result here.

library("lmtest")
bptest(lmMod, ~ speed, data = cars, studentize = TRUE)
##  studentized Breusch-Pagan test
## 
## data:  lmMod
## BP = 3.2149, df = 1, p-value = 0.07297
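For intuition, the studentized (Koenker) version of the statistic can also be computed by hand: regress the squared OLS residuals on the auxiliary regressor and take $n$ times the $R^2$ of that auxiliary regression. A minimal sketch (variable names are illustrative):

```r
# Sketch: studentized (Koenker) Breusch-Pagan statistic by hand.
data("cars", package = "datasets")
lmMod <- lm(dist ~ speed, data = cars)

# Auxiliary regression: squared residuals on the auxiliary regressor.
aux <- lm(residuals(lmMod)^2 ~ speed, data = cars)

# Statistic = n * R-squared of the auxiliary regression,
# compared against a chi-squared distribution with df = 1.
bp <- nrow(cars) * summary(aux)$r.squared
bp                                        # matches BP from bptest() above
pchisq(bp, df = 1, lower.tail = FALSE)    # the corresponding p-value
```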

In contrast, gvlma() switches off studentization and checks for a linear trend in the variances.

cars$trend <- 1:nrow(cars)
bptest(lmMod, ~ trend, data = cars, studentize = FALSE)
##  Breusch-Pagan test
## 
## data:  lmMod
## BP = 5.2834, df = 1, p-value = 0.02153

As you can see, both $p$-values are rather small but on different sides of 5%. The studentized versions (with either speed or the trend as the auxiliary regressor) are both slightly above 5%.
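For completeness, the studentized variant of the trend-based test mentioned above can be checked directly (same data and model as before):

```r
library("lmtest")

data("cars", package = "datasets")
lmMod <- lm(dist ~ speed, data = cars)
cars$trend <- 1:nrow(cars)

# Studentized version of the trend-based test: its p-value is
# also slightly above 5%, unlike the non-studentized variant.
bptest(lmMod, ~ trend, data = cars, studentize = TRUE)
```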