The p-value of your significance test can be interpreted as the probability of observing a value of the relevant statistic as extreme as, or more extreme than, the value you actually observed, given that the null hypothesis is true. (Note that the p-value makes no reference to which values of the statistic are likely under the alternative hypothesis.)
EDIT: In mathematical notation, this can be written as:
$$\text{p-value} = \Pr(T \geq T_{obs} \mid H_0)$$
where $T$ is some function of the data (the "statistic"), $T_{obs}$ is the value of $T$ actually observed, and $H_0$ denotes that the sampling distribution of $T$ is the one implied by the null hypothesis.
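As a concrete illustration, here is a minimal R sketch, assuming a one-sample t-test of $H_0: \mu = 0$ and taking $T$ to be the absolute t-statistic (the simulated sample and its size are arbitrary choices):
set.seed(1)                                            # arbitrary seed
x <- rnorm(30, mean = 0.2)                             # some sample of data
T.obs <- abs(mean(x)) / (sd(x) / sqrt(length(x)))      # observed value of the statistic
2 * pt(T.obs, df = length(x) - 1, lower.tail = FALSE)  # Pr(T >= T_obs | H0), two-sided
t.test(x)$p.value                                      # matches the built-in test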
You can never be sure that your assumptions hold true, only whether or not the data you observed are consistent with those assumptions. A p-value gives a rough measure of this consistency.
A p-value does not give the probability that the same data will be observed again, only the probability that the value of the statistic is as or more extreme than the value observed, given the null hypothesis.
No, the data are not heteroscedastic (by way of how you simulated them). Did you notice the 0 degrees of freedom of the test? That is a hint that something is going wrong here. The B-P test takes the squared residuals from the model and tests whether the predictors in the model (or any other predictors you specify) can account for substantial amounts of variability in these values. Since you only have the intercept in the model, it cannot account for any variability by definition.
Take a look at: http://en.wikipedia.org/wiki/Breusch-Pagan_test
Also, make sure you read help(bptest). That should help to clarify things.
One thing that is going wrong here is that the bptest() function apparently does not test for this errant case and happens to throw out a tiny p-value. In fact, if you look carefully at the code underlying the bptest() function, essentially this is happening:
format.pval(pchisq(0, 0), digits=4)
which gives "< 2.2e-16". So, pchisq(0, 0) returns 0 and that is turned into "< 2.2e-16" by format.pval(). In a way, that is all correct, but it would probably help to test for zero dfs in bptest() to avoid this sort of confusion.
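To see this in action, here is a minimal sketch (simulated data, an intercept-only model, and an arbitrary seed; the exact output may depend on the lmtest version) that should reproduce the 0-df result from the question:
library(lmtest)
set.seed(42)                          # arbitrary seed
y <- rnorm(100)
bptest(lm(y ~ 1))                     # intercept-only model: 0 df, tiny p-value
format.pval(pchisq(0, 0), digits=4)   # "< 2.2e-16", as described above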
EDIT
There is still lots of confusion concerning this question. Maybe it helps to really show what the B-P test actually does. Here is an example. First, let's simulate some data that are homoscedastic. Then we fit a regression model with two predictors. And then we carry out the B-P test with the bptest() function.
library(lmtest)
# simulate homoscedastic data: the error variance does not depend on the predictors
n <- 100
x1i <- rnorm(n)
x2i <- rnorm(n)
yi <- rnorm(n)
# fit the regression model and carry out the B-P test
mod <- lm(yi ~ x1i + x2i)
bptest(mod)
So, what is really happening? First, take the squared residuals based on the regression model. Then take $n \times R^2$ when regressing these squared residuals on the predictors that were included in the original model (note that the bptest() function uses the same predictors as in the original model, but one can also use other predictors here if one suspects that the heteroscedasticity is a function of other variables). That is the test statistic for the B-P test. Under the null hypothesis of homoscedasticity, this test statistic follows a chi-square distribution with degrees of freedom equal to the number of predictors used in the test (not counting the intercept). So, let's see if we can get the same results:
e2 <- resid(mod)^2                                 # squared residuals
bp <- summary(lm(e2 ~ x1i + x2i))$r.squared * n    # n times R^2 of the auxiliary regression
bp
pchisq(bp, df=2, lower.tail=FALSE)                 # p-value: chi-square with 2 df
Yep, that works. By chance, the test above may turn out to be significant (which would be a Type I error, since the simulated data are homoscedastic), but in most cases it will be non-significant.
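For contrast, here is a sketch of the opposite case, assuming the error standard deviation grows with x1i (the variance function exp(x1i) is an arbitrary choice); the B-P test should now tend to reject:
yi.het <- rnorm(n, sd = exp(x1i))   # error SD depends on x1i: heteroscedastic
mod.het <- lm(yi.het ~ x1i + x2i)
bptest(mod.het)                     # should usually be significant now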
Best Answer
The aim of the B-P test is to assess whether the residuals in a linear model have constant variance, by regressing the squared residuals on the independent variables. Bartlett's test seeks to determine whether multiple samples come from populations that all have the same variance. You could view the latter as a special case of the former by thinking of the linear model that corresponds to a one-way ANOVA, but the details of the test statistics are rather different. I would agree that they are related, and that you could apply the former in the context in which the latter is generally used, with potentially different results.
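To make the relationship concrete, here is a minimal sketch (a simulated one-way layout; group sizes, means, and seed are arbitrary) that runs both tests on the same data:
library(lmtest)
set.seed(123)                          # arbitrary seed
g <- gl(3, 30)                         # three groups of 30 observations
y <- rnorm(90, mean = rep(c(0, 1, 2), each = 30))
bartlett.test(y ~ g)                   # Bartlett: equal variances across the groups?
bptest(lm(y ~ g))                      # B-P on the corresponding linear model
Since the group variances are equal in this simulation, both tests should usually be non-significant, but their statistics and p-values will generally differ.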