Solved – If any parametric test does not reject null, does its nonparametric alternative do the same

hypothesis testingnonparametric

If nonparametric tests are assumed to have less power than their parametric alternatives, does this imply that if any parametric test does not reject null, then its nonparametric alternative does not reject null too? How can this change if assumptions of parametric test are not met and the test is used anyway?

Best Answer

If a parametric test fails to reject the null hypothesis then its nonparametric equivalent can definitely still reject the null hypothesis. Like @John said, this usually occurs when assumptions that would warrant use of the parametric test are violated. For example, if we compare the two-sample t-test with the Wilcoxon rank sum test then we can get this situation to happen if we include outliers in our data (with outliers we should not use the two sample-test).

#Test Data
x = c(-100,-100,rnorm(1000,0.5,1),100,100)
y = rnorm(1000,0.6,1)

#Two-Sample t-Test
t.test(x,y,var.equal=TRUE)

#Wilcoxon Rank Sum Test
wilcox.test(x,y)

The results of running the test:

> t.test(x,y,var.equal=TRUE)

    Two Sample t-test

data:  x and y 
t = -1.0178, df = 2002, p-value = 0.3089
alternative hypothesis: true difference in means is not equal to 0 
95 percent confidence interval:
 -0.6093287  0.1929563 
sample estimates:
mean of x mean of y 
0.4295556 0.6377417 

> 
> wilcox.test(x,y)

    Wilcoxon rank sum test with continuity correction

data:  x and y 
W = 443175, p-value = 5.578e-06
alternative hypothesis: true location shift is not equal to 0

Related Solutions

Solved – Why does rejecting the null in goodness-of-fit tests not imply accepting the null

Question 3: that depends on the goodness of fit test. So it is always a good idea to read up on the specific goodness of fit test you want to apply to figure out what exactly the null hypothesis is that is being tested.

Question 2: To understand this you need to see that a goodness of fit test is just like any other statistical test, and understand exactly what the logic is behind statistical tests. The outcome of a statistical test is a $p$-value, which is the probability of finding data that deviates from $H_0$ at least as much as the data you have observed when $H_0$ is true. So it is a thought experiment with the following steps:

Assume a population in which $H_0$ is true, that is, your model is correct in some specific sense depending on the goodness of fit test.
We draw many samples at random from this population, fit the model, and compute the goodness of fit test in each of these samples.
Since you have drawn samples at randome, some of these samples will be "weird", i.e. deviate from $H_0$.
The $p$-value is the expected proportion of samples that are "as weird or weirder" than the data you have observed.

If you find data with a small $p$-value then that data is unlikely to have come from a population in which the $H_0$ is true, and the fact that you have observed that data is considered evidence against $H_0$. If the $p$-value is below some pre-defined but arbitrary cut off point $\alpha$ (common values are 5% or 1%), then we call it "siginificant" and reject the $H_0$.

Notice what the opposite, not-significant, means: we have not found enough information to reject $H_0$. This is a case of "absence of evidence", which is not the same thing as "evidence of absence". So, "not rejecting $H_0$" is not the same thing as "accepting $H_0$".

Another way to answer your question would be to ask: "could it be that the $H_0$ is true?" the answer is simply no. In a goodness of fit test, the $H_0$ is that the model is in some sense true. The definition of a model is that it is a simplification of reality and simplification is just an other word for "wrong in some useful way". So models are by definition wrong, and thus the $H_0$ cannot be true.

This has consequences for the statement you quoted: "If reject $H_0$ then we conclude we should not use the model." This is incorrect, all that the significance of a goodness of fit test tells you that your model is likely to be wrong, but you already knew that. The interesting question is whether it is so wrong that it is no longer useful. This is a judgement call. Statistical tests can help you in differentiating between patterns that could just be the result of the randomness that is the result of sampling and "real" patterns. A significant result tells you that the latter is likely to be true, but that is not enough to conclude that the model is not a useful simplification of reality. You now need to investigate what exactly the deviation is, how large that deviation is, and what the concequences are for the performance of your model.

Hypothesis Testing – Why Non-Significant Results Mean ‘You Can’t Reject the Null’ Instead of Accepting the Null Hypothesis

Traditionally, the null hypothesis is a point value. (It is typically $0$, but can in fact be any point value.) The alternative hypothesis is that the true value is any value other than the null value. Because a continuous variable (such as a mean difference) can take on a value which is indefinitely close to the null value but still not quite equal and thus make the null hypothesis false, a traditional point null hypothesis cannot be proven.

Imagine your null hypothesis is $0$, and the mean difference you observe is $0.01$. Is it reasonable to assume the null hypothesis is true? You don't know yet; it would be helpful to know what our confidence interval looks like. Let's say that your 95% confidence interval is $(-4.99,\ 5.01)$. Now, should we conclude that the true value is $0$? I would not feel comfortable saying that, because the CI is very wide, and there are many, large non-zero values that we might reasonably suspect are consistent with our data. So let's say we gather much, much more data, and now our observed mean difference is $0.01$, but the 95% CI is $(0.005,\ 0.015)$. The observed mean difference has stayed the same (which would be amazing if it really happened), but the confidence interval now excludes the null value. Of course, this is just a thought experiment, but it should make the basic ideas clear. We can never prove that the true value is any particular point value; we can only (possibly) disprove that it is some point value. In statistical hypothesis testing, the fact that the p-value is > 0.05 (and that the 95% CI includes zero) means that we are not sure if the null hypothesis is true.

As for your concrete case, you cannot construct a test where the alternative hypothesis is that the mean difference is $0$ and the null hypothesis is anything other than zero. This violates the logic of hypothesis testing. It is perfectly reasonable that it is your substantive, scientific hypothesis, but it cannot be your alternative hypothesis in a hypothesis testing situation.

So what can you do? In this situation, you use equivalence testing. (You might want to read through some of our threads on this topic by clicking on the equivalence tag.) The typical strategy is to use the two one sided tests approach. Very briefly, you select an interval within which you would consider that the true mean difference might as well be $0$ for all you could care, then you perform a one-sided test to determine if the observed value is less than the upper bound of that interval, and another one-sided test to see if it is greater than the lower bound. If both of these tests are significant, then you have rejected the hypothesis that the true value is outside the interval you care about. If one (or both) are non-significant, you fail to reject the hypothesis that the true value is outside the interval.

For example, suppose anything within the interval $(-0.02,\ 0.02)$ is so close to zero that you think it is essentially the same as zero for your purposes, so you use that as your substantive hypothesis. Now imagine that you get the first result described above. Although $0.01$ falls within that interval, you would not be able to reject the null hypothesis on either one-sided t-test, so you would fail to reject the null hypothesis. On the other hand, imagine that you got the second result described above. Now you find that the observed value falls within the designated interval, and it can be shown to be both less than the upper bound and greater than the lower bound, so you can reject the null. (It is worth noting that you can reject both the hypothesis that the true value is $0$, and the hypothesis that the true value lies outside of the interval $(-0.02,\ 0.02)$, which may seem perplexing at first, but is fully consistent with the logic of hypothesis testing.)

Best Answer

Related Solutions

Solved – Why does rejecting the null in goodness-of-fit tests not imply accepting the null

Hypothesis Testing – Why Non-Significant Results Mean ‘You Can’t Reject the Null’ Instead of Accepting the Null Hypothesis

Related Question