Solved – Differences between p-value, level of significance and size of a test

p-value, statistical significance

Can anybody tell me the differences between these quantities? As far as I know:
The p-value is the minimum level of significance at which a test rejects the null hypothesis with the observed data.
The level of significance is a number that is greater than or equal to the power function of the test for all possible values of the parameter when the null hypothesis is true.
The size of a test is the maximum value of the power function of the test over all possible values of the parameter when the null hypothesis is true.
So the p-value seems to be the size of a test, since they are both the minimum value of the level of significance. The level of significance seems to be the size of a test, since they are both a number that is greater than or equal to the power function.
According to the inferences above, these three quantities refer to the same thing. Can anybody show me the differences between them, or correct me if I am wrong?

Best Answer

level and size

Wikipedia has the following:

A test is said to have significance level $\alpha$ if its size is less than or equal to $\alpha$.

I agree with this. It also says:

the size of a test is [...] the probability of making a Type I error.

This is not quite always true. (The article corrects it lower down.)

In the case of a composite null hypothesis, the size is the supremum of the rejection rates over all the possibilities under the null.

Loosely, it's the largest rejection rate under the null.
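In symbols, as a sketch (the notation is assumed here for illustration and is not taken from the Wikipedia article: $\Theta_0$ is the set of parameter values allowed by the null, $T$ the test statistic and $R$ its rejection region), the size and the level condition can be written as:

$$
\text{size} = \sup_{\theta \in \Theta_0} P_\theta(T \in R), \qquad \text{size} \le \alpha .
$$

For a simple null, $\Theta_0$ contains a single point, so the supremum is just the type I error rate at that point.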

Note that in the general case (consider a potentially composite null and a possibly discrete test statistic) we may not be able to actually attain a rejection rate of some pre-specified $\alpha$:

  • e.g. consider a two-tailed sign test with $n=18$: you can get a rejection rate under the null of 3.1% or 9.6%, but you can't actually get 5% unless you resort to devices like randomized tests, or

  • consider that the actual type I error rate may depend on where in the null we happen to be situated. For example, with a one-sided t-test where $H_0: \mu\leq 0$, if the true $\mu=-0.5$ the type I error rate will generally be lower than it would be if $\mu=-0.03$.
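The attainable rates for the two-tailed sign test can be checked with a few lines of Python. This is only a minimal sketch using the standard library; the helper names `binom_cdf` and `two_tailed_sign_test_rates` are made up for illustration, and the test is assumed to reject symmetrically in both tails.

```python
from math import comb

def binom_cdf(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def two_tailed_sign_test_rates(n):
    """Null rejection rate for each symmetric cutoff c, where the test
    rejects when the number of positive signs is <= c or >= n - c."""
    return {c: 2 * binom_cdf(c, n) for c in range(n // 2)}

rates = two_tailed_sign_test_rates(18)
for c, rate in rates.items():
    print(f"reject if count <= {c} or >= {18 - c}: rate = {rate:.4f}")
```

For $n=18$ the cutoffs $c=4$ and $c=5$ give about 3.1% and 9.6% respectively, with nothing attainable in between.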

So now suppose I want a significance level of 5% with a one-tailed sign test with $n=18$ under the composite null $H_0: \tilde{\mu}\leq 0$ vs $H_1:\tilde{\mu}> 0$. If $\tilde{\mu}$ is actually $0$ then my type I error rate is just over 4.8%. On the other hand, if $\tilde{\mu}<0$ then my type I error rate will be something smaller than 4.8%; let's say we are in a particular situation under the null (depending on the specifics of the distribution) where our type I error rate is 3.2%. We'd then have a test with a 5% significance level, a size of 4.81% and an actual type I error rate of 3.2% (though in practice we couldn't figure this last one out because we wouldn't know either the population shape or its median).
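The distinction between level, size and actual type I error rate in the one-tailed sign test can be sketched numerically. The code below assumes a continuous population with no ties at zero, so that the null $\tilde{\mu}\leq 0$ corresponds to $p = P(X > 0) \leq 0.5$; the names `upper_tail` and `type_one_error_rate` are hypothetical helpers, not part of any library.

```python
from math import comb

def upper_tail(c, n, p):
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(c, n + 1))

n, level = 18, 0.05

# Smallest cutoff whose rejection rate at the boundary of the null
# (p = 0.5) does not exceed the chosen significance level.
critical = next(c for c in range(n + 2) if upper_tail(c, n, 0.5) <= level)

# The rejection rate increases with p, so its supremum over the null
# (p <= 0.5) is reached at p = 0.5: that value is the size.
size = upper_tail(critical, n, 0.5)

def type_one_error_rate(p):
    """Actual rejection rate when P(X > 0) = p, for some p <= 0.5."""
    return upper_tail(critical, n, p)

print(f"level = {level}, critical count = {critical}, size = {size:.4f}")
for p in (0.5, 0.48, 0.45, 0.40):
    print(f"p = {p}: type I error rate = {type_one_error_rate(p):.4f}")
```

The test rejects when at least 13 of the 18 signs are positive, giving a size of about 4.81%. Any $p$ strictly below 0.5 gives a smaller rejection rate, which is the "actual type I error rate" in the paragraph above.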

Note in particular that neither size nor level relates to the sample: if we draw another random sample of the same size (and other relevant characteristics), we should not expect size or level to change.


p-value

The p-value is the probability of obtaining a test statistic at least as extreme as the one we observed from the sample, if the null hypothesis were true.

So by contrast with the other two things, the p-value is a function of the sample. New sample, new p-value.

It may be less than or greater than the type I error rate, the size or the significance level.
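To see the p-value move from sample to sample while the level stays fixed, here is a small simulation sketch for the one-sided sign test. The population (normal with median 0.3), the seed and the function name `sign_test_p_value` are all assumptions chosen purely for illustration.

```python
import random
from math import comb

def sign_test_p_value(sample):
    """One-sided sign test p-value for H0: median <= 0 vs H1: median > 0."""
    nonzero = [x for x in sample if x != 0]  # drop exact zeros (a common convention)
    n = len(nonzero)
    positives = sum(x > 0 for x in nonzero)
    # P(X >= positives) for X ~ Binomial(n, 0.5)
    return sum(comb(n, i) for i in range(positives, n + 1)) / 2**n

rng = random.Random(1)  # arbitrary seed, only for reproducibility
for _ in range(5):
    # Hypothetical population: normal with median 0.3 and unit variance
    sample = [rng.gauss(0.3, 1.0) for _ in range(18)]
    p = sign_test_p_value(sample)
    print(f"p-value = {p:.4f}, reject at 5% level: {p <= 0.05}")
```

Each new sample generally yields a different p-value, which is then compared against the same fixed 5% level; the level and the size are properties of the test, not of any one sample.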
