Distributions – Understanding Why the Distribution of p-Values Under the Null Hypothesis is Uniform(0, 1)


I have seen this, and this, and a few YouTube videos, and I'm still stuck.

I understand how the probability integral transform gives rise to the result that the CDF of a random variable, evaluated at that variable, has a uniform distribution.

What I don't understand is why that implies that the p-values themselves have a uniform distribution.

That is, I understand this much:

Suppose X ~ Unif(a, b). Then the CDF of X is:

$$P(X \le x) =
\begin{cases}
0 & \text{if } x < a \\
(x-a)/(b-a) & \text{if } a \le x < b \\
1 & \text{if } x \ge b
\end{cases}$$

So if X ~ Unif(0, 1), then $P(X \le x) = x$ for $0 \le x \le 1$ (just substituting a = 0 and b = 1).
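As a quick numerical sanity check (a sketch I'm adding here; the endpoints $a = 2$, $b = 5$ are arbitrary), the piecewise formula above agrees with SciPy's built-in uniform CDF:

```python
import numpy as np
from scipy import stats

a, b = 2.0, 5.0  # arbitrary endpoints for Unif(a, b)

def uniform_cdf(x, a, b):
    """Piecewise CDF of Unif(a, b): 0 below a, linear on [a, b], 1 above b."""
    return np.clip((x - a) / (b - a), 0.0, 1.0)

xs = np.linspace(a - 1, b + 1, 13)
# SciPy parameterizes the uniform by loc=a and scale=b-a
assert np.allclose(uniform_cdf(xs, a, b), stats.uniform.cdf(xs, loc=a, scale=b - a))

print(uniform_cdf(np.array([0.25, 0.5, 0.9]), 0.0, 1.0))  # [0.25 0.5 0.9], i.e. P(X <= x) = x
```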

Now suppose $Y = F(X)$, and we want to know the probability distribution of Y. That is, we want to know the distribution of the CDF of X evaluated at X itself.

We know that the CDF uniquely identifies a distribution, so if you see, for example, $P(X \le x) = x$, then you know X ~ Unif(0, 1).

We also know that CDFs are right-continuous, and they go from 0 to 1. So it's reasonable to pick a value, f, that lies between 0 and 1 and try to find the probability that the CDF, Y, takes a value less than or equal to f:

$$\begin{align*}
P(Y \le f) &= P(F(X) \le f) \\
&= P(X \le F^{-1}(f)) \quad \text{assuming } F \text{ is invertible} \\
&= F(F^{-1}(f)) \\
&= f
\end{align*}$$

So since $P(Y \le f) = f$, $Y = F(X)$ must follow a uniform distribution.
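Here's a quick way to see this numerically (my own sketch; the exponential distribution and its scale are arbitrary choices): sample X from a non-uniform continuous distribution, push the samples through that distribution's own CDF, and test the result for uniformity.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=100_000)  # X is clearly not Unif(0, 1)
y = stats.expon.cdf(x, scale=2.0)             # Y = F(X): the probability integral transform

# Kolmogorov-Smirnov test against Unif(0, 1): Y is consistent with uniform, X is not
print(stats.kstest(y, "uniform"))  # large p-value: cannot reject uniformity of Y
print(stats.kstest(x, "uniform"))  # tiny p-value: X itself is far from uniform
```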

This implies that for any continuous random variable (satisfying some regularity conditions that I'm not sure of), the CDF of that random variable, evaluated at the variable itself, will have a Unif(0, 1) distribution.

It does NOT imply that the random variable itself has a Unif(0, 1) distribution. That is, it does not mean that X has a Unif(0, 1) distribution, only that F(X) has a Unif(0, 1) distribution.

So if a test statistic has a continuous distribution, then its CDF evaluated at the test statistic has a Unif(0, 1) distribution. Why does this mean that the p-values have a uniform distribution?

Wait…are p-values the CDF of a test statistic?

Clearly I'm tying myself in knots here. Any help would be appreciated.

EDIT (responding to a comment):

Here's my line of thinking since sleeping on it.

If we have $P(X \le x) = x$, then X ~ Unif(0,1).

Since $P(F(X) \le f) = f$, that means $F(X)$ ~ Unif(0,1), right?

But why does this lead us to think that p-values are uniformly distributed if the null hypothesis is true?

Suppose for instance we have:

$$H_0: \mu \ge 0, \qquad H_a: \mu < 0,$$

and $\sigma$ is known. Let $ts$ be the test statistic, which has a normal (but not standard normal) distribution. After standardizing, let the z-score associated with the test statistic be $z_{ts}$.

Then we would reject $H_0$ if $P(Z < z_{ts}) < 0.05$. That is, we would reject $H_0$ if the p-value is less than 0.05.

The form $P(Z < z_{ts})$ is the same kind of form as a CDF, right? If the test statistic is continuous then this is the same as $P(Z \le z_{ts})$.

Now let $F(z_{ts}) = P(Z \le z_{ts})$.

Is this really a CDF? If so, then what?

What about when we have other alternative hypotheses (like $H_a: \mu > 0$ or $H_a: \mu \ne 0$)?
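To poke at this question myself, here's a simulation sketch (everything numeric here is my own assumption: the boundary null $\mu = 0$, $\sigma = 1$, $n = 25$): compute the p-value under each alternative many times and check whether the results look Unif(0, 1).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, sigma, n_sims = 25, 1.0, 50_000

# Simulate sample means under H0 at the boundary mu = 0, then standardize
xbar = rng.normal(loc=0.0, scale=sigma / np.sqrt(n), size=n_sims)
z = xbar / (sigma / np.sqrt(n))

p_lower = stats.norm.cdf(z)             # H_a: mu < 0   -> p = P(Z <= z_ts)
p_upper = stats.norm.sf(z)              # H_a: mu > 0   -> p = P(Z >= z_ts)
p_two   = 2 * stats.norm.sf(np.abs(z))  # H_a: mu != 0  -> p = 2 P(Z >= |z_ts|)

for name, p in [("lower", p_lower), ("upper", p_upper), ("two-sided", p_two)]:
    # If p is Unif(0, 1), the rejection rate at 0.05 is ~0.05 and the KS test passes
    print(name, np.mean(p < 0.05), stats.kstest(p, "uniform").pvalue)
```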

Best Answer

In hypothesis testing, we compute the test statistic and ask: what is the probability of seeing something as extreme as, or more extreme than, this observation?

Consider a test where the alternative hypothesis is 'greater'. In that context, 'as or more extreme' means seeing the observed test statistic or anything greater than it.

In other words, the p-value is the survival function of the test statistic under the null. So, if our test statistic is $x$ and the null hypothesis says it is distributed as $X_0$, the p-value becomes (for the 'greater' alternative, with $S_{X_0}$ the survival function of $X_0$):

$$q = P(\text{observation as or more extreme than } x \text{ under the null, in the direction of the alternative})$$

$$=P(X_0>x)=S_{X_0}(x)$$
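Concretely (a sketch with my own numbers: a standard-normal null and an observed statistic of 1.7), the 'greater' p-value is just the survival function evaluated at the observation:

```python
from scipy import stats

x_obs = 1.7  # hypothetical observed test statistic

# p-value for the 'greater' alternative: P(X_0 > x_obs) under a standard-normal null
p_value = stats.norm.sf(x_obs)  # survival function S(x) = 1 - F(x)
assert abs(p_value - (1 - stats.norm.cdf(x_obs))) < 1e-12
print(p_value)  # roughly 0.0446
```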

But if the null hypothesis is true, the test statistic $x$ is itself drawn from the null distribution, which we said is $X_0$. The distribution of the p-value is then given by a random variable $Q$ such that:

$$Q=S_{X_0}(X_0)$$

But we know that if we apply the survival function (or CDF) of a random variable to that same random variable, we get a U(0,1) distribution. This is the basis of inverse transform sampling and Q-Q plots.
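Run in reverse, this is inverse transform sampling (a sketch; I'm using the Exponential(1) distribution only because its inverse CDF has a closed form): feed Unif(0, 1) draws through the inverse CDF and you recover the target distribution.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
u = rng.uniform(size=100_000)  # U ~ Unif(0, 1)

# Inverse CDF of Exponential(1): F^{-1}(u) = -log(1 - u)
samples = -np.log(1.0 - u)

# The transformed draws should match Exponential(1)
print(stats.kstest(samples, "expon"))  # large p-value: consistent with Exp(1)
```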

Here is a proof:

$$P(Q<q)=P(S_{X_0}(X_0)<q)=P(X_0>S_{X_0}^{-1}(q))=S_{X_0}(S_{X_0}^{-1}(q))=q$$

where, in the third expression, we used the fact that the survival function is monotonically decreasing (applying $S_{X_0}^{-1}$ to both sides of the inequality flips it).

But if $P(Q<q)=q$ then $Q$ must be $U(0,1)$.
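To close the loop empirically (a sketch with a chi-squared null, chosen arbitrarily to emphasize that nothing above is specific to the normal): draw $X_0$ from the null, set $Q = S_{X_0}(X_0)$, and check that $P(Q < q) \approx q$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
df = 4  # arbitrary degrees of freedom for the chi-squared null
x0 = stats.chi2.rvs(df, size=100_000, random_state=rng)
q_vals = stats.chi2.sf(x0, df)  # Q = S_{X0}(X0)

for q in (0.01, 0.05, 0.25, 0.5, 0.9):
    # Empirical P(Q < q) should be close to q if Q ~ U(0, 1)
    print(q, np.mean(q_vals < q))
```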
