Pearson correlation test

correlation, hypothesis testing, statistical-inference

I'm trying to run a hypothesis test to see if two variables are correlated. Therefore, I would like to use the Pearson correlation test. I know how to set the significance level, the null hypothesis and the alternative hypothesis and I also know how to calculate the r-value. But I'm not sure how to calculate the critical values for r and thus the rejection region.

According to Wikipedia, the r-value follows a Student's t-distribution with n-2 degrees of freedom, where n is the sample size: https://en.wikipedia.org/wiki/Pearson_correlation_coefficient

I also found a table of critical r-values for different sample sizes and significance levels:
http://statisticslectures.com/tables/rtable/

When I calculate the critical value from the Student's t-distribution at a given significance level, I get different values from those shown in the table. Am I missing something? Or am I getting the whole idea of this test wrong?

Best Answer

The critical value for $r$, as shown in the table, is given by the equation $$r_{\text{crit}} = \frac{t_{n-2,\alpha/2}^*}{\sqrt{n-2 + (t_{n-2,\alpha/2}^*)^2}},$$ where $t_{n-2,\alpha/2}^*$ is the upper $\alpha/2$ quantile of the Student's $t$-distribution with $n-2$ degrees of freedom. This equation is the second one in the Wikipedia article subsection under "Testing using Student's $t$-distribution." In particular, $$\Pr[T_{n-2} > t_{n-2,\alpha/2}^*] = \alpha/2,$$ where $T_{n-2}$ is a Student's $t$ random variable with $n-2$ degrees of freedom. For instance, $n = 10$ and $\alpha = 0.1$ gives $t_{8,0.05}^* \approx 1.85955$. Then $$r_{\text{crit}} = \frac{1.85955}{\sqrt{8 + (1.85955)^2}} \approx 0.549357,$$ which is the entry for row $8$ column $2$ in the table.

For the case in your comment, where $n = 10$ and $\alpha = 0.05$, we have $$t_{8, 0.025}^* \approx 2.306004.$$ This gives $$r_{\text{crit}} = \frac{2.306004}{\sqrt{8 + (2.306004)^2}} \approx 0.63189686.$$ This is row $8$ column $3$ of the table.
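As a rough check on the two worked examples above, here is a minimal Python sketch that converts a critical $t$ value into $r_{\text{crit}}$. The function name `critical_r` is made up for illustration, and the $t^*$ values are simply the ones quoted above, typed in by hand. If SciPy is installed, `scipy.stats.t.ppf(1 - alpha / 2, n - 2)` should give the same quantile.

```python
import math


def critical_r(t_star: float, n: int) -> float:
    """Convert the upper alpha/2 critical t value into the critical value for r.

    t_star: upper alpha/2 quantile of Student's t with n - 2 degrees of freedom
    n:      sample size (must be at least 3 so that n - 2 > 0)
    """
    if n < 3:
        raise ValueError("need n >= 3 so that the degrees of freedom n - 2 are positive")
    df = n - 2
    return t_star / math.sqrt(df + t_star ** 2)


# t* values quoted in the answer for n = 10 (8 degrees of freedom)
print(round(critical_r(1.85955, 10), 4))   # alpha = 0.10
print(round(critical_r(2.306004, 10), 4))  # alpha = 0.05
```

The two printed values should agree with the table entries discussed above to the precision shown there.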

This hypothesis test is two-sided. If $r$ is negative, then you need to compare against $-r_{\text{crit}}$ for the hypothesis $$H_0 : \rho = 0 \quad \text{vs.} \quad H_1 : \rho \ne 0,$$ where $\rho$ is the population correlation. That is to say, we reject $H_0$ in favor of $H_1$ if $|r| > r_{\text{crit}}$, where $r$ is the observed correlation from the data.
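To see where the cutoff on $r$ comes from, one can invert the usual $t$ statistic for a correlation, $t = r\sqrt{(n-2)/(1-r^2)}$. A short sketch, writing $t^*$ as shorthand for $t_{n-2,\alpha/2}^*$ (the shorthand is introduced here only for brevity): $$t^2 = \frac{r^2 (n-2)}{1-r^2} \iff t^2 = r^2 \left(n - 2 + t^2\right) \iff r^2 = \frac{t^2}{n-2+t^2}.$$ Since $t$ is an increasing function of $r$ on $(-1, 1)$, the condition $|t| > t^*$ holds exactly when $$|r| > \frac{t^*}{\sqrt{n-2+(t^*)^2}} = r_{\text{crit}},$$ so the $r$-table and the $t$-table describe the same rejection region.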

Personally, I don't find the table very useful. Instead, I would directly calculate the test statistic via the first equation in the Wikipedia subsection, $$T_{n-2} \mid H_0 = r \sqrt{\frac{n-2}{1-r^2}},$$ which follows a Student's $t$-distribution with $df = n-2$ under $H_0$. This removes the need for a separate table of critical $r$-values, since you can just use a regular $t$-table.
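The direct approach could look something like the following Python sketch. The helper names `t_statistic` and `reject_null` and the observed correlation `r_observed = 0.7` are hypothetical, chosen only to illustrate the calculation. The critical value is the $t_{8,0.025}^*$ quoted earlier, as one would read it from a $t$-table.

```python
import math


def t_statistic(r: float, n: int) -> float:
    """Test statistic r * sqrt((n - 2) / (1 - r^2)) for H0: no correlation."""
    if n < 3:
        raise ValueError("need n >= 3 so that the degrees of freedom n - 2 are positive")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt((n - 2) / (1 - r ** 2))


def reject_null(r: float, n: int, t_star: float) -> bool:
    """Two-sided decision: reject H0 when |t| exceeds the critical t value."""
    return abs(t_statistic(r, n)) > t_star


# Hypothetical data summary: observed r = 0.7 from n = 10 pairs, alpha = 0.05
r_observed = 0.7
n = 10
t_star = 2.306004  # upper 0.025 quantile of t with 8 df, from a t-table

print(round(t_statistic(r_observed, n), 3))
print(reject_null(r_observed, n, t_star))
```

Because the absolute value is taken inside `reject_null`, a negative observed correlation is handled by the same comparison, which matches the two-sided rule $|r| > r_{\text{crit}}$.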
