Hypothesis Testing – Can P-Values for Pearson’s Correlation Test Be Computed From Correlation Coefficient and Sample Size?

correlation · fraud-detection · hypothesis-testing · p-value

Background: I read an article in which the authors report a Pearson correlation of 0.754 from a sample of size 878. The resulting p-value for the correlation test is reported as "two star" significant (i.e. p < 0.01).
However, I think that with such a large sample size, the corresponding p-value should be less than 0.001 (i.e. "three star" significant).

  • Can p-values for this test be computed just from the Pearson correlation coefficient and the sample size?
  • If so, how can this be done in R?

Best Answer

Yes, it can be done using Fisher's r-to-z transformation. Other methods (e.g. the bootstrap) can have some advantages, but they require the original data. Under the null hypothesis that the true correlation is zero, z = 0.5 * log((1 + r)/(1 − r)) is approximately normally distributed with mean 0 and standard deviation 1/sqrt(n − 3), so r and n are all you need. In R (r is the sample correlation coefficient, n is the number of observations):

z   <- 0.5 * log((1 + r) / (1 - r))   # Fisher's r-to-z transformation (equivalently atanh(r))
zse <- 1 / sqrt(n - 3)                # standard error of z under the null hypothesis
2 * min(pnorm(z, sd = zse), pnorm(z, sd = zse, lower.tail = FALSE))  # two-sided p-value

See also this post on my blog.
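For comparison, here is a minimal sketch of the equivalent t-based computation (the test statistic behind cor.test for a Pearson correlation under H0: ρ = 0), which also needs nothing beyond r and n; the numbers below are the ones from the question and are used purely for illustration:

r <- 0.754
n <- 878
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)            # t statistic with n - 2 degrees of freedom
2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)  # two-sided p-value, far below .001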

That said, whether it is .01 or .001 doesn't matter that much. As you said, this is mostly a function of sample size and you already know that the sample size is large. The logical conclusion is that you probably don't even need a test at all (especially not a test of the so-called ‘nil’ hypothesis that the correlation is 0). With N = 878, you can be quite confident in the precision of the estimate and focus on interpreting it directly (i.e. is .75 large in your field?).
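As an illustration of that last point, here is a rough sketch of the 95% confidence interval for the correlation obtained by back-transforming the Fisher z interval (again using r = 0.754 and n = 878 from the question):

r <- 0.754; n <- 878
z   <- atanh(r)                            # base R's Fisher transform
zse <- 1 / sqrt(n - 3)
tanh(z + c(-1, 1) * qnorm(0.975) * zse)    # approximate 95% CI for rho, roughly 0.72 to 0.78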

Formally, however, when you do a statistical test in the Neyman-Pearson framework, you need to specify the error level in advance. So, if the results of the test really matter and the study was planned with .01 as the threshold, it only makes sense to report p < .01, and you should not opportunistically make it p < .001 based on the obtained p-value. This type of undisclosed flexibility is also one of the main reasons behind the criticism of the little stars and, more generally, of the way null-hypothesis significance testing is practiced in social science.

See also Meehl, P.E. (1978). Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress of soft psychology. Journal of Consulting and Clinical Psychology, 46 (4), 806-834. (The title contains a reference to these “stars” but the content is a much broader discussion of the role of significance testing.)