Solved – Difference between p value and r

correlation, linear, p-value, regression

In a linear regression, the correlation coefficient r varies between -1 and +1. If the p-value is below the significance level, we reject the null hypothesis, which here is that there is no linear relationship between the two variables. It is tempting to think that the association is strong when the correlation is close to -1 or +1 and weak when it is close to 0. However, I believe this is not the whole story, because the correlation is statistically significant only if the p-value is below the significance level. How can one intuitively explain the difference between the p-value and the r value (for example, a linear regression between two variables that yields r = 0.98 and p = 0.14)?

Best Answer

The $p$-value takes into account both the strength of the correlation $r$ and the number of samples. For example, if you had only two samples, you could always fit a line through them perfectly ($r = \pm 1$), but that would not be evidence of a relationship, because any two points lie on a line: two samples just aren't enough to tell you what's going on.

Or take your example with $r = 0.98, p = 0.14$: how many samples do you have? If you do not reach statistical significance even with a nearly perfect correlation, the sample is too small. No matter how strong the association, you would not be able to verify it, so you might as well never have bothered to collect the data.
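To make the dependence on sample size concrete, here is a minimal sketch that computes the two-sided $p$-value from $r$ and the number of samples $n$. It assumes the standard test for a Pearson correlation (roughly normal data, null hypothesis of zero correlation), under which $r$ has density proportional to $(1 - r^2)^{(n-4)/2}$; this is equivalent to the usual $t$-test with $n - 2$ degrees of freedom. The function names `correlation_p_value` and `_simpson` are made up for this illustration, and the integral is evaluated numerically so that only the standard library is needed. Under these assumptions, $r = 0.98$ with $n = 3$ gives a $p$-value of roughly 0.13, in the neighborhood of the value in the question.

```python
import math


def _simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def correlation_p_value(r, n):
    """Two-sided p-value for H0 'no linear relationship', given r and n.

    Assumes the standard Pearson test. With the substitution r = cos(phi),
    the null density of r becomes proportional to sin(phi)**(n - 3),
    which is smooth and easy to integrate.
    """
    if n < 3:
        raise ValueError("the test needs at least 3 samples")
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    k = n - 3

    def density(phi):
        return math.sin(phi) ** k

    tail = _simpson(density, 0.0, math.acos(abs(r)))   # one tail, |R| >= |r|
    whole = _simpson(density, 0.0, math.pi)            # normalizing constant
    return min(1.0, 2.0 * tail / whole)


if __name__ == "__main__":
    # Same strong correlation, growing sample size.
    for n in (3, 4, 5, 10):
        print(f"r = 0.98, n = {n:>3}: p = {correlation_p_value(0.98, n):.4g}")
    # Same weak correlation, small versus large sample.
    for n in (10, 100):
        print(f"r = 0.30, n = {n:>3}: p = {correlation_p_value(0.30, n):.4g}")
```

The first loop holds $r$ fixed and shows the $p$-value shrinking as $n$ grows; the second shows that even a weak correlation can be significant once the sample is large enough.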

To avoid something like this, one can determine how many samples one needs before even collecting the data, for example with a power analysis.
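As a rough illustration of such a calculation, the sketch below uses one common approximation based on the Fisher $z$-transformation, $n \approx \left( (z_{1-\alpha/2} + z_{\text{power}}) / \operatorname{atanh}(r) \right)^2 + 3$. The function name `samples_needed` and the defaults (5% significance level, 80% power) are assumptions chosen for the example, not something stated in the answer, and the approximation is least reliable for very small $n$.

```python
import math
from statistics import NormalDist


def samples_needed(r_expected, alpha=0.05, power=0.80):
    """Approximate n needed to detect a correlation of size r_expected.

    Uses the Fisher z approximation for a two-sided test; treat the
    result as a planning estimate, not an exact requirement.
    """
    if not 0.0 < abs(r_expected) < 1.0:
        raise ValueError("r_expected must be non-zero and strictly inside (-1, 1)")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    z_power = std_normal.inv_cdf(power)              # quantile for desired power
    effect = math.atanh(abs(r_expected))             # Fisher z of the target r
    return math.ceil(((z_alpha + z_power) / effect) ** 2 + 3)


if __name__ == "__main__":
    for r in (0.3, 0.5, 0.9):
        print(f"expected r = {r}: about {samples_needed(r)} samples")
```

The weaker the correlation you expect, the more samples you need to have a good chance of detecting it.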