Solved – Interpreting p-value significance from Pearson correlation

correlation, MATLAB, p-value, pearson-r

I used the MATLAB corr() function to compute the correlation for 236 samples. I selected the Pearson correlation, and the output returns r and a p-value. Two sets of samples returned different r and p-values.

May I know how to interpret the significance of correlation with the results below?

(a) The data has a strong negative correlation, and it's significant as the p-value is much less than 0.05 (p << 0.05).

r = -0.9383   
p = 6.7415e-110

(b) The data has a weak positive correlation, and it's not significant as the p-value > 0.05.

r = 0.06800  
p = 0.2981  

Am I right?

Best Answer

You have interpreted these results correctly according to the conventional textbook scheme.
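To make the textbook scheme concrete, here is a minimal sketch of where such a p-value comes from. It is written in Python rather than MATLAB, and it is not a description of how corr() is implemented internally. It assumes the usual test of $H_0: \rho = 0$ for bivariate normal data, under which $t = r\sqrt{(n-2)/(1-r^2)}$ follows a t distribution with $n-2$ degrees of freedom; equivalently, the two-sided p-value is twice the upper tail of the null density of $r$, which the code integrates numerically. The function names and the `steps` parameter are my own.

```python
import math


def t_statistic(r, n):
    """t statistic for testing H0: rho = 0 given a sample correlation r and size n."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def pearson_p_value(r, n, steps=20000):
    """Two-sided p-value for H0: rho = 0 (assumes bivariate normal data, n > 4).

    Under H0 the sample correlation has density
        f(u) = (1 - u^2)^((n - 4) / 2) / B(1/2, (n - 2) / 2)  on [-1, 1],
    so the p-value is 2 * integral of f from |r| to 1.
    """
    # log B(1/2, (n - 2) / 2), computed with log-gamma for numerical stability
    log_beta = (math.lgamma(0.5) + math.lgamma((n - 2) / 2)
                - math.lgamma((n - 1) / 2))
    expo = (n - 4) / 2

    def density(u):
        base = 1.0 - u * u
        if base <= 0.0:
            return 0.0
        return math.exp(expo * math.log(base) - log_beta)

    # Composite Simpson's rule on [|r|, 1]; `steps` must be even
    lo, hi = abs(r), 1.0
    h = (hi - lo) / steps
    total = density(lo) + density(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(lo + i * h)
    upper_tail = total * h / 3
    return min(1.0, 2 * upper_tail)


n = 236
for label, r in (("a", -0.9383), ("b", 0.0680)):
    p = pearson_p_value(r, n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"({label}) r = {r:+.4f}, t = {t_statistic(r, n):.2f}, "
          f"p = {p:.4g} -> {verdict} at alpha = 0.05")
```

Because r is only reported to four decimals, the computed p-values should land close to, but not exactly on, the values in the question. The point is that the p-value is a function of r and n alone, so with n = 236 even a modest r would cross the 0.05 line.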

Personally, I am often not a fan of the standard way of thinking about p-values. (Mounting soapbox...) Firstly, it's worth considering that there are several valid ways to look at p-values. Fisher thought of them as a continuous measure of evidence against the null hypothesis, while Neyman & Pearson used them as the hub around which the decision-making process turned. The most common way p-values seem to be used is not valid under either approach.

The Neyman-Pearson framework has much to recommend it, in my opinion, but it is primarily applicable in situations where there are theories that clearly posit two possible values: a null value (which could be $r_{null}=0$, but could be another number) and an alternative value ($r_{alt}$). In such a case, you could design your whole investigation around differentiating between those two values. This would entail specifying, among other things, $\alpha$ (the long-run type I error rate you're willing to live with), $\beta$ (the long-run type II error rate you're willing to live with), $N$ (the sample size), etc. In that context, it makes sense to me to say that something is 'significant' or 'not significant'. However, I believe those situations are the minority of cases.

For example, for your second sample, I would say that you cannot conclude with more than 70% confidence that the correlation is positive (that is, $1 - p \approx 0.70$). You will also want to examine your data and think about possible non-linearities and range restriction. (Stepping down from soapbox...)
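As a rough illustration of what designing a study around $\alpha$, $\beta$, and $N$ can look like, here is a small Python sketch. It uses the Fisher z-transformation, a standard large-sample approximation that I am bringing in for illustration; it is not something the question or MATLAB's corr() prescribes. The value $r_{alt} = 0.3$ in the usage lines is purely hypothetical, and the function names are my own.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def required_n(r_alt, r_null=0.0, alpha=0.05, beta=0.20):
    """Approximate sample size for a two-sided test of r_null against r_alt.

    Uses the Fisher z approximation: atanh(r) is roughly normal with
    standard error 1 / sqrt(N - 3).
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)  # critical value for type I error
    z_beta = _STD_NORMAL.inv_cdf(1 - beta)        # quantile for type II error
    effect = abs(math.atanh(r_alt) - math.atanh(r_null))
    return math.ceil(((z_alpha + z_beta) / effect) ** 2 + 3)


def approx_power(r_alt, n, r_null=0.0, alpha=0.05):
    """Approximate power (1 - beta) of the two-sided test at sample size n."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = abs(math.atanh(r_alt) - math.atanh(r_null)) * math.sqrt(n - 3)
    # Probability of landing in either rejection region under the alternative
    return _STD_NORMAL.cdf(shift - z_alpha) + _STD_NORMAL.cdf(-shift - z_alpha)


# Hypothetical design: null r = 0, alternative r = 0.3, alpha = 0.05, beta = 0.20
n_needed = required_n(0.3)
print(n_needed, round(approx_power(0.3, n_needed), 3))
```

The design choice here is that $\alpha$, $\beta$, and $r_{alt}$ are fixed before any data are collected, and $N$ follows from them. Only under that setup does a binary significant / not significant verdict carry the long-run error guarantees that Neyman and Pearson had in mind.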
