Solved – Testing the significance between two correlations in Python.

Tags: bootstrap, correlation, pearson-r, python

I've seen some similar articles using R, but I'm not sure how to implement this test in Python.

Suppose I have two datasets, A and B (both contain 35 datapoints), and they predict some time series of C (also 35 datapoints).

A's correlation with C is r=0.27, and B's correlation with C is r=0.34. In terms of r squared, this suggests that B explains about 4-5% more variance than A. How can I test whether A and B differ significantly in how well they predict C?

I'm guessing some kind of bootstrap may work, to see if the 5-95% tails of the distributions overlap, but I'm not sure how to do this.

Best Answer

Fortunately, Philipp Singer has implemented some Python functions for calculating the statistical significance of differences between two dependent or independent Pearson correlation coefficients. Check out the CorrelationStats repository on GitHub.

In your example, if A and B are drawn from independent samples, then their correlations with C are independent of each other. You can therefore compare the two Pearson r values by applying the Fisher z-transformation, z = arctanh(r). The transformed values are approximately normally distributed with standard error 1/sqrt(n - 3), so the difference between the two can be tested with an ordinary z-test.
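As a rough illustration of the independent case, here is a minimal sketch of the Fisher z-test using only the standard library. The function name independent_corr_test is made up for this example and is not the CorrelationStats API; the inputs are the r values and sample sizes from the question.

```python
import math

def independent_corr_test(r1, n1, r2, n2):
    """Two-sided z-test for the difference between two independent Pearson r values.

    Hypothetical helper for illustration; assumes the two correlations
    come from independent samples of sizes n1 and n2.
    """
    # Fisher z-transformation: z = arctanh(r)
    z1 = math.atanh(r1)
    z2 = math.atanh(r2)
    # Standard error of the difference between the two transformed values
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    # Two-sided p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Values from the question: r = 0.27 and r = 0.34, 35 datapoints each
z_stat, p_value = independent_corr_test(0.27, 35, 0.34, 35)
print(f"z = {z_stat:.3f}, p = {p_value:.3f}")
```

With only 35 datapoints per series the standard error is large, so a difference of this size between the two r values is far from significant under this test.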

Conversely, if A, B and C are obtained from the same sample of the population, then you need to account for the fact that the two correlations are not independent: both involve C and both are estimated from the same observations. CorrelationStats also includes a couple of methods for comparing two dependent correlation coefficients.
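For the dependent case, the bootstrap idea from the question can be sketched as follows. This is not how CorrelationStats does it; it is a plain percentile bootstrap that resamples the same time points for A, B and C together, so the dependence between the two correlations is preserved. The function names and the synthetic data are assumptions made up for this example. Note also that resampling individual points treats the observations as exchangeable, which may be too optimistic for an autocorrelated time series.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def bootstrap_corr_diff(a, b, c, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for r(B, C) - r(A, C).

    Hypothetical helper: the same resampled indices are used for all
    three series in each replicate.
    """
    rng = random.Random(seed)
    n = len(c)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ra_, rb_, rc_ = ([s[i] for i in idx] for s in (a, b, c))
        # Skip the (very unlikely) degenerate resample with zero variance
        if min(len(set(ra_)), len(set(rb_)), len(set(rc_))) < 2:
            continue
        diffs.append(pearson_r(rb_, rc_) - pearson_r(ra_, rc_))
    diffs.sort()
    lo = diffs[int(alpha / 2 * n_boot)]
    hi = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Synthetic stand-in data (35 points each); replace with the real A, B, C
data_rng = random.Random(42)
C = [data_rng.gauss(0, 1) for _ in range(35)]
A = [0.3 * ci + data_rng.gauss(0, 1) for ci in C]
B = [0.4 * ci + data_rng.gauss(0, 1) for ci in C]

ci_low, ci_high = bootstrap_corr_diff(A, B, C)
print(f"95% bootstrap CI for r(B,C) - r(A,C): [{ci_low:.3f}, {ci_high:.3f}]")
```

If the resulting interval contains zero, the data give no clear evidence that B predicts C better than A. Looking at the interval for the difference is more appropriate than checking whether two separate intervals for the individual correlations overlap.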
