Correlation – Calculating p-value for Weighted Pearson Correlation Coefficient

correlation, p-value, pearson-r, statistical significance

I'm computing a weighted correlation coefficient, using the method described here.

I'd like to compute a p-value for the resulting r coefficient. How can I do this correctly, given that my r was computed using weights? The standard formula for the p-value of r (e.g., here) does not take weights into account, and I'm not sure how to account for them properly when computing the p-value.

Best Answer

The $P$-value reported for a correlation depends on the sample correlation, the sample size, and a bundle of assumptions not always checked (independence being, in my experience, least checked of all). But there is a difference between a crude $t$-based $P$-value, which tests a null hypothesis of zero correlation, and a more general $P$-value from Fisher's $z$ transformation, which can also test a nonzero null.
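For the unweighted case, the Fisher $z$ route can be sketched as follows. This is a minimal sketch under the usual large-sample normal approximation, with standard error $1/\sqrt{n-3}$ for $\operatorname{atanh}(r)$; the function name `fisher_z_pvalue` and the `rho0` parameter are illustrative, and nothing here accounts for weights.

```python
import math

def fisher_z_pvalue(r, n, rho0=0.0):
    """Two-sided P-value for H0: rho == rho0 via Fisher's z (unweighted data).

    Assumes n independent pairs and approximate bivariate normality.
    """
    if n <= 3:
        raise ValueError("Fisher's z needs n > 3")
    if not (-1.0 < r < 1.0 and -1.0 < rho0 < 1.0):
        raise ValueError("r and rho0 must lie strictly between -1 and 1")
    # atanh(r) is approximately normal with standard error 1 / sqrt(n - 3)
    z = (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - 3)
    # Two-sided normal tail area: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2.0))
```

With `rho0=0.0` this tests the same null as the $t$-based test; other values of `rho0` test a nonzero correlation.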

I don't think there is an answer to this independent of what the weights are. If weighting means that you are combining data from different subsamples, then the weights have implications for the sample size that should be used; at the same time, a correlation based on weighted combinations would not necessarily have the same sampling distribution as a correlation computed from the raw data.
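To illustrate how weights can change the sample size that matters, one common rough summary is Kish's effective sample size, $(\sum w_i)^2 / \sum w_i^2$. Whether it is the right $n$ to plug into a $P$-value formula depends entirely on what the weights mean, so treat this as an assumption rather than a recommendation; the function name is illustrative.

```python
def kish_effective_n(weights):
    """Kish's approximate effective sample size: (sum w)^2 / sum(w^2).

    Equals len(weights) when all weights are equal and shrinks as the
    weights become more unequal.
    """
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    if total_sq == 0:
        raise ValueError("at least one weight must be nonzero")
    return total * total / total_sq
```

For example, four equal weights give an effective size of 4, while putting all the weight on one observation gives 1.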

Even so, it is difficult to get agitated about this. If correlations have a point, it is that they measure strength of relationship; if you are seriously in doubt that yours is significantly different from zero, then it is arguable that your sample is simply too small, and being precise about the $P$-value is secondary.

It's likely that this misreads your problem, in which case you may have to give much more detail.

If getting really reliable $P$-values for weighted correlations is important to you, it is possible that you need to get a handle on it through simulation, including simulation of the weighting process if that is variable too.
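One simulation-style option is a permutation test, sketched below. It is a swapped-in illustration, not necessarily the method the question links to: it assumes the weights are fixed and known, that they stay attached to the `x` values, and that under the null the `y` values are exchangeable. If the weights are themselves random, the weighting process would need to be simulated inside the loop as well. All function names are hypothetical.

```python
import math
import random

def weighted_corr(x, y, w):
    """Weighted Pearson correlation using weighted means and (co)variances."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y)) / sw
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)) / sw
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y)) / sw
    return cov / math.sqrt(vx * vy)

def permutation_pvalue(x, y, w, n_perm=2000, seed=0):
    """Two-sided permutation P-value for H0: no association between x and y.

    Assumption: x and w are held fixed while y is shuffled.
    """
    rng = random.Random(seed)
    r_obs = abs(weighted_corr(x, y, w))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(weighted_corr(x, y_perm, w)) >= r_obs:
            hits += 1
    # Add-one correction so the estimate is never exactly zero
    return (hits + 1) / (n_perm + 1)
```

The smallest $P$-value this can report is $1/(\texttt{n\_perm}+1)$, so `n_perm` limits the resolution.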