Solved – Reporting coefficient of determination using Spearman’s rho

correlation, effect-size, pearson-r, regression, spearman-rho

I have two non-normally distributed variables (positively skewed, exhibiting ceiling effects). I would like to calculate the correlation coefficient between these two variables. Because of the non-normal distributions, I used Spearman's rank-order correlation, which returns a correlation coefficient and a significance (p) value. My results (n=400) show a significant ($p = 8 \times 10^{-5}$) but weak correlation (Spearman's $\rho$ = .20). With Pearson's correlation, one could describe the strength of the correlation in terms of shared variance (the coefficient of determination, $R^2$; in my case $R^2$ = .04, i.e. 4%). For Spearman's, it doesn't seem meaningful to square the $\rho$ value, because Spearman's correlation is computed on the ranks of the data. What is the best way to talk about the effect size using Spearman's $\rho$?

Alternatively, following a discussion here (Pearson's or Spearman's correlation with non-normal data), I interpret the discussions as meaning that Pearson's correlation does not assume normality, but calculating p-values from the correlation coefficients does. Thus, I was wondering whether one could use Spearman to calculate the p-value of the correlation, and Pearson's to calculate the effect size, and thus continue to speak of shared variance between the two variables.

Best Answer

Pearson's r and Spearman's rho are both already effect size measures. Spearman's rho, for example, represents the degree of correlation in the data after the data have been converted to ranks. Thus, it already captures the strength of the relationship.

People often square a correlation coefficient because it has a nice verbal interpretation as the proportion of shared variance. That said, there's nothing stopping you from interpreting the size of the relationship in the metric of the unsquared correlation.

It does not seem to be customary to square Spearman's rho. That said, you could square it if you wanted to. It would then represent the proportion of shared variance in the two ranked variables.
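To make the "shared variance in the ranked variables" reading concrete, here is a minimal sketch that computes Spearman's rho as Pearson's r applied to the ranks and then squares it. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the small data set are made up for illustration; they are not the asker's data, and in practice you would probably use a statistics library rather than hand-rolled functions.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Plain Pearson product-moment correlation."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical illustrative data (not from the question).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 4, 3, 6, 5, 8, 8]

rho = spearman_rho(x, y)
# rho squared is the proportion of variance shared by the two *ranked* variables.
print(f"rho = {rho:.3f}, rho^2 = {rho ** 2:.3f}")
```

Because rho is just r on the ranks, squaring it is a legitimate coefficient of determination, but for the rank-transformed variables rather than the raw scores. With the values from the question, $\rho$ = .20 gives $\rho^2$ = .04, i.e. about 4% shared variance in the ranks.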

I wouldn't worry so much about normality and absolute precision on p-values. Think about whether Pearson's r or Spearman's rho better captures the association of interest. As you already mentioned, see the discussion here on the implications of non-normality for the choice between Pearson's r and Spearman's rho.