Solved – Interpretation of Spearman’s rank correlation coefficient – beyond its significance

correlation spearman-rho

I calculated the Spearman's rank correlation coefficient for a given 2D dataset. I then tested its significance with a permutation test and obtained a p-value.

I have a problem interpreting the coefficient value. While I understand that the value of a Spearman's rank coefficient should not be mistaken for information about its significance, I still do not have a simple interpretation of the value itself. The significance test basically shows how likely the coefficient is to be larger than the observed one when the null hypothesis holds, but it says nothing about the observed value that it uses as a starting point.
Can, for instance, a value of 0.60 mean that there are 60% more ranked pairs in my data set following a monotonically increasing discrete function than otherwise?

Best Answer

The Spearman's rank c.c. is the Pearson's c.c. of the ranked variables; in turn, the Pearson's c.c. is defined as the mean of the product of the paired standardized scores $z(X_i)$, $z(Y_i)$:

\begin{equation} r(X,Y) = \frac{1}{n-1} \sum_i z(X_i)\, z(Y_i) \end{equation}

in which $n$ is the sample size and the standard scores

\begin{equation} z(X_i) = \frac{X_i - \bar{X}}{\mathrm{std}(X)} \end{equation}

\begin{equation} z(Y_i) = \frac{Y_i - \bar{Y}}{\mathrm{std}(Y)} \end{equation}

are computed on the ranked variables ($X_i$, $Y_i$). Squaring $r(X,Y)$ gives the coefficient of determination $r^2$, which can be read as the fraction of explained variance. So if my Spearman's rank c.c. is 0.6, I can deduce that the ranked variables share 36% of their variance.
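As a rough illustration of the definition above, here is a minimal Python sketch that ranks two variables, standardizes the ranks, and averages the products of the z-scores with the $n-1$ denominator. The helper names (`average_ranks`, `z_scores`, `spearman`) and the data are made up for this example, and giving tied values the average of their ranks is an assumption, since the text above does not say how ties are handled.

```python
from statistics import mean, stdev


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values starting at sorted position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def z_scores(values):
    """Standard scores using the sample standard deviation (n - 1 denominator)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def spearman(x, y):
    """Pearson correlation of the ranks: sum of z-score products over n - 1."""
    zx = z_scores(average_ranks(x))
    zy = z_scores(average_ranks(y))
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


# Made-up illustrative data, not taken from the question.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
rho = spearman(x, y)
# rho is 0.8 and rho ** 2 is 0.64, up to floating-point rounding.
print(rho, rho ** 2)
```

For this toy data the ranked variables would be said to share about 64% of their variance, following the same reading as the 0.6 and 36% example.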

From the first equation, and attempting a simpler explanation of $r(X,Y)$, I would say it is the average concordance of the z-score variations. For instance, suppose I repeat an experiment with a larger sample size $n$ and calculate $r(X,Y)$ for both the small sample and the larger one. Suppose that an increase in $n$ of $\sim 3$ comes with a decrease in $r(X,Y)$ of roughly 50%; this corresponds to a 50% decrease in the concordance of the standard scores. My interpretation would then be that the later dataset provides weaker evidence for the presence of a correlation in the data.