I used three methods (M1, M2 and M3) to generate rankings, which are stored in the result data frame below.
result <- structure(list(
  n = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
        20, 21, 22, 23, 24, 25, 26, 27, 28, 29),
  M1 = c(29L, 1L, 28L, 27L, 25L, 26L, 24L, 20L, 21L, 22L, 23L, 15L, 12L,
         17L, 18L, 19L, 16L, 13L, 14L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 4L,
         2L, 3L),
  M2 = c(1, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 15, 12, 19, 18, 17,
         16, 14, 13, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2),
  M3 = c(1L, 29L, 28L, 27L, 25L, 26L, 24L, 20L, 21L, 22L, 23L, 15L, 12L,
         17L, 18L, 19L, 16L, 13L, 14L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 4L,
         2L, 3L)
), class = "data.frame", row.names = c(NA, -29L))
> result
    n M1 M2 M3
1   1 29  1  1
2   2  1 29 29
3   3 28 28 28
4   4 27 27 27
5   5 25 26 25
6   6 26 25 26
7   7 24 24 24
8   8 20 23 20
9   9 21 22 21
10 10 22 21 22
11 11 23 20 23
12 12 15 15 15
13 13 12 12 12
14 14 17 19 17
15 15 18 18 18
16 16 19 17 19
17 17 16 16 16
18 18 13 14 13
19 19 14 13 14
20 20  5 11  5
21 21  6 10  6
22 22  7  9  7
23 23  8  8  8
24 24  9  7  9
25 25 10  6 10
26 26 11  5 11
27 27  4  4  4
28 28  2  3  2
29 29  3  2  3
Now I would like to compute Spearman's rank correlation for the data above. Spearman's rank correlation coefficient between the $k$th and $i$th methods is calculated by the following equation:
$$\rho_{ki} = 1 - \frac{6\sum_{j=1}^{n} d_j^2}{n(n^2-1)},$$
where $n$ is the number of alternatives and $d_j$ is the difference between the ranks that the two methods assign to the $j$th alternative.
Can you help me with this? Without using the cor function, would it look like this?
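library(dplyr)  # provides %>% and mutate()

# Pairwise rank differences between the methods
dif <- result %>%
  mutate(D1 = M1 - M2, D2 = M1 - M3, D3 = M2 - M3)
d <- dif$D1

# Square each difference before summing: sum(d^2), not sum(d)^2
rho <- function(d) {
  n <- length(d)
  1 - 6 * sum(d^2) / (n * (n^2 - 1))
}
rho(d)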
Best Answer
Since you have already produced the ranks, you can take the Pearson correlation of these rank-transformed data to obtain the Spearman correlation. Using only very basic R functions, which seems to be what you want, you could do:
with(result, sum((M1 - mean(M1)) * (M2 - mean(M2))) / (length(M1) - 1) / (sd(M1) * sd(M2)))
That is, you are using the obvious estimator for the definition
$$\rho = \frac{\text{Cov}(X,Y)}{\sigma_X\,\sigma_Y}$$
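As a minimal sketch of that estimator, the same computation can be wrapped in a small helper. The name rank_cor is hypothetical (it is not part of base R), and the two vectors are copied by hand from the M1 and M2 columns of the question's data frame:

```r
# Ranks for methods M1 and M2, copied from the question's `result` data frame.
M1 <- c(29, 1, 28, 27, 25, 26, 24, 20, 21, 22, 23, 15, 12, 17, 18, 19,
        16, 13, 14, 5, 6, 7, 8, 9, 10, 11, 4, 2, 3)
M2 <- c(1, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 15, 12, 19, 18, 17,
        16, 14, 13, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2)

# Hypothetical helper: sample covariance divided by the product of the
# sample standard deviations, applied to two vectors that already hold ranks.
rank_cor <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) > 1)
  covariance <- sum((x - mean(x)) * (y - mean(y))) / (length(x) - 1)
  covariance / (sd(x) * sd(y))
}

rho_12 <- rank_cor(M1, M2)
print(rho_12)
```

Both the covariance and sd() use the n - 1 denominator, so the factor cancels and the result does not depend on that choice.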
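This will produce the same as
with(result, cor(M1, M2, method = "spearman"))
and also the same as
with(result, cor(M1, M2, method = "pearson"))
because M1 and M2 already hold ranks. The formula you posted gets into deep trouble when there are many ties, so it is only safe when each ranking is tie-free (as the permutations of 1 to 29 in your dataset are).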
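To illustrate the effect of ties, here is a small sketch on made-up data (the vectors a, b, x and y are not from the question, and rho_shortcut is a hypothetical name for the formula in the question). It compares the shortcut formula with cor() once on tie-free ranks and once on scores that contain ties:

```r
# Shortcut formula from the question: 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
rho_shortcut <- function(x, y) {
  d <- x - y
  n <- length(d)
  1 - 6 * sum(d^2) / (n * (n^2 - 1))
}

# Tie-free ranks (made up): the shortcut agrees with cor().
x <- c(3, 1, 4, 2, 5)
y <- c(2, 1, 5, 3, 4)
shortcut_clean <- rho_shortcut(x, y)
spearman_clean <- cor(x, y, method = "spearman")

# Scores with ties (made up): rank() assigns average ranks to tied values.
a <- c(10, 10, 20, 30, 30, 40)
b <- c(1, 2, 2, 3, 4, 4)
rank_a <- rank(a)
rank_b <- rank(b)
shortcut_tied <- rho_shortcut(rank_a, rank_b)
spearman_tied <- cor(a, b, method = "spearman")

# anyDuplicated() returns 0 when a vector has no repeated values,
# so a positive result flags ties in a ranking.
has_ties <- anyDuplicated(rank_a) > 0

print(c(shortcut_clean = shortcut_clean, spearman_clean = spearman_clean))
print(c(shortcut_tied = shortcut_tied, spearman_tied = spearman_tied))
print(has_ties)
```

Running anyDuplicated() on each ranking column is a quick way to decide whether the shortcut formula can be trusted for a given pair of methods.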