Solved – How does proxy::dist in R compute the cross-distance matrix between two matrices

distance, distance-functions, matrix, r

I am trying to understand how a cross-distance matrix between two matrices is computed. Can anyone help?

Maybe a simple example would help. Take two matrices, each holding nrow observations of ncol variables (for example, two input streams (col_1, col_2) for each of two time series (a, b) of different lengths):

a <- cbind(c(1,2,3,4,5,6,7,8,9,10),c(1,2,1,2,1,3,4,2,3,4))
b <- cbind(c(1,2,3,2,5,2,5,8),c(2,3,1,5,1,3,4,2))
proxy::dist(a,b)

This produces the following cross-distance matrix:

     [,1]     [,2]     [,3]     [,4]     [,5]     [,6]     [,7]     [,8]    
 [1,] 1.000000 2.236068 2.000000 4.123106 4.000000 2.236068 5.000000 7.071068
 [2,] 1.000000 1.000000 1.414214 3.000000 3.162278 1.000000 3.605551 6.000000
 [3,] 2.236068 2.236068 0.000000 4.123106 2.000000 2.236068 3.605551 5.099020
 [4,] 3.000000 2.236068 1.414214 3.605551 1.414214 2.236068 2.236068 4.000000
 [5,] 4.123106 3.605551 2.000000 5.000000 0.000000 3.605551 3.000000 3.162278
 [6,] 5.099020 4.000000 3.605551 4.472136 2.236068 4.000000 1.414214 2.236068
 [7,] 6.324555 5.099020 5.000000 5.099020 3.605551 5.099020 2.000000 2.236068
 [8,] 7.000000 6.082763 5.099020 6.708204 3.162278 6.082763 3.605551 0.000000
 [9,] 8.062258 7.000000 6.324555 7.280110 4.472136 7.000000 4.123106 1.414214
[10,] 9.219544 8.062258 7.615773 8.062258 5.830952 8.062258 5.000000 2.828427

Best Answer

If anyone is interested, it computes the cross-distance matrix (Euclidean distance, the default method) the following way:

$$ C(i,j) = \sqrt{\sum_{k=1}^{ncol}\big(a(i,k)-b(j,k)\big)^2} $$
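As a quick sanity check of the formula, a single entry can be computed by hand. The sketch below uses the a and b from the question; the index choice (i = 4, j = 2) and the name c_ij are only illustrative.

```r
a <- cbind(c(1,2,3,4,5,6,7,8,9,10), c(1,2,1,2,1,3,4,2,3,4))
b <- cbind(c(1,2,3,2,5,2,5,8), c(2,3,1,5,1,3,4,2))

# Row 4 of a is (4, 2) and row 2 of b is (2, 3)
i <- 4
j <- 2

# sqrt((4 - 2)^2 + (2 - 3)^2) = sqrt(5), about 2.236068
c_ij <- sqrt(sum((a[i, ] - b[j, ])^2))
print(c_ij)
```

This agrees with the value shown at row 4, column 2 of the matrix in the question.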

In R, the code would be something like this (not optimised, since it loops explicitly over every pair of rows instead of using vectorised operations):

a <- cbind(c(1,2,3,4,5,6,7,8,9,10),c(1,2,1,2,1,3,4,2,3,4))
b <- cbind(c(1,2,3,2,5,2,5,8),c(2,3,1,5,1,3,4,2))

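mydist <- function(a, b) {
  la <- nrow(a)
  lb <- nrow(b)
  # Result has one row per row of a and one column per row of b
  csscrrm <- matrix(nrow = la, ncol = lb)
  for (i in seq_len(la)) {
    for (j in seq_len(lb)) {
      # Euclidean distance between row i of a and row j of b
      csscrrm[i, j] <- sqrt(sum((a[i, ] - b[j, ])^2))
    }
  }
  return(csscrrm)
}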

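# Testing equality between results; compare with a small tolerance,
# since exact == on floating-point values can fail on rounding differences
mcrrmtx <- mydist(a,b)
pcrrmtx <- proxy::dist(a,b)
all(abs(mcrrmtx - pcrrmtx) < 1e-8)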
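For larger inputs the double loop can be replaced by matrix operations. The following is a minimal base-R sketch, not a description of how proxy::dist is implemented internally. It relies on the expansion |x - y|^2 = |x|^2 + |y|^2 - 2 x·y, and the function name mydist_vec is hypothetical.

```r
a <- cbind(c(1,2,3,4,5,6,7,8,9,10), c(1,2,1,2,1,3,4,2,3,4))
b <- cbind(c(1,2,3,2,5,2,5,8), c(2,3,1,5,1,3,4,2))

# Vectorised Euclidean cross-distance (hypothetical helper, base R only)
mydist_vec <- function(a, b) {
  # Squared norms of every row of a plus every row of b: nrow(a) x nrow(b)
  sq_norms <- outer(rowSums(a^2), rowSums(b^2), "+")
  # Subtract twice the dot products; tcrossprod(a, b) is a %*% t(b)
  sq_dist <- sq_norms - 2 * tcrossprod(a, b)
  # Clamp tiny negative values that rounding can produce before the sqrt
  sqrt(pmax(sq_dist, 0))
}

d <- mydist_vec(a, b)
dim(d)
```

The trade-off is numerical: the expansion can lose precision for rows that are very close together, so the explicit loop (or proxy::dist itself) is the safer reference when near-zero distances matter.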