R: how to transform a similarity matrix into a distance matrix for hierarchical clustering

Tags: clustering, hierarchical clustering, r

I am trying to cluster the nodes (C1, C2, C3…) of a graph using hclust, and my similarity metric is the number of links between nodes.

I have data like

c = matrix( c(0,1,3,1,0,5,3,5,0), nrow=3, ncol=3)

Basically this is a similarity matrix

    C1  C2  C3
C1  0   1   3
C2  1   0   5
C3  3   5   0

This is an undirected graph in which the similarity between C1 and C3 is 3 links. I need to transform this data into a suitable distance matrix of the form

    C1  C2
C2  1
C3  1/3   1/5

based on my similarity metric (the number of links between two nodes). How do I do this?

Best Answer

It looks like you just want your distances to be 1/c, the elementwise reciprocal of the similarities.

The print method for a dist object shows the lower triangle in the format you want, which suggests as.dist(1/c).

> as.dist(1/c)
          1         2
2 1.0000000          
3 0.3333333 0.2000000

Is that what you're after?

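The diagonal of 1/c is Inf, because 1/0 is Inf in R, but as.dist() reads only the lower triangle, so the result above is unaffected. If you want the diagonal distances to be zero in the full matrix, then you might replace 1/c there with (say) ifelse(c==0,0,1/c). It should still print the same. Be aware that this replacement also maps any off-diagonal pair with zero links to a distance of 0.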
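For completeness, here is a minimal sketch of the whole path from the similarity matrix to hclust. The variable names (sim, d_full, d, hc), the row and column labels, and the choice of average linkage are assumptions made for illustration; they are not part of the question. The sketch resets the diagonal with diag() rather than ifelse(), so that only the diagonal is touched. It also assumes every pair of nodes has at least one link: a pair with zero links would get an infinite distance and would need separate handling before clustering.

```r
# Similarity matrix from the question: entries are link counts between nodes.
# The dimnames are added here only so the output is labelled C1..C3.
sim <- matrix(c(0, 1, 3,
                1, 0, 5,
                3, 5, 0),
              nrow = 3, ncol = 3,
              dimnames = list(c("C1", "C2", "C3"), c("C1", "C2", "C3")))

# Reciprocal transform: more links means a smaller distance.
# The diagonal becomes Inf (1/0), so reset it to 0 explicitly.
d_full <- 1 / sim
diag(d_full) <- 0

# as.dist() keeps only the lower triangle of the full matrix.
d <- as.dist(d_full)
print(d)

# Average linkage is an arbitrary choice for this illustration.
hc <- hclust(d, method = "average")
print(hc$merge)   # merge order of the nodes
print(hc$height)  # distance at which each merge happens
```

With this data, the first merge joins C2 and C3, the pair with the most links and therefore the smallest distance (1/5). plot(hc) would draw the corresponding dendrogram.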
