Solved – Cluster with distance threshold in R

clustering, distance, r, threshold

I'd like to get clusters with a maximum inner distance threshold.

Currently I use hc <- hclust(d) followed by cutree(hc, numofclasses).

But I would like to use something like this in Python:

>>> cl = HierarchicalClustering(data, lambda x,y: abs(x-y))
>>> cl.getlevel(10)     # get clusters of items closer than 10

How can I get this in R?

Best Answer

You can use the h argument of cutree() instead of the number of clusters. It cuts the dendrogram at the given "height" and returns the cluster membership of each element. I will only sketch the code, because you did not provide data.

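d <- dist(data)            ## hclust() expects a dissimilarity object; skip this if data already is one
hc <- hclust(d)            ## complete linkage by default
plot(hc)                   ## see the dendrogram
abline(h = 10, lty = 2)    ## mark the cut height
hc2 <- cutree(hc, h = 10)  ## cluster membership for each observation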

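The height at which two clusters merge is the distance between them under the chosen linkage. With the default complete linkage of hclust(), that is the largest pairwise distance between their members, so cutting at h = 10 yields clusters whose members are all within 10 of each other. Other linkage methods, such as single or average, define the height differently, so the result may not match your Python implementation one-to-one.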
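
Since the question came without data, here is a minimal runnable sketch on made-up 1-D values (the vector x and the helper name max_inner are assumptions for illustration only). It cuts a complete-linkage tree at h = 10 and then checks the largest pairwise distance inside each resulting cluster:

```r
## Toy 1-D data, invented for this example
x <- c(1, 3, 8, 20, 24, 29, 50, 55, 100)
names(x) <- paste0("p", seq_along(x))

d  <- dist(x)                          ## Euclidean distance; in 1-D this is abs(x - y)
hc <- hclust(d, method = "complete")   ## merge height = largest pairwise distance
cl <- cutree(hc, h = 10)               ## cut the tree at height 10

## Largest pairwise distance inside each cluster (0 for singletons)
max_inner <- tapply(seq_along(x), cl, function(i) {
  max(c(0, as.vector(dist(x[i]))))
})

print(split(x, cl))   ## members of each cluster
print(max_inner)      ## should not exceed 10 under complete linkage
```

With method = "single" the same cut would instead group points that are connected through chains of neighbours closer than 10, so a cluster could span more than 10 overall. Which of the two matches getlevel(10) depends on the linkage the Python library uses.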