Solved – Which hierarchical clustering algorithm

clustering

I have a large $3400 \times 3400$ distance matrix.

I need to cluster the items hierarchically and then cut the tree to obtain flat clusters (as a partitional approach would give).

Which algorithm is best at recovering natural clusters in the data, given only the distance matrix?

How can I evaluate the result? I am planning to compute the average silhouette coefficient for cuts at various levels of the tree, and to use it to identify the 'natural' clusters.
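A minimal sketch of that evaluation plan, assuming SciPy and scikit-learn are available. The toy distance matrix (three synthetic blobs), the choice of average linkage, and the range of cluster counts are all illustrative assumptions standing in for the real $3400 \times 3400$ matrix.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform
from sklearn.metrics import silhouette_score

# Assumption: toy data standing in for the real matrix (three 2-D blobs, 20 points each).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
points = np.vstack([c + rng.normal(scale=0.5, size=(20, 2)) for c in centers])
diff = points[:, None, :] - points[None, :, :]
dist_matrix = np.sqrt((diff ** 2).sum(axis=-1))  # square, symmetric, zero diagonal

# Build the tree once; linkage() takes the condensed form of the matrix.
tree = linkage(squareform(dist_matrix), method="average")

# Cut the tree into k flat clusters for several k and score each cut.
scores = {}
for k in range(2, 9):
    labels = fcluster(tree, t=k, criterion="maxclust")
    scores[k] = silhouette_score(dist_matrix, labels, metric="precomputed")

# The cut with the highest average silhouette is the candidate 'natural' clustering.
best_k = max(scores, key=scores.get)
print(best_k, scores[best_k])
```

Scoring by number of clusters rather than by cut height is a design choice here; `fcluster` also accepts `criterion="distance"` if a height threshold is preferred.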

Thanks

Best Answer

Sounds like you need HAC (hierarchical agglomerative clustering). There are many variants, but the basic idea is the same: start with every item in its own singleton cluster and repeatedly merge the two "closest" clusters until only one remains. The variants (single, complete, and average linkage, among others) differ in how they define the distance between two clusters.
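As a rough illustration of the idea, here is a minimal sketch using SciPy's `linkage` and `fcluster` on a precomputed distance matrix. The toy data and the three linkage methods compared are assumptions chosen for the example, not a recommendation for your data.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Assumption: toy data standing in for the real matrix (three 2-D blobs, 20 points each).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
points = np.vstack([c + rng.normal(scale=0.5, size=(20, 2)) for c in centers])
diff = points[:, None, :] - points[None, :, :]
dist_matrix = np.sqrt((diff ** 2).sum(axis=-1))  # square, symmetric, zero diagonal

# linkage() expects the condensed (upper-triangle) form, not the square matrix.
condensed = squareform(dist_matrix)

# Each method defines "closest" differently:
#   single   -> smallest pairwise distance between the two clusters
#   complete -> largest pairwise distance
#   average  -> mean pairwise distance
labels_by_method = {}
for method in ("single", "complete", "average"):
    tree = linkage(condensed, method=method)  # full merge history (the dendrogram)
    labels_by_method[method] = fcluster(tree, t=3, criterion="maxclust")  # cut into at most 3 clusters

for method, labels in labels_by_method.items():
    print(method, np.bincount(labels)[1:])  # cluster sizes
```

On well-separated data the methods tend to agree; on real data they can differ substantially, so it is worth comparing the resulting trees rather than assuming one linkage is best.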

For more on HAC, see the Wikipedia entry.