Solved – Linkage method for hierarchical clustering of binary data

clustering, hierarchical clustering, machine learning, stata, unsupervised learning

I need to cluster data points that are each represented as a binary vector, using hierarchical clustering.
I chose the Manhattan distance and am trying to figure out how to choose the "best" linkage method. I have heard that Ward's method should not be used because it relies on Euclidean distance. Is that true? I have not found any "real scientific" references. Are the complete or single linkage methods better? Can you point me to any articles or papers as references?

Thanks!

Best Answer

There is no universal "best". It's your choice.

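For example, complete linkage may be nice, because it means that at height h any two instances in the same cluster differ in at most h bits. (On binary vectors, the Manhattan distance is simply the number of differing bits.)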
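To make the complete-linkage guarantee concrete, here is a minimal sketch using SciPy. The data are randomly generated and the cut height `h = 6` is an arbitrary choice for illustration; neither comes from the question.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)          # made-up data, for illustration only
X = rng.integers(0, 2, size=(30, 12))   # 30 binary vectors of 12 bits each

# On 0/1 vectors, Manhattan ("cityblock") distance counts the differing bits.
d = pdist(X, metric="cityblock")
Z = linkage(d, method="complete")

h = 6                                   # hypothetical cut height
labels = fcluster(Z, t=h, criterion="distance")

# Largest pairwise distance found inside any single flat cluster.
D = squareform(d)
max_within = max(
    D[np.ix_(labels == c, labels == c)].max() for c in np.unique(labels)
)
print(max_within <= h)
```

Swapping `method="complete"` for `method="average"` or `method="single"` changes what the height means, so this particular check is only guaranteed to hold for complete linkage.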

Or you may want average linkage, so that at height h the average number of differing bits between instances of the two merged clusters is h.

Or you may want minimax linkage, so that at height h each cluster contains one object (a prototype) from which all other members differ in at most h bits.
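As far as I know, SciPy's `linkage` has no minimax method, so the sketch below only illustrates the quantity involved: the minimax radius of a set of points, which is the merge height when the set is the union of the two clusters being joined. The function name `minimax_radius` and the four example vectors are made up for illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist

def minimax_radius(points):
    """Return (prototype index, radius): the member whose farthest
    fellow member is closest, and that farthest distance in bits."""
    D = cdist(points, points, metric="cityblock")
    worst = D.max(axis=1)        # for each candidate, distance to its farthest member
    proto = int(worst.argmin())  # candidate with the smallest worst-case distance
    return proto, float(worst[proto])

# Hypothetical cluster of four 4-bit vectors.
cluster = np.array([
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 0, 0],
])
proto, radius = minimax_radius(cluster)
print(proto, radius)  # 0 1.0
```

Here the first vector is the prototype: every other member differs from it in exactly one bit, whereas any other choice of prototype would leave some member two bits away.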

There is no mathematical reason to prefer one over the others; they are all reasonable choices, and which one fits depends on which of these guarantees you want.