Solved – How to cluster data according to Hamming distance

binary dataclusteringpython

I've a list of binary strings and I'd like to cluster them in Python, using Hamming distance as metric. I also would like to set the number of centroids (i.e. clusters) to create.

Which algorithm do you suggest?

Best Answer

The obvious first thing to try is hierarchical clustering.

Because it can use an arbitrary distance matrix.

Why don't you just try some? That's faster than asking on a web site...