There are a number of options.
k-medoids clustering
First, you could try partitioning around medoids (PAM) instead of k-means clustering. PAM is more robust because it minimizes the sum of dissimilarities to actual data points (the medoids) rather than squared distances to centroids, so outliers pull the clusters around far less, and it can give better results. Van der Laan reworked the algorithm; if you're going to implement it yourself, his article is worth a read.
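The swap step that makes PAM tick is easy to sketch. Below is a deliberately naive Python illustration (the function name, random initialisation, and greedy swap loop are my own simplifications, not the build/swap phases of R's `pam`); note that it only ever touches a dissimilarity matrix, which is why PAM works with any metric:

```python
import numpy as np

def pam(dist, k, max_iter=100, seed=0):
    """Naive PAM on a precomputed dissimilarity matrix: start from
    random medoids, then greedily swap a medoid with a non-medoid
    whenever the swap lowers the total cost."""
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    medoids = list(rng.choice(n, size=k, replace=False))

    def cost(meds):
        # every point contributes its dissimilarity to its nearest medoid
        return dist[:, meds].min(axis=1).sum()

    best = cost(medoids)
    for _ in range(max_iter):
        improved = False
        for i in range(k):
            for h in range(n):
                if h in medoids:
                    continue
                cand = medoids[:i] + [h] + medoids[i + 1:]
                c = cost(cand)
                if c < best:
                    medoids, best, improved = cand, c, True
        if not improved:
            break
    labels = dist[:, medoids].argmin(axis=1)
    return medoids, labels
```

Because only dissimilarities are needed, you can feed it (or the real `pam`) any precomputed distance, not just Euclidean.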
There is also a k-medoids variant aimed specifically at large datasets: CLARA (implemented in R as clara), which repeatedly runs PAM on subsamples and keeps the best set of medoids, so you never need the full pairwise dissimilarity matrix. It is described in chapter 3 of Finding Groups in Data: An Introduction to Cluster Analysis by Kaufman, L. and Rousseeuw, P.J. (1990).
hierarchical clustering
Instead of UPGMA, you could try some other hierarchical clustering options. First of all, when you use hierarchical clustering, be sure you choose the linkage method properly: it defines how the distance between a cluster and other observations or clusters is calculated. I mostly use Ward's method or complete linkage, but another option might be the better choice for you.
I don't know if you've tried it yet, but in phylogenetic applications the single linkage method or neighbour joining is often preferred over UPGMA. If you haven't tried it, give it a shot as well; it often gives remarkably good results.
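To see how much the linkage method matters, here is a small illustration in scipy (as a stand-in for R's hclust; the toy data and the cut into two clusters are my own example): the only thing that changes between runs is the `method` argument, and note that `"average"` is UPGMA.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# two well-separated groups of three points each
pts = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
d = pdist(pts)  # condensed pairwise distance matrix

for method in ("single", "complete", "average", "ward"):  # "average" = UPGMA
    Z = linkage(d, method=method)                  # build the dendrogram
    labels = fcluster(Z, t=2, criterion="maxclust")  # cut into 2 clusters
    print(method, labels)
```

On clean data like this all four methods agree; on noisy or chained data they can differ dramatically, which is exactly why the choice deserves attention.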
In R you can take a look at the cluster package, where these algorithms are implemented (see ?pam, ?clara, ...); ?hclust ships with base R's stats package. Check also the different algorithm options in ?kmeans ("Hartigan-Wong", "Lloyd", "MacQueen"). Sometimes choosing another algorithm can improve the clustering substantially.
EDIT: Just thought of something: if you work with graphs, nodes, and the like, you should take a look at the Markov clustering algorithm (MCL) as well. It is used, for example, to group sequences based on BLAST similarities, and performs incredibly well. It can do the clustering for you, or give you some ideas on how to solve the research problem you're focusing on. Even without knowing anything more about your problem, I'd say its results are definitely worth looking at. If I may say so, I still consider this method of Stijn van Dongen one of the nicest results in clustering I've ever encountered.
http://www.micans.org/mcl/
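For a feel of how MCL works: it alternates expansion (squaring a column-stochastic matrix, which lets flow spread through the graph) and inflation (an elementwise power followed by renormalisation, which boosts strong connections and prunes weak ones) until the matrix stops changing. A toy Python sketch, not van Dongen's optimised implementation (the parameter defaults and the cluster-extraction step are my own simplifications):

```python
import numpy as np

def mcl(adj, inflation=2.0, max_iter=200, tol=1e-9):
    """Minimal Markov Cluster sketch on an undirected adjacency matrix."""
    A = adj.astype(float) + np.eye(len(adj))  # self-loops stabilise the flow
    M = A / A.sum(axis=0)                     # make columns stochastic
    for _ in range(max_iter):
        prev = M
        M = M @ M                             # expansion: flow spreads
        M = M ** inflation                    # inflation: strong beats weak
        M = M / M.sum(axis=0)                 # renormalise columns
        if np.abs(M - prev).max() < tol:
            break
    # at convergence, each attractor's row lists the members of its cluster
    clusters = []
    for row in M:
        members = frozenset(np.flatnonzero(row > 1e-4))
        if members and members not in clusters:
            clusters.append(members)
    return clusters
```

On a graph made of two triangles joined by a single edge, this splits the nodes into the two triangles, which is the kind of density-based grouping that makes MCL work so well on BLAST similarity graphs.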
See the documentation of the pam function, which implements k-medoids: “In case of a dissimilarity matrix, x is typically the output of daisy or dist.”
And the documentation of daisy: “Gower's distance is chosen by metric "gower" or automatically if some columns of x are not numeric. Also known as Gower's coefficient (1971), expressed as a dissimilarity, this implies that a particular standardisation will be applied to each variable, and the ‘distance’ between two units is the sum of all the variable-specific distances; see the details section.”
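To make that concrete: with default unit weights, the variable-specific distances amount to a range-standardised absolute difference for numeric variables and a simple mismatch (0/1) for categorical ones, averaged over the variables. A hand-rolled sketch (the helper name and argument layout are mine, not daisy's interface):

```python
def gower(row_a, row_b, ranges):
    """Gower dissimilarity between two records with mixed variable types.
    `ranges` holds the observed range of each numeric variable,
    and None for categorical variables."""
    parts = []
    for a, b, r in zip(row_a, row_b, ranges):
        if r is None:                     # categorical: simple mismatch
            parts.append(0.0 if a == b else 1.0)
        else:                             # numeric: range-standardised diff
            parts.append(abs(a - b) / r)
    return sum(parts) / len(parts)        # average over variables
```

For example, `gower((1.0, "red"), (3.0, "blue"), ranges=(4.0, None))` gives (0.5 + 1.0) / 2 = 0.75. Feeding such dissimilarities to pam is what makes k-medoids usable on mixed-type data.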
The documentation of R is pretty good... use it.
Many, many algorithms are based on distances only:
Of course there are also a number of methods that need coordinates. In particular