Solved – How k-means computes cluster centroids differently for each distance metric

clustering, distance, euclidean, k-means

K-means computes cluster centroids differently for each distance metric, and I don't understand why the way the centroid is computed depends on the distance measure.

How is the centroid computed for Manhattan distance, and how does that differ from computing the centroid for Euclidean distance?

Best Answer

K-means does not work with arbitrary distances; it was originally formulated only for squared errors.

It is designed for squared Euclidean distance and, more generally, Bregman divergences: for these objectives the arithmetic mean of a cluster is exactly the point that minimizes the within-cluster cost, which is why the centroid update simply takes the mean.
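To see why the centroid formula is tied to the distance, here is a minimal numpy sketch (the data and names are just for illustration): within a single cluster, the arithmetic mean minimizes the total squared Euclidean distance, while the component-wise median minimizes the total Manhattan distance (the centre used by the k-medians variant).

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.normal(size=(200, 2))      # one hypothetical cluster of 2-D points

mean = points.mean(axis=0)              # k-means centroid (arithmetic mean)
median = np.median(points, axis=0)      # component-wise median (k-medians centre)

def total_squared_euclidean(c):
    # sum over points of the squared Euclidean distance to c
    return np.sum(((points - c) ** 2).sum(axis=1))

def total_manhattan(c):
    # sum over points of the Manhattan (L1) distance to c
    return np.abs(points - c).sum()

print(total_squared_euclidean(mean) <= total_squared_euclidean(median))  # True
print(total_manhattan(median) <= total_manhattan(mean))                  # True
```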

K-means does not minimize Euclidean distances! It will still run and find an okay-ish (not obviously wrong) solution, but not even a local optimum of that objective. So the simple answer is: don't rely on k-means for other distances. It is not something you can easily fix.
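To illustrate the "not even a local optimum" point, here is a rough sketch (toy data and names are my own, not from the answer): run a Lloyd-style loop that assigns points by Manhattan distance but keeps the standard mean update, then check whether a single median update with the same labels would improve the Manhattan objective. For fixed labels the component-wise median minimizes that objective, so whenever it strictly lowers the cost, the mean-based "k-means with Manhattan distance" solution was not a local optimum.

```python
import numpy as np

rng = np.random.default_rng(1)
# Two Laplace-distributed blobs where cluster means and medians differ
X = np.vstack([rng.laplace(loc=0.0, scale=1.0, size=(300, 2)),
               rng.laplace(loc=5.0, scale=1.0, size=(300, 2))])

def manhattan_cost(X, centres, labels):
    return np.sum(np.abs(X - centres[labels]))

def assign(X, centres):
    # label each point with its nearest centre under Manhattan distance
    d = np.abs(X[:, None, :] - centres[None, :, :]).sum(axis=2)
    return d.argmin(axis=1)

# Lloyd-style loop: Manhattan assignments, but the usual mean update
centres = X[[0, 300]].copy()            # one starting centre from each blob
for _ in range(20):
    labels = assign(X, centres)
    centres = np.array([X[labels == k].mean(axis=0) for k in range(2)])

labels = assign(X, centres)
cost_mean_centres = manhattan_cost(X, centres, labels)

# A single median update (same labels) can never increase the Manhattan cost;
# when it strictly lowers it, the converged solution above was not a local optimum.
median_centres = np.array([np.median(X[labels == k], axis=0) for k in range(2)])
cost_median_centres = manhattan_cost(X, median_centres, labels)
print(cost_mean_centres, cost_median_centres)
```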