I am trying to cluster the 24-month utilization behavior of customers using sklearn/K-means in Python. When I plot the customers by cluster in a 2-D space (Principal Components 1 and 2 of my 24-dimensional values), I see some non-circular shapes. There are also a few points that appear closer to a neighboring cluster than to the one they have been assigned to. This happens even though K-means converges (the algorithm stops before reaching max_iter). Can anyone explain:
- Can non-circular clusters be expected in a 2-D representation of K-means results?
- How can points in the plot appear closer to a neighboring cluster's centroid yet remain assigned to their own cluster, even though K-means has converged?
Best Answer
1) K-means always forms a Voronoi partition of the space: each cluster is the polyhedral cell of points nearest to its centroid. So it is normal that clusters are not circular.
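You can check the Voronoi property directly: after convergence, every point's label is simply the index of its nearest centroid. A minimal sketch with synthetic data (24 features to mimic your 24-month values; the blob count and k=4 are arbitrary choices, not taken from your data):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import pairwise_distances_argmin

# Synthetic stand-in for 24-month utilization vectors
X, _ = make_blobs(n_samples=300, centers=4, n_features=24, random_state=0)

km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)

# Index of the nearest centroid for each point, computed independently
nearest = pairwise_distances_argmin(X, km.cluster_centers_)

# Every point sits in its own centroid's Voronoi cell
print((nearest == km.labels_).all())  # True
```

Note that this nearest-centroid property holds in the full 24-dimensional space, not necessarily in a 2-D projection of it.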
2) K-means only converges to a local optimum, so a suboptimal final partition is possible: you will get different final centroids depending on the position of the initial ones. Also keep in mind that assignments are based on distances in the full 24-dimensional space, while your plot shows only the first two principal components, so a point can legitimately look closer to another cluster in the 2-D projection.
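A toy illustration of initialization dependence (hypothetical data, not your customer data): four points at the corners of a wide rectangle. A good initialization finds the optimal left/right split; a bad one converges immediately to a much worse top/bottom split, and both count as "converged".

```python
import numpy as np
from sklearn.cluster import KMeans

# Four points at the corners of a 10-by-1 rectangle; the global optimum
# pairs the two left points together and the two right points together.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])

# Initial centroids between the left pair and the right pair
good = KMeans(n_clusters=2, init=np.array([[0.0, 0.5], [10.0, 0.5]]),
              n_init=1).fit(X)
# Initial centroids between the top pair and the bottom pair:
# the assignment never changes, so K-means converges right away
bad = KMeans(n_clusters=2, init=np.array([[5.0, 0.0], [5.0, 1.0]]),
             n_init=1).fit(X)

print(good.inertia_)  # 1.0   (left/right split, the global optimum)
print(bad.inertia_)   # 100.0 (top/bottom split, a converged local optimum)
```

In practice, sklearn mitigates this with the default `init='k-means++'` and by running `n_init` restarts and keeping the solution with the lowest inertia.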