How to mathematically prove that k-means clustering converges to minimum squared error

algorithms, clustering, k-means

I am using k-means clustering to analyze and obtain patterns in traffic data. This well-known algorithm performs 2 steps per iteration.

  1. Assign each object to the cluster closest to it, based on the Euclidean distance between the object and each cluster's center (obtained in the previous iteration).
  2. Update the centers of the clusters based on their new members (i.e., set each center to the mean of the objects now assigned to it).

I understand that k-means clustering overall aims to minimize the sum of squared errors. However, how can we mathematically prove that these two steps eventually converge to this goal? It sounds like a complicated optimization problem involving Lagrange multipliers, but maybe there is a more straightforward way to prove it?
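For concreteness, here is a minimal sketch of those two steps, assuming the usual Lloyd-style iteration and NumPy (the function and variable names are illustrative, and the demo data is synthetic). It also records the sum of squared errors after each iteration, which is exactly the quantity the answer below argues can only decrease:

```python
import numpy as np

def lloyd_kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd-style k-means; returns labels, centers, and the SSE after each iteration."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]   # pick k data points as initial centers
    sse_history = []
    for _ in range(n_iter):
        # Step 1: assign each object to the nearest center (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 2: move each center to the mean of its newly assigned members
        # (keep the old center if a cluster ends up empty).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        sse_history.append(((X - new_centers[labels]) ** 2).sum())
        if np.allclose(new_centers, centers):   # fixed point: nothing will change any more
            break
        centers = new_centers
    return labels, centers, sse_history

# Synthetic 2-D demo: three well-separated blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(m, 0.3, size=(30, 2)) for m in (0.0, 3.0, 6.0)])
labels, centers, history = lloyd_kmeans(X, k=3)
print(history)   # a non-increasing sequence of SSE values
```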

Best Answer

  1. There is no single "k-means algorithm". K-means is the problem. Algorithms for it include MacQueen, Lloyd/Forgy, Hartigan–Wong, and many more.

  2. Most of these algorithms (all but exhaustive search, I guess) will only find a local optimum, not the global optimum. They are heuristics. Fast heuristics... (A sketch of the standard convergence argument is given after this list.)

  3. It turns out the usual nearest-center heuristic can even miss a local optimum when cluster sizes vary a lot. Not by much, and only rarely, but sometimes a point should not be assigned to its nearest center: reassigning it to a farther one also shifts both cluster means and lowers the total sum of squared errors. A small numeric example follows this list.

  4. Global optimum search is IIRC NP-hard, so you do not want to use an exact algorithm, unless you assume that P = NP.
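
Regarding point 2, here is a sketch of the standard argument for why the Lloyd-style iteration terminates at a (local) optimum, assuming the usual sum-of-squared-errors objective:

$$ J(C,\mu) \;=\; \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2 $$

The assignment step minimizes $J$ over the partition $C = (C_1,\dots,C_k)$ with the centers $\mu$ held fixed, because each point simply picks its nearest center. The update step minimizes $J$ over $\mu$ with $C$ held fixed, because the mean is the minimizer of the sum of squared distances:

$$ \frac{1}{|C_j|}\sum_{x_i \in C_j} x_i \;=\; \arg\min_{m} \sum_{x_i \in C_j} \lVert x_i - m \rVert^2 $$

So $J$ never increases, it is bounded below by $0$, and there are only finitely many ways to partition $n$ points into $k$ clusters, so the algorithm reaches a fixed point after finitely many iterations. No Lagrange multipliers are needed, and the argument only gives a local optimum, not the global minimum of $J$.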
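
And for point 3, a small numeric illustration with made-up one-dimensional data (the specific values are illustrative). Lloyd's nearest-center rule has already converged on this configuration, yet reassigning one point to the farther center still lowers the total sum of squared errors, because the move also shifts both cluster means:

```python
import numpy as np

def sse(clusters):
    """Total within-cluster sum of squared errors for a list of 1-D clusters."""
    return sum(((np.asarray(c) - np.mean(c)) ** 2).sum() for c in clusters)

# A Lloyd fixed point: the centers are the cluster means (2.0 and 6.5) and every
# point already sits with its nearest center, so the nearest-center rule stops here.
A = [0.0, 4.0]          # small cluster, mean 2.0
B = [6.0, 7.0] * 4      # large cluster, mean 6.5
print(sse([A, B]))      # 10.0

# Move the point 4.0 to the *farther* center's cluster (|4 - 2| = 2 < |4 - 6.5| = 2.5).
# The total SSE still drops, because removing 4.0 also pulls A's mean down to 0.
A2, B2 = [0.0], [4.0] + B
print(sse([A2, B2]))    # about 7.56 < 10.0
```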