Solved – Are there any algorithms that give global optimum for K-Means

algorithms, clustering, k-means, machine learning

The objective of K-Means is to minimize the total squared distance from the observations to the centroid of their closest cluster. For an ideal solution we must find the true centroid of each cluster, but in the ordinary K-Means algorithm the centroids are initialized randomly and then iteratively refined until they converge to a local optimum (which sometimes yields poor results).
I wonder whether there is any method to find the true centroids that yield the global optimum (assuming time complexity is not a concern).

Best Answer

Yes. A trivial example would be brute force search. Simply try all possible assignments of data points to clusters, and pick the assignment that minimizes the cost function. For a slightly improved (but still grossly impractical) version of brute force search, do it in a way that respects the symmetries of the problem. For example, if the same points are assigned to a cluster, it doesn't matter whether this is called cluster "1" or cluster "2", so there's no need to test both. More efficient methods from combinatorial optimization can probably be used, like branch and bound.
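
To make the brute force idea concrete, here is a minimal Python sketch, assuming the usual sum-of-squared-distances cost and points given as tuples of floats. The function names (`kmeans_cost`, `canonical_labelings`, `brute_force_kmeans`) are made up for illustration. The symmetry is handled by only generating labelings in which cluster ids first appear in increasing order, so relabeled copies of the same partition are never tested twice. It is only usable for a handful of points.

```python
def kmeans_cost(points, labels, k):
    """Sum of squared distances from each point to the mean of its cluster."""
    cost = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue  # an unused cluster id contributes nothing
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        cost += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return cost


def canonical_labelings(n, k):
    """Yield every assignment of n points to at most k clusters, once each.

    A new cluster id may only be the smallest unused one, so two labelings
    that differ only in cluster names are never both produced.
    """
    def extend(prefix, used):
        if len(prefix) == n:
            yield list(prefix)
            return
        for label in range(min(used + 1, k)):
            prefix.append(label)
            yield from extend(prefix, max(used, label + 1))
            prefix.pop()

    yield from extend([], 0)


def brute_force_kmeans(points, k):
    """Return (labels, cost) of a globally optimal assignment by exhaustive search."""
    best_labels, best_cost = None, float("inf")
    for labels in canonical_labelings(len(points), k):
        cost = kmeans_cost(points, labels, k)
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels, best_cost


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    print(brute_force_kmeans(pts, 2))
```

The number of labelings still grows extremely fast with the number of points, which is why this is only a demonstration that a global optimum can be found in principle.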

However, computational complexity can't be ignored in practice. The k-means problem is NP-hard, so unless P = NP, no algorithm can find the global optimum in polynomial time in the general case.