Solved – K-means and reproducibility

k-means

I'm trying to find the optimal K-means clustering for a set of elements. For a particular K, K-means repeated several times does not always converge to the same clustering due to the randomness in the initialization. This means that whatever internal performance statistic (a clustering criterion, such as Dunn's index) I'm using to choose k is going to be dependent somewhat on when I run the algorithm, unless I use a seed. To decide whether the clustering is "stable" for a particular k (making it possible for me to believe the performance statistic is representative), I calculate how often the resulting clusters for that k overlap from run-to-run:

$$ \frac{1}{n(r-1)} \sum_{j=2}^r \sum_{i=1}^n \frac{|C_{j-1}^{(i)} \cap C_{j}^{(i)}|}{|C_{j-1}^{(i)}|}$$

Where $n$ is the number of elements I'm clustering, $r$ is the number K-means runs (with fixed k), and $C_j^{(i)}$ is the cluster from the $j$th run containing the $i$th element.

However, this statistic is not from the literature, and I'd be curious to know how others have addressed this problem and if there are issues with my approach that haven't occurred to me. Thanks.

Best Answer

There are many statistics suitable for comparing the similarity of clusterings. Probably the most popular is the adjusted Rand index.

If you have multiple runs of k-means and the results have a high ARI (close to 1), then they are very similar. An ARI close to 0 would indicate very unstable results; but this is essentially impossible with k-means because it only generates convex clusterings which are bound to be rather similar (an ARI of 0 would arise on random cluster labels that do not take the data into account).

You may want to also look at the WCSS used by k-means. Results with a low ARI but a highly similar WCSS indicate that there a multiple equally good local minima. Then k may be too small. A high ARI similarity but very different WCSS must be caused by only a few points, probably outliers that disturb the algorithm?

Related Question