Solved – Difference between K-medoids and PAM

clusteringk medoids

I understood that PAM is just one kind of K-medoids algorithm. The difference is in new medoid selection (per iteration):

  • K-medoids selects object that is closest to the medoid as a next medoid

  • PAM tries out all of the objects in the cluster as a new medoid that will lead to lower SSE.

If I understood well, PAM gives better results, but takes up much more time. Is that so?

Which one is better and why?


Here is what confused me, this is a list of software that implements K-medoids, from Wikipedia

  • ELKI includes several k-means variants, including k-medoids and PAM.
  • Julia contains a k-medoid implementation in the Clustering package[5]
  • R includes in the "flexclust" package variants of k-means and in the "cluster" package.
  • Gap An embrional open source library on distance based clustering.
  • Java-ML. Includes a k-medoid implementation.

For example, it says that ELKI contains both variants, k-medoids and PAM?

And for example first look on K-medoids implementation in javaml looks like it finds the object closest to medoid and tries it out.

Best Answer

k-medoids is the problem specification. It may be a np-hard problem.

PAM is one algorithm to find a local minimum for the k-medoids problem. Maybe not the optimum, but faster than exhaustive search.

PAM is to k-medoids as Lloyd's algorithm is to k-means. Lloyd's algorithm is a fast heuristic to find a good solution to k-means, but it may fail to find the best.