I'm a bit confused with concept of K-medoids.

It seems that original algorithm (PAM) describes that swap step should be performed by swaping only **one** of the medoids with one non-medoid point from the **whole dataset** (not only in cluster of that medoid), which will lead to smallest error in that moment.

First of all, is my understanding correct?

My original idea (which I implemented) swaps **all** of the medoids in one iteration, each medoid can be replaced by a point from **its cluster**, that leads to smallest error in its cluster. Is this a bad approach?

I was mislead by K-means, which calculates its centroid based only on the points from its cluster, but now I'm not sure if the same logic has sense for K-medoids.

Also, just to add, I implemented K-means++ like initialization, to optimize the procedure.

## Best Answer

K-Medoids is the problem of finding a set of K representative objects of the dataset such to minimize the total dissimilarity of each object from its nearest representer.

The PAM is an approximate solution to this problem that falls in the category of local search algorithms. It is made of two phases:

the BUILD step in which an initial solution is built. It starts by selecting as the first medoid the object with the lowest mean dissimilarity w.r.t. the whole dataset (i.e., the most central object). Iteratively medoids are selected as the objects that further minimize the overall dissimilarity of each object from its nearest medoid.

the SWAP phase in which given the current solution (the set of k medoids) all the neighbour solutions are evaluated. A neighbour solution is constructed by swapping one of the current medoids with one of the non-selected objects. Thus the size of the neighborhood of a given solution is $(n-k)*k$ where $n$ is the total number of objecs of the dataset and $k$ is the number of wanted medoids.

Now, to answer your questions:

It may be better or worse than PAM, you may compare them evaluating:

If memory serves me right there should also be a paper around about a k-means like k-medoids algorithm.