Different between k-center and max diameter clustering

circlesclusteringgeometry

Consider the clustering problem: Given a set of points, partition them into $k$ clusters such that the maximum diameter of all the clusters gets minimized. The definition of cluster diameter is given below:

  1. The diameter of a cluster is the diameter of the minimum enclosing circle of the points on that cluster. (k-center clustering)
  2. The diameter of a cluster is the maximum possible distance between any two points on that cluster. (max diameter clustering)

Based on the definition of diameter, the clustering problem can be seen as two different clustering problems. I am trying to understand whether these two problems are the same or there are differences between them.

Note that: for every finite set of points with geometric span (maximum possible distance between any pair of points) $d$ has an enclosing circle with a radius no greater than $d/\sqrt{3}$.

Best Answer

These two clustering rules do not necessarily produce the same results

Suppose you have the following four points and you want $k=2$ clusters:

  • $A: (33,0)$
  • $B: (-33,0)$
  • $C: (0,56)$
  • $D: (0,124)$

What would the two clusters be?

With method $1$ you would cluster $\{A,B\}$ with a circle of diameter $66$ centred at $(0,0)$ and $\{C,D\}$ with circle of diameter $68$ centred at $(0,90)$, making the maximum diameter $68$. You might see that you are not going to do better than this by considering circles centred for example at the point $(0,18)$ near the centre of the smallest circle containing $A,B,C$ though that centre is more than $35$ away from all four points making any circle containing at least one of them have a diameter over $70$

With method $2$ you would cluster $\{A,B,C\}$ since they are all $65$ or $66$ away from each other, leaving $\{D\}$ for the other cluster since $D$ is at least $68$ away from any of the others

Related Question