[Math] Cluster algorithm with max range and variable point size

algorithmsclustering

Let's take the simplest example: I have a non-defined amount of points with x- and y-coordinates and want to cluster them under following premises:

  1. A point can only be part of one cluster.
  2. The clusters are restricted to a maximum (predefined) radius (not amount of points).
  3. Optimization: As few clusters as possible. As little noise as possible.

Does someone know an appropriate algorithm ?

Best Answer

Hierarchical clustering could be made to do this for you. http://en.wikipedia.org/wiki/Hierarchical_clustering

The idea is that when points are close enough together, you merge them into a cluster. Then you measure distances between clusters. If two clusters are close enough, you merge them. You just need to be careful as to which metric you use to measure the distance between clusters. In order to guarantee that you have a max pre-defined radius, you'll want the distance between two clusters to be the maximum distance between any two points in the cluster.

The way the algorithm is defined it keeps merging clusters until there is only one. Then you can back up to however many clusters you want. You can modify this so that you only merge as long as the distance is smaller than your pre-defined max radius. Once all the distances between clusters are too big, you stop merging.