Solved – decision-tree-like algorithm for unsupervised clustering

cart, clustering, machine-learning, r

I have a dataset that consists of 5 features: A, B, C, D, E, all of them numeric. Instead of doing density-based clustering, what I want to do is cluster the data in a decision-tree-like manner.

The approach I mean is something like this:

The algorithm might first divide the data into X initial clusters based on feature C, e.g. clusters with small, medium, large, and very large C values. Next, under each of the X cluster nodes, it further divides the data into Y clusters based on feature A, and so on until all the features have been used (see the rough sketch below).
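A rough sketch of the nested splitting I have in mind, in Python; the bin counts, labels, and quantile cut-points are made up purely for illustration:

```python
import numpy as np
import pandas as pd

# Toy data standing in for the five numeric features (values are made up).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(200, 5)), columns=list("ABCDE"))

# Level 1: split on feature C into quantile-based bins (small ... very large).
df["c_bin"] = pd.qcut(df["C"], q=4, labels=["small", "medium", "large", "very large"])

# Level 2: within each C-bin, split again on feature A; the real algorithm
# would keep going until every feature has been used once.
df["a_bin"] = df.groupby("c_bin", observed=True)["A"].transform(
    lambda s: pd.qcut(s, q=3, labels=["low", "mid", "high"]).astype(str)
)

print(df.groupby(["c_bin", "a_bin"], observed=True).size())
```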

The algorithm I described above is like a decision-tree algorithm, but I need it for unsupervised clustering rather than supervised classification.

My questions are the following:

  1. Do such algorithms already exist? What is the correct name for such an algorithm?
  2. Is there an R/Python package or library that implements this kind of algorithm?

Best Answer

You may want to consider the following approach:

  • Use any clustering algorithm that is adequate for your data
  • Treat the resulting clusters as classes
  • Train a decision tree on the data, using the cluster labels as the target

This lets you try different clustering algorithms while still getting a decision-tree approximation of each resulting clustering.
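A minimal sketch of that idea, assuming scikit-learn and k-means as the clustering step (any clustering algorithm could be substituted):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy data standing in for the 5 numeric features A..E.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 5))
feature_names = ["A", "B", "C", "D", "E"]

# 1. Use any clustering algorithm that suits the data (k-means here, purely as an example).
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)

# 2. Treat the cluster labels as classes and train a decision tree on them.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, labels)

# 3. The tree's splits give a decision-tree-like, rule-based description of the clusters.
print(export_text(tree, feature_names=feature_names))
```

The printed rules approximate the cluster boundaries with one-feature-at-a-time thresholds, which is exactly the decision-tree-like view of the clustering the question asks for; a deeper tree gives a closer (but less readable) approximation.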