Solved – Clustering Data Using Gower and K-Means

clustering, gower-similarity, k-means, machine learning

I am trying to cluster data that contains both categorical and continuous variables, and I have two questions:

  1. I plan to use the Gower distance to measure the similarities/dissimilarities between data points. Is that OK?

  2. Can I use K-Means to cluster mixed variables? If not, I will use Two-Step Clustering, but can Two-Step Clustering be performed in R? And if so, which hierarchical algorithm would I have to use?

Thanks

Best Answer

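K-means can only be used on data sets where you can compute the arithmetic mean, because every cluster center is the mean of its members. Categorical variables have no meaningful mean, and K-means works on the raw coordinates rather than on a precomputed distance matrix, so it cannot be combined with Gower distances.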

Use hierarchical clustering instead. It works directly on a distance matrix, so it can use Gower distances.
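
To make the suggestion concrete, here is a minimal sketch in plain Python of Gower distances fed into average-linkage hierarchical clustering. The toy records, the function names (`gower_distance`, `gower_matrix`, `average_linkage`) and the choice of average linkage are all assumptions made for illustration; the answer above does not prescribe a linkage method. In R, the usual route would be `daisy(..., metric = "gower")` from the `cluster` package followed by `hclust`, rather than hand-written code like this.

```python
from itertools import combinations


def gower_distance(a, b, is_numeric, ranges):
    """Gower dissimilarity between two records with mixed variable types."""
    total = 0.0
    for x, y, numeric, rng in zip(a, b, is_numeric, ranges):
        if numeric:
            # Range-normalised absolute difference, always in [0, 1].
            total += abs(x - y) / rng if rng > 0 else 0.0
        else:
            # Simple matching: 0 if the categories agree, 1 otherwise.
            total += 0.0 if x == y else 1.0
    return total / len(a)  # unweighted average over all variables


def gower_matrix(rows, is_numeric):
    """Full symmetric matrix of pairwise Gower dissimilarities."""
    ranges = []
    for j, numeric in enumerate(is_numeric):
        column = [row[j] for row in rows]
        ranges.append(max(column) - min(column) if numeric else 0.0)
    n = len(rows)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = gower_distance(rows[i], rows[j], is_numeric, ranges)
    return dist


def average_linkage(dist, n_clusters):
    """Naive agglomerative clustering on a precomputed distance matrix."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for p, q in combinations(range(len(clusters)), 2):
            pairs = [(i, j) for i in clusters[p] for j in clusters[q]]
            avg = sum(dist[i][j] for i, j in pairs) / len(pairs)
            if best is None or avg < best[0]:
                best = (avg, p, q)
        _, p, q = best
        # Merge the two closest clusters (p < q, so deleting q is safe).
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


# Hypothetical toy data: (age, income, occupation, housing).
rows = [
    (25, 30000.0, "student", "rent"),
    (27, 32000.0, "student", "rent"),
    (52, 90000.0, "manager", "own"),
    (55, 95000.0, "manager", "own"),
]
is_numeric = [True, True, False, False]

dist = gower_matrix(rows, is_numeric)
clusters = average_linkage(dist, n_clusters=2)
print(clusters)
```

Note that the clustering step only ever reads `dist`, never a mean of the records, which is why categorical columns cause no trouble here. The naive merge loop is fine for a handful of rows but would be far too slow for real data sets.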