Solved – Optimal number of clusters using K-Prototypes method in R

clustering, k-means, r

I am trying to cluster a large dataset using the k-prototypes method. I can't use k-means because the data contains both categorical and numeric variables.
I have been using the "clustMixType" package and can create clusters as long as I specify the value of k myself.
What I want, though, is a way to find the optimal k, and I can't find anything about this online.

Best Answer

As far as I know, there is no generic optimal k.

It depends a lot on your dataset and your goal. A lower k yields coarser, less specific prototypes that generalize better, while a higher k fits the data more closely. There is always a trade-off.

One way to pick k is to plot the data and look for visible groups. Even then, you might want to try other values to see if they work better for your application.
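If plotting mixed-type data directly is awkward, a common heuristic is to plot the clustering cost against k instead and look for an "elbow" where the curve flattens. Below is a minimal sketch of that idea with clustMixType, assuming the fitted `kproto` object exposes its total within-cluster distance as `tot.withinss`. The toy data frame and its column names are made up for illustration, and the elbow is only a rough guide, not a definitive answer.

```r
library(clustMixType)

set.seed(42)

# Hypothetical mixed-type data: three loose groups, two numeric columns
# and one factor. Replace this with your own data frame.
m <- 100
draw_region <- function(p) {
  sample(c("north", "south", "west"), m, replace = TRUE, prob = p)
}
toy <- data.frame(
  income = c(rnorm(m, 20, 3), rnorm(m, 50, 3), rnorm(m, 80, 3)),
  age    = c(rnorm(m, 25, 2), rnorm(m, 45, 2), rnorm(m, 65, 2)),
  region = factor(c(draw_region(c(0.8, 0.1, 0.1)),
                    draw_region(c(0.1, 0.8, 0.1)),
                    draw_region(c(0.1, 0.1, 0.8))))
)

# Standardize the numeric columns so no single variable dominates.
num_cols <- sapply(toy, is.numeric)
toy[num_cols] <- lapply(toy[num_cols], function(v) as.numeric(scale(v)))

# Fit k-prototypes for each candidate k and record the total cost.
ks <- 2:6
cost <- sapply(ks, function(k) {
  fit <- kproto(toy, k = k, nstart = 5, verbose = FALSE)
  fit$tot.withinss
})

# The cost always tends to fall as k grows; look for the bend.
plot(ks, cost, type = "b",
     xlab = "k", ylab = "total within-cluster distance")
```

Using several random starts (`nstart`) makes the curve less noisy, since a single run can land in a poor local optimum. The cost keeps decreasing as k increases, so it cannot pick k on its own; it only shows where adding clusters stops paying off.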
