I am researching cluster analysis with variables that are a mix of categorical and continuous, for which I have read that Gower's similarity coefficient is a good proximity measure. I have also read that Gower's coefficient is generally not compatible with Ward's method, so I plan to cluster initially with average linkage. To check the cluster structure (for content validity purposes), I would also like to compare it against another clustering method, specifically k-means, using the number of clusters and the initial centers obtained from the average linkage solution. Is Gower's coefficient of similarity a compatible proximity measure for the k-means method?
Solved – In cluster analysis, can you use Gower’s coefficient of similarity with a k-means clustering method
Tags: clustering, gower-similarity, k-means, mixed type data
Best Answer
K-means is really only sensible for squared Euclidean distance.
The two steps of the algorithm (assigning points to the nearest center, then recomputing the centers) must optimize the same objective function for the algorithm to be guaranteed to converge.
Recomputing the mean minimizes the within-cluster sum of squares (the mean is the least squares estimator!). Therefore, the assignment step must use a distance that optimizes the same objective, unless you also compute the cluster center differently.
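To make the least squares point concrete, here is a minimal sketch using NumPy on made-up data (the sample, the seed, and the helper names `sum_squared_euclidean` and `sum_manhattan` are illustrative assumptions, not part of the original answer). It checks that the mean is the best center for the sum of squared Euclidean distances, but not for a different distance such as Manhattan, where the coordinate-wise median does at least as well.

```python
import numpy as np

# Hypothetical skewed 2-D sample standing in for the points of one cluster.
rng = np.random.default_rng(0)
points = rng.exponential(scale=1.0, size=(200, 2))


def sum_squared_euclidean(center, pts):
    """Sum of squared Euclidean distances from pts to center (k-means objective)."""
    return float(((pts - center) ** 2).sum())


def sum_manhattan(center, pts):
    """Sum of Manhattan (L1) distances from pts to center."""
    return float(np.abs(pts - center).sum())


mean = points.mean(axis=0)
median = np.median(points, axis=0)

# Under squared Euclidean distance the mean is never worse than the median.
print(sum_squared_euclidean(mean, points) <= sum_squared_euclidean(median, points))

# Under Manhattan distance the roles flip: the median is never worse than the mean.
print(sum_manhattan(median, points) <= sum_manhattan(mean, points))

# Moving the center away from the mean can only increase the sum of squares.
perturbed = [mean + rng.normal(scale=0.5, size=2) for _ in range(100)]
no_better_center = all(
    sum_squared_euclidean(c, points) >= sum_squared_euclidean(mean, points)
    for c in perturbed
)
print(no_better_center)
```

If the assignment step used Manhattan distance while the update step still took the mean, the two steps would be pulling toward different optima, which is the mismatch described above.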
And last but not least, using Gower somewhat implies that you have categorical attributes. How would you compute a mean or centroid for those in the first place?
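As a rough illustration of why this matters, the sketch below computes a Gower dissimilarity matrix by hand with NumPy. It assumes equal weights, no missing values, range-normalized absolute differences for numeric columns, and simple matching for categorical columns; the three records and the function name `gower_distance` are made up for the example. Note that the result is defined only between pairs of records: nothing here produces a "mean" row for the categorical columns.

```python
import numpy as np

# Hypothetical mixed-type records: numeric columns (age, income) and
# categorical columns (colour, smoker).
numeric = np.array([[25.0, 30000.0],
                    [40.0, 52000.0],
                    [33.0, 41000.0]])
categorical = np.array([["red", "yes"],
                        ["blue", "no"],
                        ["red", "no"]])


def gower_distance(num, cat):
    """Pairwise Gower dissimilarity (1 - similarity), equal weights, no missing data."""
    ranges = num.max(axis=0) - num.min(axis=0)
    ranges[ranges == 0] = 1.0  # avoid division by zero for constant columns
    # Numeric contribution: absolute difference scaled by the column range.
    num_part = np.abs(num[:, None, :] - num[None, :, :]) / ranges
    # Categorical contribution: 0 if the categories match, 1 otherwise.
    cat_part = (cat[:, None, :] != cat[None, :, :]).astype(float)
    n_features = num.shape[1] + cat.shape[1]
    return (num_part.sum(axis=2) + cat_part.sum(axis=2)) / n_features


D = gower_distance(numeric, categorical)
print(np.round(D, 3))
```

A matrix like `D` can be passed to methods that work purely from pairwise dissimilarities, such as the average linkage clustering mentioned in the question, whereas k-means needs coordinates it can average.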