K-means is really *only* sensible for squared Euclidean distance.

The objective optimized by the two steps (assignment and mean update) must agree for the algorithm to be guaranteed to converge.

**Recomputing the mean minimizes the sum-of-squares objective** (the mean is the least-squares estimator!). Therefore the distance used in the assignment step *must* minimize that same objective, i.e. squared Euclidean distance, unless you also compute the centroid differently.
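A minimal sketch of this point (made-up numbers): the mean is the unique minimizer of the sum of squared deviations, which is exactly why the k-means update step recomputes it.

```python
# Sketch: the mean minimizes the sum of squared distances,
# so the k-means update step is optimal for that objective.
data = [2.0, 3.0, 7.0, 10.0]

def sse(center, points):
    """Sum of squared distances from `points` to `center`."""
    return sum((x - center) ** 2 for x in points)

mean = sum(data) / len(data)  # 5.5

# Any other candidate center gives a larger objective value.
for candidate in [4.0, 5.0, 6.0, 7.0]:
    assert sse(mean, data) <= sse(candidate, data)
```

Swap in a different distance (say, Manhattan) and the mean is no longer the optimal center; the median would be, which is why k-medians exists as a separate algorithm.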

And last but not least, using Gower's dissimilarity usually implies that you have categorical attributes. *How would you compute a mean/centroid there in the first place?*
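To see the problem, here is Gower's coefficient in miniature (attribute names and values are hypothetical): numeric attributes contribute a range-normalized absolute difference, categorical ones a 0/1 mismatch, averaged over attributes. The categorical part has no meaningful "mean".

```python
# Hedged sketch of Gower dissimilarity for mixed-type records.
def gower(a, b, ranges):
    """a, b: records as dicts; ranges: numeric attribute -> (min, max)."""
    total = 0.0
    for key in a:
        if key in ranges:  # numeric attribute: range-normalized difference
            lo, hi = ranges[key]
            total += abs(a[key] - b[key]) / (hi - lo)
        else:              # categorical attribute: simple mismatch indicator
            total += 0.0 if a[key] == b[key] else 1.0
    return total / len(a)

p = {"height": 180.0, "eye_color": "blue"}
q = {"height": 160.0, "eye_color": "brown"}
d = gower(p, q, ranges={"height": (150.0, 200.0)})  # (20/50 + 1) / 2 = 0.7
```

There is no average of `"blue"` and `"brown"`, so the mean-update step of k-means simply has no counterpart here; k-medoids (PAM), which picks an actual data point as the center, is the usual workaround.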

**It depends on what you mean by "did pretty well" and on the population.** For general adult populations in the developed world I would not expect this to work very well: heights and weights alone are not great at distinguishing the genders.

**The best and easiest way to assess the situation** is to make a scatterplot of height and weight, distinguishing the point symbols by gender. This one is from the (US) NHANES 2011-2012 data, where I have removed data for anyone younger than 18 years. Note the logarithmic scales, which render each point cloud approximately oval in shape. (You may guess which kind of symbol--solid red or open blue--corresponds to which gender.)

The substantial overlap between the clouds for the two genders (between 160 and 170 centimeters, approximately) shows that no cluster analysis based solely on height and weight could possibly do a very good job discriminating men from women. The partial lack of overlap, revealed by the cloud of blue above 180 cm and cloud of red below 150 cm, shows that a clustering result would nevertheless have some discriminating power. Whether this would be good enough depends on your objectives and standards for predictive accuracy.

If, in your dataset, the two clouds appear to have little or no overlap, then not only can you expect a cluster analysis (like K-means) to work well, you can already *see* where the cluster centers should be and where a dividing line ("linear discriminator") would approximately be located.

Here are two k-means solutions for these data: one based on the logarithms and another based on separately standardized heights and weights. The two clusters are distinguished by the lightness of the symbols.

(The number of cases shown in these plots is 90 fewer than the number reported in the first figure because of missing values, which should have been excluded from the first figure as well.)
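As a hedged sketch of the standardized-data procedure (synthetic points below, not the NHANES data), standardizing each variable and running Lloyd's algorithm for k = 2 looks like this:

```python
# Minimal Lloyd's-algorithm sketch for k = 2 on standardized data.
# The points are synthetic stand-ins; seeding with the two extreme
# points keeps the sketch deterministic.
import statistics

def standardize(column):
    mu, sd = statistics.mean(column), statistics.stdev(column)
    return [(x - mu) / sd for x in column]

def kmeans2(points, iters=20):
    centers = [min(points), max(points)]
    for _ in range(iters):
        clusters = [[], []]
        for p in points:
            d = [sum((a - c) ** 2 for a, c in zip(p, ctr)) for ctr in centers]
            clusters[d.index(min(d))].append(p)
        centers = [
            tuple(statistics.mean(col) for col in zip(*cl)) if cl else ctr
            for cl, ctr in zip(clusters, centers)
        ]
    return centers, clusters

heights = [150.0, 152.0, 155.0, 178.0, 180.0, 183.0]
weights = [52.0, 55.0, 58.0, 75.0, 80.0, 85.0]
points = list(zip(standardize(heights), standardize(weights)))
centers, clusters = kmeans2(points)  # two clusters of three points each
```

With real data you would use a library implementation with multiple random restarts rather than this deterministic toy seeding.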

Evidently in both cases the clusters, although associated with gender, fail to separate the two colors very well. The better-looking solution, based on the standardized data, yields these cross-tabulation statistics of cluster and gender:

```
          Cluster
Gender      1      2
Male     1951    786
Female    586   2202
```

29% of all males and 21% of all females are mis-classified.
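Those rates follow directly from the off-diagonal cells of the table, since cluster 1 is predominantly male and cluster 2 predominantly female:

```python
# Recomputing the quoted error rates from the cross-tabulation above.
males   = {"cluster1": 1951, "cluster2": 786}
females = {"cluster1": 586, "cluster2": 2202}

# The off-diagonal cells count the mis-classified cases.
male_error   = males["cluster2"] / (males["cluster1"] + males["cluster2"])
female_error = females["cluster1"] / (females["cluster1"] + females["cluster2"])

print(round(100 * male_error))    # 29
print(round(100 * female_error))  # 21
```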

## Best Answer

One way to assign a weight to a variable is by changing its scale. The trick works for the clustering algorithms you mention, viz. k-means, weighted-average linkage and average-linkage.
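A quick sketch of why scaling acts as a weight (made-up points): multiplying a variable by a factor w multiplies its contribution to the squared Euclidean distance by w², so a rescaled variable dominates, or recedes from, the clustering accordingly.

```python
# Sketch: rescaling variable 1 by w multiplies its contribution
# to the squared Euclidean distance by w**2.
def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

p, q = (1.0, 10.0), (3.0, 14.0)
w = 5.0  # hypothetical weight for the first variable

pw = (w * p[0], p[1])
qw = (w * q[0], q[1])

base = sq_dist(p, q)        # 2**2 + 4**2 = 20
weighted = sq_dist(pw, qw)  # 10**2 + 4**2 = 116
```

The same reasoning carries over to average-linkage and weighted-average-linkage clustering, since they are built from the same pairwise distances.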

Kaufman, Leonard, and Peter J. Rousseeuw. *Finding Groups in Data: An Introduction to Cluster Analysis* (2005), page 11:

Abrahamowicz, M. (1985), The use of non-numerical a priori information for measuring dissimilarities, paper presented at the Fourth European Meeting of the Psychometric Society and the Classification Societies, 2-5 July, Cambridge (UK).

Friedman, H. P., and Rubin, J. (1967), On some invariant criteria for grouping data, J. Amer. Statist. Assoc., 62, 1159-1178.

Hardy, A., and Rasson, J. P. (1982), Une nouvelle approche des problèmes de classification automatique, Statist. Anal. Données, 7, 41-56.