Solved – use Manhattan distance on binary data for hierarchical clustering

Tags: binary data, clustering, distance, distance-functions, jaccard-similarity

I understand that Jaccard and Hamming distances are the classical choices for binary data, but is there anything specifically wrong with using Manhattan distance instead, together with complete linkage?

Best Answer

No, there's nothing inherently incorrect about doing that. In fact, for binary data coded as 0/1, the Manhattan distance and the Hamming distance are equivalent. In both cases, each variable contributes either 0 or 1 to the distance, and these contributions are summed over all variables. The Hamming distance explicitly sets each contribution to 0 for a match and 1 for a mismatch. The Manhattan distance sums the absolute differences, and because the underlying data is binary, the only possible absolute difference between two values is 0 (same value) or 1 (different values). The two distance matrices are therefore the same, and so is the resulting complete-linkage clustering.
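As a quick check of the equivalence, here is a minimal sketch in plain Python. The function names `manhattan` and `hamming` and the example rows are made up for illustration, and the data is assumed to be coded as 0/1.

```python
from itertools import combinations

def manhattan(a, b):
    # Sum of absolute differences over all variables.
    return sum(abs(x - y) for x, y in zip(a, b))

def hamming(a, b):
    # Count of variables on which the two observations disagree.
    return sum(x != y for x, y in zip(a, b))

# Hypothetical binary observations (one row per observation).
rows = [
    [1, 0, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 1, 0, 0, 1],
]

# Pairwise distances in the same order for both metrics.
pairs = list(combinations(range(len(rows)), 2))
manhattan_d = [manhattan(rows[i], rows[j]) for i, j in pairs]
hamming_d = [hamming(rows[i], rows[j]) for i, j in pairs]

print(manhattan_d)               # [2, 2, 5, 4, 3, 3]
print(manhattan_d == hamming_d)  # True
```

One caveat: some libraries report the Hamming distance as the proportion of mismatching variables rather than the count (SciPy's `scipy.spatial.distance.hamming` does this). That is the Manhattan distance divided by the number of variables. Rescaling every distance by the same positive constant does not change the order in which complete linkage merges clusters; only the dendrogram heights are rescaled.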