Machine Learning – Combining Multiple Similarity Measures in Hyperspectral Image Clustering

clustering, machine learning, similarities, spatial

I have a hyperspectral image in which each pixel has 21 channels, so each pixel $p \in \mathbb{R}^{21}$. I want to cluster the pixels using a similarity defined by two different measures: one for how close the pixels are spatially, and the other for how similar the pixel values are.

Thus, if $X_1$ and $X_2$ are the locations of pixels $p_1$ and $p_2$, I have the squared distances
$$S_X = \|X_1-X_2\|^2_2$$
and
$$S_p = \|p_1-p_2\|^2_2.$$

I have seen these measures combined into a single measure like this:
$$ S= e^{-\frac{S_p}{\sigma^2_p}} \times \,\, e^{-\frac{S_X}{\sigma^2_X}}
$$
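For concreteness, here is a minimal NumPy sketch of this combined measure computed for all pixel pairs. The function name `combined_similarity`, the random toy data, and the values of `sigma_x` and `sigma_p` are assumptions made for illustration only; they are not part of the original setup.

```python
import numpy as np

def combined_similarity(X, P, sigma_x, sigma_p):
    """Pairwise S = exp(-S_p / sigma_p^2) * exp(-S_X / sigma_x^2).

    X: (n, 2) pixel locations; P: (n, d) pixel values (d = 21 in the question).
    Returns an (n, n) similarity matrix.
    """
    X = np.asarray(X, dtype=float)
    P = np.asarray(P, dtype=float)
    # Squared Euclidean distances between all pairs, via broadcasting.
    S_X = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    S_p = ((P[:, None, :] - P[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-S_p / sigma_p**2) * np.exp(-S_X / sigma_x**2)

# Toy data (hypothetical): 5 pixels with 2-D locations and 21 channels.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(5, 2))
P = rng.normal(size=(5, 21))
S = combined_similarity(X, P, sigma_x=3.0, sigma_p=5.0)
```

Because the exponents add, the product equals a single Gaussian kernel on the concatenated, rescaled features $(X/\sigma_X,\; p/\sigma_p)$. In this sketch, $\sigma_X$ and $\sigma_p$ therefore act only as relative weights on the spatial and spectral distances.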

My question: Is there a right way and a wrong way to combine measures like this, or, as long as it improves my clustering, can I combine the measures in any way that suits me?

My question is vaguely related to the question "Combining multiple similarity measures".

Best Answer

Anony-Mousse is right. There is no universally optimal similarity measure, and the benefit of each measure depends on the problem.

To evaluate the benefit of a similarity measure in a specific problem, I like to reduce it to a classification problem. Given your dataset of items, create a new dataset of item pairs. The concept (the label to predict) is whether the two items in a pair are similar, so you need some pairs for which you know this. Each similarity measure you have is a feature of the pair. Note that you are now in the good old classification framework. You can evaluate each similarity measure by computing the mutual information, accuracy, or your chosen metric with respect to the concept.
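The pair-based evaluation described above can be sketched roughly as follows. Everything here is an assumption made for illustration: the synthetic data, the hypothetical ground-truth `labels` that define which pairs count as similar, the quantile-binned `mutual_information` estimator, and the random `noise` feature used as a baseline. With real data you would need some labelled pairs instead.

```python
import numpy as np

def mutual_information(feature, concept, bins=10):
    """Plug-in estimate of I(feature; concept) in nats, using quantile bins."""
    edges = np.quantile(feature, np.linspace(0, 1, bins + 1)[1:-1])
    f = np.digitize(feature, edges)            # bin index in 0..bins-1
    joint = np.zeros((bins, 2))
    np.add.at(joint, (f, concept.astype(int)), 1)  # co-occurrence counts
    joint /= joint.sum()
    pf = joint.sum(axis=1, keepdims=True)      # marginal of the binned feature
    pc = joint.sum(axis=0, keepdims=True)      # marginal of the concept
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (pf @ pc)[nz])).sum())

# Synthetic items (hypothetical): two groups that differ in location and value.
rng = np.random.default_rng(0)
n = 60
labels = rng.integers(0, 2, size=n)
X = 5.0 * labels[:, None] + rng.normal(size=(n, 2))    # pixel locations
P = 1.0 * labels[:, None] + rng.normal(size=(n, 21))   # 21-channel values

# Dataset of item pairs: the concept is "both items belong to the same group".
i, j = np.triu_indices(n, k=1)
same = (labels[i] == labels[j]).astype(int)

# Each candidate measure is one feature of the pair.
features = {
    "S_X": ((X[i] - X[j]) ** 2).sum(axis=1),
    "S_p": ((P[i] - P[j]) ** 2).sum(axis=1),
    "noise": rng.normal(size=i.size),          # uninformative baseline
}
mi = {name: mutual_information(v, same) for name, v in features.items()}
for name, value in mi.items():
    print(f"{name}: {value:.3f} nats")
```

A measure whose score is close to that of the noise baseline carries little information about the concept. The same pair table could also be fed to any classifier to assess how well a combination of the measures works, for example by cross-validated accuracy.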