Is it possible to use Hellinger distance for environmental variables?

classification, clustering, distance-functions, euclidean

Here is the problem: Euclidean distance is not recommended for datasets with many zeros (such as species-by-site matrices), because of the risk of the abundance paradox (Orloci, 1978). Environmental distance (e.g., based on temperature and precipitation variables), on the other hand, is usually calculated with Euclidean distance. The two kinds of distance are therefore not easily comparable. Is it correct to use Hellinger distance on environmental variables (which are normally distributed)?

Best Answer

It is pretty easy to compare two dissimilarity matrices (assuming that is what you mean by compare?).

For example, you could ordinate the dissimilarity matrices separately and compare the ordinations with Procrustes rotation. Alternatively, there is co-inertia analysis, which extracts axes that maximise the covariance between the two data sets (cf. PCA, which extracts axes of maximal variance in a single data set), subject to the axes being orthogonal. Co-inertia is based on Euclidean distances, so you could apply the Hellinger transformation to the species data and leave the environmental data untransformed, or you might transform some of the environmental variables using, say, a log transformation.
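As a rough illustration of the first suggestion, here is a minimal Python sketch, not a definitive implementation. It assumes simulated data (the `species` and `env` arrays are made up), uses a plain PCA via SVD as the ordination, and standardises the environmental variables before ordinating them, which is my assumption rather than something stated above. The helper names `hellinger_transform` and `pca_scores` are hypothetical. The Hellinger transformation here is the usual ecological one: the square root of the row-wise relative abundances, so that Euclidean distance on the transformed data is the Hellinger distance.

```python
import numpy as np
from scipy.spatial import procrustes


def hellinger_transform(abundance):
    """Square root of row-wise relative abundances (sites in rows)."""
    abundance = np.asarray(abundance, dtype=float)
    row_totals = abundance.sum(axis=1, keepdims=True)
    # Avoid division by zero for sites with no individuals at all.
    row_totals[row_totals == 0] = 1.0
    return np.sqrt(abundance / row_totals)


def pca_scores(data, n_axes=2):
    """Site scores on the first n_axes principal components."""
    centered = data - data.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_axes] * s[:n_axes]


# Simulated example data (assumption: 20 sites, 8 species, 2 env variables).
rng = np.random.default_rng(0)
n_sites = 20
species = rng.poisson(lam=1.0, size=(n_sites, 8))
env = np.column_stack([
    rng.normal(15, 5, n_sites),     # "temperature"
    rng.normal(800, 200, n_sites),  # "precipitation"
])

# Standardise env variables so their different units do not dominate.
env_std = (env - env.mean(axis=0)) / env.std(axis=0, ddof=1)

species_scores = pca_scores(hellinger_transform(species))
env_scores = pca_scores(env_std)

# Procrustes rotation: disparity is the residual sum of squares after
# optimally scaling and rotating one configuration onto the other.
_, _, disparity = procrustes(env_scores, species_scores)
print(f"Procrustes disparity: {disparity:.3f}")
```

A disparity near 0 means the two ordinations are nearly congruent, and a value near 1 means they have little in common. With random data like this it should be close to 1; a permutation test would be needed to judge significance on real data.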

Mantel's (partial) test can also be used to compare associations between two or more dissimilarity matrices.
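A simple (non-partial) Mantel test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are simulated, the function name `mantel_test` is hypothetical, the statistic is the Pearson correlation between the two sets of pairwise distances, and the p-value is one-sided and based on permuting the sites of one matrix. The species distances are Hellinger distances (Euclidean distance on square-root relative abundances) and the environmental distances are Euclidean on standardised variables.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform


def mantel_test(dist_a, dist_b, n_perm=999, seed=0):
    """Pearson-based Mantel test between two square distance matrices.

    Returns the observed correlation and a one-sided permutation p-value.
    """
    dist_a = np.asarray(dist_a, dtype=float)
    dist_b = np.asarray(dist_b, dtype=float)
    n = dist_a.shape[0]
    upper = np.triu_indices(n, k=1)  # each pair of sites counted once
    a_vec = dist_a[upper]
    observed = np.corrcoef(a_vec, dist_b[upper])[0, 1]

    rng = np.random.default_rng(seed)
    n_extreme = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        # Permute rows and columns together, i.e. relabel the sites.
        permuted = dist_b[np.ix_(perm, perm)]
        if np.corrcoef(a_vec, permuted[upper])[0, 1] >= observed:
            n_extreme += 1
    p_value = (n_extreme + 1) / (n_perm + 1)
    return observed, p_value


# Simulated example data (assumption: 15 sites, 6 species, 2 env variables).
rng = np.random.default_rng(1)
species = rng.poisson(lam=2.0, size=(15, 6)).astype(float)
env = rng.normal(size=(15, 2))

# Hellinger distance = Euclidean distance on sqrt of relative abundances.
totals = species.sum(axis=1, keepdims=True)
totals[totals == 0] = 1.0
species_dist = squareform(pdist(np.sqrt(species / totals)))

env_std = (env - env.mean(axis=0)) / env.std(axis=0, ddof=1)
env_dist = squareform(pdist(env_std))

r, p = mantel_test(species_dist, env_dist)
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

Because the entries of a distance matrix are not independent, the significance comes from relabelling whole sites rather than shuffling individual distances. A partial Mantel test would additionally remove the effect of a third matrix, which this sketch does not do.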