Solved – How to compute a measure of distance between sites with continuous variables

Tags: distance-functions, environmental-data, euclidean, r

I have a dataframe with each row being a different site (51 sites), and each column being mean values of a different continuous environmental variable (19 variables).

I am trying to calculate a measure of environmental similarity/dissimilarity by using a distance calculation between sites.

I would like to calculate either a standardized Euclidean distance or Mahalanobis distance.
I have managed to get both to run in R, using the distance() function in the ecodist package and the dist.quant() function in the ade4 package.

E.g.

AusEnvDist <- distance(AusEnvNum, method="euclidean", sprange=NULL)

However, my output is the same regardless of how the data frame is oriented (i.e., whether sites are in rows or columns): I get a $19\times19$ matrix instead of a $51\times51$ one, so the distances are being calculated between variables rather than between sites.
Any ideas on how to fix this?
Or is there a better method for getting a single "environmental" value for each site?

Best Answer

I tried this with ade4 and, as expected, got different results for a data frame and its transpose:

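library(ade4)
set.seed(1)  # make the example reproducible
x1 <- rnorm(10, 2, 1)
x2 <- rnorm(10, 1, 1)
dframe <- cbind(x1, x2)  # 10 rows (observations) x 2 columns (variables)
# method = 1 is the canonical (Euclidean) distance; distances are between rows
dist1 <- dist.quant(dframe, 1, diag = TRUE, upper = TRUE)
dist1
# after transposing, x1 and x2 are the rows, so only one distance is computed
dist2 <- dist.quant(t(dframe), 1, diag = TRUE, upper = TRUE)
dist2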

dist2 gives a single distance (between x1 and x2), whereas dist1 gives a $10\times10$ matrix of distances between the ten rows (printed in full because I set diag = TRUE and upper = TRUE).
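
For the sites-by-variables case in the question, here is a minimal base-R sketch of both distances, using only dist(), scale(), cov() and chol() from the standard installation rather than ecodist or ade4. The matrix env and its row and column names are hypothetical, simulated stand-ins for the real 51 x 19 table; the sketch assumes sites are in rows and that the covariance matrix of the variables is invertible.

```r
# Hypothetical stand-in for the real data: 51 sites (rows) x 19 variables (columns).
set.seed(42)
env <- matrix(rnorm(51 * 19), nrow = 51, ncol = 19,
              dimnames = list(paste0("site", 1:51), paste0("var", 1:19)))

# Standardized Euclidean distance: scale each column to mean 0 and sd 1,
# then take ordinary Euclidean distances between rows (sites).
d_std <- dist(scale(env))

# Mahalanobis distance between rows: whiten the data with the Cholesky
# factor of the covariance matrix, then take Euclidean distances.
S <- cov(env)
env_white <- env %*% solve(chol(S))
d_mah <- dist(env_white)

# Both results have one row and one column per site.
dim(as.matrix(d_std))  # 51 51
dim(as.matrix(d_mah))  # 51 51
```

dist() always measures distances between rows, so checking dim() on the input before calling it is a quick way to confirm that sites, not variables, are being compared.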