Distance between random variables

Tags: correlation, distance, distributions, random variable

I am looking for distances between two random variables $X$ and $Y$, or practical estimates for measuring the distance between the i.i.d. observations $(X^1, \ldots, X^T)$ and $(Y^1,\ldots,Y^T)$.
I am aware of divergences and statistical distances, but they focus on quantifying dissimilarity in distribution. That is sufficient when $X$ and $Y$ are independent, but it fails to measure how "correlated" they are otherwise.
Any information is welcome!

Best Answer

Here's a measure that seems to accord with your requirements for the case of monotonic relationships between $X$ and $Y$:

Let $X,Y$ be your sample vectors. Let $S(X,Y)$ be the Spearman rank correlation between the two vectors, and let $KS(X,Y)$ be the Kolmogorov–Smirnov statistic between the ECDF of $X$ and the ECDF of $Y$.

We can construct the quantity $D(X,Y)=\big||S(X,Y)|-1\big|+KS(X,Y)$. Let's analyze the cases:

  1. If $Y=f(X)$ where $f$ is monotonic, then $|S(X,Y)|=1$; if $f$ additionally preserves the distribution (e.g. $f$ is the identity), then $KS(X,Y)=0$ as well, so $D(X,Y)=0$.
  2. If $X,Y$ come from the same distribution but are uncorrelated, then $S(X,Y)\approx 0$ and $KS(X,Y)\approx 0$ for large enough samples, so $D(X,Y)\approx 1$.
  3. If $X,Y$ are neither correlated nor from the same distribution, then $|S(X,Y)|\approx 0$ and $KS(X,Y)>0$, so $D(X,Y)$ lands in $(1,2]$.
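Given i.i.d. observation vectors, this index can be computed directly with SciPy's Spearman and two-sample Kolmogorov–Smirnov routines. A minimal sketch (the function name `d_index` is mine, not part of the answer):

```python
import numpy as np
from scipy.stats import spearmanr, ks_2samp

def d_index(x, y):
    """D(X, Y) = ||S(X, Y)| - 1| + KS(X, Y) from the answer above."""
    s, _ = spearmanr(x, y)           # Spearman rank correlation S(X, Y)
    ks = ks_2samp(x, y).statistic    # KS distance between the two ECDFs
    return abs(abs(s) - 1.0) + ks

rng = np.random.default_rng(0)
x = rng.normal(size=2000)

print(d_index(x, x))                       # case 1: identical, D = 0
print(d_index(x, rng.normal(size=2000)))   # case 2: same law, independent, D near 1
print(d_index(x, 2 * x + 1))               # monotone map, different law: only KS term remains
```

Note that `ks_2samp` compares the two empirical distributions directly, so the last case illustrates the point in case 1: a monotone transform drives the Spearman term to zero, but any residual $D$ comes entirely from the distributional mismatch.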

So, this is more of a "coefficient" or index than a distance, but maybe it'll work for you.
