Euclidean Distance – Distribution of Euclidean Distance Between Two Normally Distributed Random Variables

distance-functions, normal distribution

Assume you are given two objects whose exact locations are unknown but follow normal distributions with known parameters (e.g. $a \sim N(m, s)$ and $b \sim N(v, t)$). We can assume these are both bivariate normals, so that each position is described by a distribution over $(x,y)$ coordinates (i.e. $m$ and $v$ are vectors containing the expected $(x,y)$ coordinates of $a$ and $b$ respectively, and $s$ and $t$ are the corresponding covariance matrices). We will also assume the objects are independent.

Does anyone know if the distribution of the squared Euclidean distance between these two objects is a known parametric distribution? Or how to derive the PDF / CDF for this function analytically?

Best Answer

The answer to this question can be found in the book Quadratic forms in random variables by Mathai and Provost (1992, Marcel Dekker, Inc.).

As the comments clarify, you need to find the distribution of $Q = z_1^2 + z_2^2$ where $z = a - b$ follows a bivariate normal distribution with mean $\mu$ and covariance matrix $\Sigma$. With the notation of the question and independent $a$ and $b$, this means $\mu = m - v$ and $\Sigma = s + t$. The variable $Q$ is a quadratic form in the bivariate random variable $z$.
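
To make the setup concrete, here is a minimal Monte Carlo sketch using only the Python standard library. It assumes independent $a$ and $b$, treats $s$ and $t$ as positive definite $2 \times 2$ covariance matrices, and uses made-up example values; the function names are hypothetical. It compares the simulated mean of $Q$ with the exact value $E(Q) = \operatorname{tr}(\Sigma) + \mu^\top \mu$.

```python
import math
import random


def difference_params(m, s, v, t):
    """Mean and covariance of z = a - b for independent a ~ N(m, s), b ~ N(v, t)."""
    mu = [m[0] - v[0], m[1] - v[1]]
    sigma = [[s[i][j] + t[i][j] for j in range(2)] for i in range(2)]
    return mu, sigma


def sample_bivariate_normal(mean, cov, rng):
    """Draw one sample using the 2x2 Cholesky factor of cov (assumed positive definite)."""
    l11 = math.sqrt(cov[0][0])
    l21 = cov[1][0] / l11
    l22 = math.sqrt(cov[1][1] - l21 * l21)
    g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return [mean[0] + l11 * g1, mean[1] + l21 * g1 + l22 * g2]


def sample_squared_distance(m, s, v, t, n, seed=0):
    """Simulate n draws of the squared Euclidean distance between a and b."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        a = sample_bivariate_normal(m, s, rng)
        b = sample_bivariate_normal(v, t, rng)
        draws.append((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2)
    return draws


# Hypothetical example parameters (not from the question).
m, s = [1.0, 2.0], [[1.0, 0.3], [0.3, 0.5]]
v, t = [0.0, 0.0], [[0.8, -0.2], [-0.2, 1.2]]

mu, sigma = difference_params(m, s, v, t)
samples = sample_squared_distance(m, s, v, t, n=100_000)
mc_mean = sum(samples) / len(samples)
# Exact mean of the quadratic form: tr(Sigma) + |mu|^2.
exact_mean = sigma[0][0] + sigma[1][1] + mu[0] ** 2 + mu[1] ** 2
print(mc_mean, exact_mean)
```

The two printed numbers should agree up to Monte Carlo error, which is a quick sanity check before working with the analytic results.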

Briefly, one nice general result for the $p$-dimensional case where $z \sim N_p(\mu, \Sigma)$ and
$$Q = \sum_{j=1}^p z_j^2$$
is that the moment generating function is
$$E(e^{tQ}) = \exp\left(t \sum_{j=1}^p \frac{b_j^2 \lambda_j}{1-2t\lambda_j}\right)\prod_{j=1}^p (1-2t\lambda_j)^{-1/2}$$
for $t < 1/(2\max_j \lambda_j)$, where $\lambda_1, \ldots, \lambda_p$ are the eigenvalues of $\Sigma$ and $b$ is a linear function of $\mu$. See Theorem 3.2a.2 (page 42) in the book cited above (we assume here that $\Sigma$ is non-singular). Another useful representation is 3.1a.1 (page 29):
$$Q = \sum_{j=1}^p \lambda_j(u_j + b_j)^2$$
where $u_1, \ldots, u_p$ are i.i.d. $N(0, 1)$.
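
The following sketch computes $\lambda_j$ and $b_j$ for the bivariate case and evaluates the moment generating function. The answer only says that $b$ is a linear function of $\mu$; the explicit choice $b = \Lambda^{-1/2} P^\top \mu$ used here, with $\Sigma = P \Lambda P^\top$, is my own assumption derived from the representation $Q = \sum_j \lambda_j (u_j + b_j)^2$ and should be checked against the book. The function names and the example values are hypothetical.

```python
import math


def eig_sym_2x2(sigma):
    """Eigenvalues (largest first) and unit eigenvectors of a symmetric 2x2 matrix."""
    a, c, d = sigma[0][0], sigma[0][1], sigma[1][1]
    half_trace = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), c)
    theta = 0.5 * math.atan2(2.0 * c, a - d)  # rotation angle of the principal axis
    lambdas = [half_trace + radius, half_trace - radius]
    vectors = [[math.cos(theta), math.sin(theta)],
               [-math.sin(theta), math.cos(theta)]]
    return lambdas, vectors


def canonical_params(mu, sigma):
    """Return (lambdas, b) with b_j = (p_j . mu) / sqrt(lambda_j); assumes sigma is non-singular."""
    lambdas, vectors = eig_sym_2x2(sigma)
    b = [(vec[0] * mu[0] + vec[1] * mu[1]) / math.sqrt(lam)
         for lam, vec in zip(lambdas, vectors)]
    return lambdas, b


def mgf(t, lambdas, b):
    """Moment generating function of Q, finite only for t < 1 / (2 * max(lambdas))."""
    if t >= 1.0 / (2.0 * max(lambdas)):
        raise ValueError("MGF is not finite for this t")
    exponent = sum(t * bj * bj * lam / (1.0 - 2.0 * t * lam)
                   for lam, bj in zip(lambdas, b))
    scale = 1.0
    for lam in lambdas:
        scale *= (1.0 - 2.0 * t * lam) ** -0.5
    return math.exp(exponent) * scale


# Hypothetical example: mean and covariance of z = a - b.
mu = [1.0, 2.0]
sigma = [[1.8, 0.1], [0.1, 1.7]]

lambdas, b = canonical_params(mu, sigma)
# Mean implied by the representation: sum_j lambda_j * (1 + b_j^2).
mean_from_representation = sum(lam * (1.0 + bj * bj) for lam, bj in zip(lambdas, b))
# Mean from the MGF via a central finite difference at t = 0.
h = 1e-5
mean_from_mgf = (mgf(h, lambdas, b) - mgf(-h, lambdas, b)) / (2.0 * h)
print(lambdas, b, mean_from_representation, mean_from_mgf)
```

Both means should match $\operatorname{tr}(\Sigma) + \mu^\top \mu$. The sign of each eigenvector is arbitrary, so the sign of each $b_j$ is too; this does not affect the distribution of $Q$ because each $u_j$ is symmetric about zero.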

The entire Chapter 4 in the book is devoted to the representation and computation of densities and distribution functions, which is not at all trivial. I am only superficially familiar with the book, but my impression is that all the general representations are in terms of infinite series expansions.
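
If only numerical values of the distribution function are needed, one alternative to series expansions is to invert the characteristic function numerically. The sketch below uses the Gil-Pelaez inversion formula with a plain midpoint rule; this is a swapped-in technique, not the method of the book. The truncation point `t_max` and the number of steps are arbitrary assumptions chosen for moderate $\lambda_j$ and $q$, and the function names are hypothetical.

```python
import cmath
import math


def char_fn(t, lambdas, b):
    """Characteristic function of Q = sum_j lambda_j * (u_j + b_j)^2 (the MGF at i*t)."""
    value = 1.0 + 0.0j
    for lam, bj in zip(lambdas, b):
        denom = 1.0 - 2.0j * t * lam
        # Real part of denom is 1, so the principal square root is continuous in t.
        value *= cmath.exp(1j * t * lam * bj * bj / denom) / cmath.sqrt(denom)
    return value


def cdf(q, lambdas, b, t_max=200.0, steps=40_000):
    """P(Q <= q) via F(q) = 1/2 - (1/pi) * int_0^inf Im[exp(-i t q) phi(t)] / t dt."""
    h = t_max / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h  # midpoints, so t = 0 is never evaluated
        total += (cmath.exp(-1j * t * q) * char_fn(t, lambdas, b)).imag / t
    return 0.5 - total * h / math.pi


# Sanity check on a case with a closed form: lambda_1 = lambda_2 = 1 and b = 0
# gives a chi-square distribution with 2 degrees of freedom, F(q) = 1 - exp(-q / 2).
approx = cdf(2.0, [1.0, 1.0], [0.0, 0.0])
exact = 1.0 - math.exp(-1.0)
print(approx, exact)
```

The integrand decays only like $1/t^2$ in the bivariate case, so accuracy is limited by the truncation at `t_max`; treat the output as a rough numerical check rather than a high-precision evaluation.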

So in a certain way the answer to the question is yes: the distribution of the squared Euclidean distance between two bivariate normal vectors belongs to a known (and well-studied) class of distributions, parametrized by the four parameters $\lambda_1, \lambda_2 > 0$ and $b_1, b_2 \in \mathbb{R}$. However, I am pretty sure you won't find this distribution in your standard textbooks.
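
One special case does reduce to a textbook distribution. This is a standard fact added for illustration, not something taken from the book: if $\lambda_1 = \lambda_2 = \lambda$ (for instance $\Sigma = \lambda I$), the representation of $Q$ as a weighted sum of squares gives
$$\frac{Q}{\lambda} = (u_1 + b_1)^2 + (u_2 + b_2)^2 \sim \chi'^2_2\left(b_1^2 + b_2^2\right),$$
a noncentral chi-square distribution with 2 degrees of freedom and noncentrality parameter $b_1^2 + b_2^2$. If in addition $b_1 = b_2 = 0$, then $Q/\lambda \sim \chi^2_2$, so $Q$ is exponentially distributed with mean $2\lambda$.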

Note, moreover, that $a$ and $b$ do not need to be independent. Joint normality is enough (and is automatic if they are independent and each normal), because the difference $a-b$ of two jointly normal vectors is again normally distributed.
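
For the dependent case, the parameters of the difference can be written out explicitly. Here $C = \operatorname{Cov}(a, b)$ is notation introduced for this illustration (the $2 \times 2$ cross-covariance matrix), not something defined in the question:
$$a - b \sim N\left(m - v,\; s + t - C - C^\top\right).$$
With independent $a$ and $b$ we have $C = 0$, which recovers $\Sigma = s + t$. The results quoted above additionally assume that the resulting $\Sigma$ is non-singular.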
