MDS – Maximum Number of Dimensions in Multidimensional Scaling

dimensionality reduction, distance, machine learning, multidimensional scaling, pca

Suppose I have an arbitrary Euclidean distance matrix $D=(d_{ij}:i=1,\ldots, I;\ j=1,\ldots, I)$ and I want to reconstruct its elements (the pairwise Euclidean distances) via classical Euclidean MDS. That is, I want to find (a possible solution) $x=(x_{ih}:i=1,\dots,I;\ h=1,\dots,H)$ such that
$$
d_{ij} = \sqrt{\sum_{h=1}^{H} (x_{ih}-x_{jh})^2}.
$$

What is the minimal number of dimensions $H$ (as a function of $I$) that guarantees an exact reconstruction of any such matrix $D$?

My attempt and comments:

  • The solution $x$ is not unique, because we can transform $x$ by any element of the Euclidean group (translations, rotations, and reflections) and still obtain the same distance matrix. However, I am not interested in a unique solution; I am only interested in the existence of a solution.
  • Since $D$ is a distance matrix, it is a square symmetric matrix with zero diagonal, so it has $I(I-1)/2$ elements that can vary.
  • Intuitively $H$ must be less than or equal to $I$, but I am not sure whether $H=I$ is the answer, since $x$ has $IH$ elements, which can be greater than or equal to $I(I-1)/2$ even for some $H<I$.
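For concreteness, here is a minimal numerical sketch (in Python with NumPy; the random test configuration and variable names are only illustrative) of what such a reconstruction looks like. It applies classical (Torgerson) MDS, i.e. double centering of the squared distances followed by an eigendecomposition, and checks that the recovered coordinates reproduce $D$ exactly while using at most $I-1$ effective dimensions.

```python
import numpy as np

# Build an arbitrary Euclidean distance matrix D from I random points.
rng = np.random.default_rng(0)
I = 6
points = rng.normal(size=(I, 10))                       # I points in a 10-dimensional space
D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

# Classical (Torgerson) MDS: double-center the squared distances to get a Gram matrix.
J = np.eye(I) - np.ones((I, I)) / I                     # centering matrix
B = -0.5 * J @ (D ** 2) @ J                             # Gram matrix of the centered configuration
w, V = np.linalg.eigh(B)                                # eigenvalues (ascending) and eigenvectors
w = np.clip(w, 0.0, None)                               # clip tiny negative round-off
x = V * np.sqrt(w)                                      # coordinates: one column per eigenvector

# The reconstructed pairwise distances match D, and the number of strictly
# positive eigenvalues is the dimension actually needed for this D (at most I - 1).
D_hat = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
print(np.allclose(D, D_hat))                            # True
print(int(np.sum(w > 1e-10)))                           # 5, i.e. I - 1 here
```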

Best Answer

Suppose $\mathbb D = (d_{ij})$ is an $n\times n$ distance matrix: that is, it is the matrix representing all distances for some set of $n$ points $\mathbf x_i$ in some Euclidean space $E^N$ (of arbitrarily high dimension $N$) for which $d_{ij} = |\mathbf x_i - \mathbf x_j|$ for all $1 \le i \le j \le n.$

$n-1$ Suffices

Setting $\mathbf y_j = \mathbf x_j - \mathbf x_n$ for $j=1,2,\ldots, n$ yields the zero vector $\mathbf y_n = \mathbf 0$ together with at most $n-1$ distinct $\mathbf y_i.$ The latter span a Euclidean subspace of $E^N$ of some dimension $d \le n-1,$ which we may identify with $E^d \subseteq E^{n-1} \subseteq E^N.$ This shows that

Given an $n\times n$ distance matrix $\mathbb D = (d_{ij}),$ there always exists a sequence of $n$ points $\mathbf x_i\in E^{n-1}$ for which $|\mathbf x_i - \mathbf x_j| = d_{ij}$ for all $1\le i \le j \le n.$
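A quick numerical check of this translation argument (a sketch assuming NumPy; the particular $n$, $N$, and random configuration are only illustrative): subtracting $\mathbf x_n$ leaves every pairwise distance unchanged, and the resulting vectors $\mathbf y_j$ span at most $n-1$ dimensions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, N = 7, 50
X = rng.normal(size=(n, N))                  # n points in a high-dimensional space E^N

Y = X - X[-1]                                # y_j = x_j - x_n, so y_n = 0

# Translation preserves all pairwise distances ...
D_X = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
D_Y = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
print(np.allclose(D_X, D_Y))                 # True

# ... and the y_j span a subspace of dimension at most n - 1.
print(np.linalg.matrix_rank(Y) <= n - 1)     # True
```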

$n-1$ Is Required

The standard $(n-1)$-simplex in $\mathbb{R}^n$ has the $n$ vertices $\mathbf x_i = (0,\ldots,0,1,0,\ldots,0)$ for $i=1,2,\ldots, n.$ Their mutual squared distances all equal $2$ (each $|\mathbf x_i - \mathbf x_j|^2 = 1 + 1 = 2$ for $i\ne j$), so any three of them form an equilateral triangle. Thus (using elementary 2D geometry) the inner product of any two distinct $\mathbf y_i = \mathbf x_i - \mathbf x_n$ (for $i = 1,2, \ldots, n-1$) is

$$Q(\mathbf y_i, \mathbf y_j) = \mathbf y_i \cdot \mathbf y_j = 1,\ i\ne j.$$

Now suppose that the only thing we know concerning $n$ points $\mathbf{x}_i$ in some Euclidean space $E^N$ is that all their mutual squared distances are $2.$ (Conceivably there are other configurations with this property besides the $(n-1)$-simplex. The preceding paragraph establishes that such sets of points exist, but it does not demonstrate that $n-1$ dimensions are required to represent them.)

Another way to describe this abstract situation: since the polarization identity $\mathbf y_i \cdot \mathbf y_j = \tfrac{1}{2}\left(|\mathbf y_i|^2 + |\mathbf y_j|^2 - |\mathbf y_i - \mathbf y_j|^2\right)$ determines every inner product from the squared distances alone, the matrix representing the Euclidean inner product in terms of the $\mathbf y_i,$ $i=1,2,\ldots, n-1,$ must be

$$\mathbb Q_{n-1} = \pmatrix{2 & 1 & 1 & \cdots &1 \\1 & 2 & 1 & \cdots &1\\ \vdots & \vdots & \ddots &\cdots & \vdots\\1 & 1 & \cdots & 2 & 1\\1 & 1 & \cdots & 1 & 2} = \mathbb{I}_{n-1} + \mathbf{1}_{n-1}\mathbf{1}_{n-1}^\prime $$

In general, any such inner product matrix could be degenerate, meaning that fewer than $n-1$ dimensions are needed for the points it describes. However, the form of the right-hand side makes it clear that $\mathbb Q_{n-1}$ has an eigenspace generated by $\mathbf{1}_{n-1} = (1,1,\ldots, 1)^\prime$ with eigenvalue $n,$ because $$\mathbb{Q}_{n-1} \mathbf{1}_{n-1} = \mathbf{1}_{n-1} + \mathbf{1}_{n-1}\mathbf{1}_{n-1}^\prime\mathbf{1}_{n-1} = n\mathbf{1}_{n-1},$$ together with a complementary eigenspace (the vectors orthogonal to $\mathbf{1}_{n-1}$) with eigenvalue $1.$ Since all eigenvalues are nonzero, $\mathbb{Q}_{n-1}$ is nondegenerate, proving the $\mathbf y_i$ span a vector space of dimension $n-1.$ This shows that at least $n-1$ dimensions are needed for an isometric embedding of the $(n-1)$-simplex, and we have the example we need.
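The eigenvalue computation can be confirmed numerically. Here is a minimal sketch (assuming NumPy, with $n=6$ chosen only for illustration) that builds $\mathbb Q_{n-1} = \mathbb{I}_{n-1} + \mathbf{1}_{n-1}\mathbf{1}_{n-1}^\prime$ directly and verifies that all of its eigenvalues are positive, so it has full rank $n-1$:

```python
import numpy as np

n = 6
# Gram matrix of the y_i for the (n-1)-simplex: 2 on the diagonal, 1 elsewhere,
# i.e. Q = I + 1 1'.
Q = np.eye(n - 1) + np.ones((n - 1, n - 1))

w = np.linalg.eigvalsh(Q)
print(w)                              # approximately [1, 1, 1, 1, 6]: eigenvalue n once, eigenvalue 1 otherwise
print(bool(np.all(w > 0)))            # True: Q is nondegenerate
print(np.linalg.matrix_rank(Q))       # 5 == n - 1: the y_i span an (n-1)-dimensional space
```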
