Fitting a non-linear curve in symmetric positive definite matrix manifold

covariance, curves, differential-geometry, machine-learning, statistics

I have a variable $x \in \mathbb{R}$. For some values of $x$, $\{x_1, …, x_n\}$, I have measured the covariance matrix of a variable $y \in \mathbb{R}^n$ conditional on each of these values. That is, I have a set of covariance matrices $\{ \Sigma(x_1), …, \Sigma(x_n) \}$ belonging to $\mathbb{R}^{n \times n}$.

From visual inspection, these covariance matrices appear to vary smoothly with the value of $x$, with each entry tracing a smooth curve.

I want to interpolate between the observed covariances $\{ \Sigma(x_1), …, \Sigma(x_n) \}$ to obtain covariances $\Sigma(x)$ at values of $x \in [x_1, x_n]$ not present in the dataset. The interpolated matrices must themselves be covariance matrices (i.e. symmetric positive definite matrices, SPDM). The interpolation should also account for the shape of the whole curve that the matrices $\Sigma(x)$ trace, not just interpolate between adjacent pairs $\Sigma(x_k)$ and $\Sigma(x_{k+1})$ from the dataset.

This problem can be seen as fitting a non-linear curve to points on the SPDM manifold. Doing so would guarantee that the interpolated matrices are covariance matrices, and the non-linear geometry of the manifold would be taken into account during the fit. Note that this is different from plain geodesic interpolation, which ignores the global shape of the curve drawn by the points in the SPDM manifold and only interpolates between adjacent pairs $\Sigma(x_k)$ and $\Sigma(x_{k+1})$.

So, the question is whether there are any ideas for how to fit a curve to points on the symmetric positive definite matrix manifold. We can assume the measured points are either noiseless or noisy; a solution for either case works.

[Edit:] After it was noted that the problem may not be clear, I edited the question to make clear that geodesic interpolation is not what I am looking for. I also removed an incomplete/speculative idea from the initial post, since I found an answer, which I posted below.

Best Answer

We have an observed set of covariance matrices $\Sigma(x) \in \mathbb{R}^{n \times n}$ parametrized by variable $x \in \mathbb{R}$, observed at a finite set of values of $x$, $\{x_1, x_2, ..., x_n\}$. We want to fit a curve to these matrices, in order to approximate the value of $\Sigma(x)$ for values of $x$ not in our dataset.

The challenge is that we want our interpolated matrices to be covariance matrices, or in other words, to be points in the Symmetric Positive Definite Matrices (SPDM) manifold. If we just apply standard regression techniques (e.g. Gaussian Process Regression or Spline Regression), the interpolated values are not guaranteed to be in the SPDM manifold. Also, this approach would not take into account the geometry of the SPDM manifold to measure distances.
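As a toy illustration of this failure mode (a sketch with made-up numbers, using scipy's `CubicSpline`), fitting a spline directly to observed variances, i.e. $1 \times 1$ covariance "matrices", can produce a negative interpolated value, a point outside the SPD cone:

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Variances (1x1 covariance matrices) observed at four values of x.
xs = np.array([0.0, 1.0, 2.0, 3.0])
variances = np.array([100.0, 1.0, 1.0, 100.0])

spline = CubicSpline(xs, variances)  # naive entrywise regression
mid = float(spline(1.5))             # interpolated "variance" at x = 1.5
# mid is negative here, so the interpolant has left the SPD cone
```

With four points, `CubicSpline`'s default not-a-knot conditions reduce to the single cubic through all points, which dips to $-11.375$ at $x = 1.5$; the same overshoot can hit any entry of a larger matrix.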

A way to approach this problem is described in this tutorial for the differential-geometry Python package geomstats, and it seems to be a standard approach in the literature.

In short, the approach involves (1) mapping our data points, which lie on the SPDM manifold, to a vector space; (2) applying the standard regression toolkit (which works just fine in a vector space); and (3) mapping the interpolations performed in the vector space back to the original manifold, obtaining the interpolated SPDM points.

For (1), we want a mapping into a vector space that respects the geometry of the points on the manifold as much as possible. As described in the tutorial, we first compute the Fréchet mean of our data points $ \{ \Sigma(x_1), \Sigma(x_2), ..., \Sigma(x_n) \} $, which we call $\widehat{\Sigma}$. Then we use the logarithmic map (explained in section 5 here) at the point $\widehat{\Sigma}$, $\mathrm{Log}_{\widehat{\Sigma}}: \mathcal{M} \to \mathcal{T}_{\widehat{\Sigma}}\mathcal{M}$, to map all the points in our dataset to the tangent space of the manifold $\mathcal{M}$ at the Fréchet mean, $\mathcal{T}_{\widehat{\Sigma}}\mathcal{M}$.
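geomstats provides these operations directly; as a library-free sketch under the affine-invariant metric, the log/exp maps and a fixed-point Fréchet mean can be written with scipy. The names `spd_log`, `spd_exp`, and `frechet_mean` are my own helpers, not geomstats API:

```python
import numpy as np
from scipy.linalg import expm, logm, sqrtm

def spd_log(base, point):
    """Logarithmic map Log_base(point) under the affine-invariant metric."""
    s = np.real(sqrtm(base))          # base^{1/2}
    s_inv = np.linalg.inv(s)          # base^{-1/2}
    v = np.real(s @ logm(s_inv @ point @ s_inv) @ s)
    return (v + v.T) / 2              # symmetrize against round-off

def spd_exp(base, vec):
    """Exponential map Exp_base(vec), the inverse of spd_log."""
    s = np.real(sqrtm(base))
    s_inv = np.linalg.inv(s)
    p = np.real(s @ expm(s_inv @ vec @ s_inv) @ s)
    return (p + p.T) / 2

def frechet_mean(mats, n_iter=50, tol=1e-10):
    """Fixed-point iteration: move the candidate along the mean tangent vector."""
    mean = mats[0].copy()
    for _ in range(n_iter):
        step = sum(spd_log(mean, m) for m in mats) / len(mats)
        mean = spd_exp(mean, step)
        if np.linalg.norm(step) < tol:
            break
    return mean
```

For two commuting matrices the Fréchet mean reduces to the matrix geometric mean, which gives a quick sanity check on the implementation.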

Writing $v(x) = \mathrm{Log}_{\widehat{\Sigma}}(\Sigma(x))$, we now have a set of vectors $\{ v(x_1), v(x_2), ..., v(x_n) \}$ representing the dataset in the vector space. Of note, these vectors live in the space of symmetric matrices (as described here). We can now (2) apply the standard regression toolkit to this new vector dataset, such as spline regression, or whatever the user deems best for the problem. After fitting the regression, we can interpolate $v(x)$ for any value of $x \in [x_1, x_n]$.
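One concrete choice for step (2) is to cubic-spline each matrix entry of the tangent vectors; since each $v(x_k)$ is symmetric, entries $(i, j)$ and $(j, i)$ see identical data, so the splined matrices stay symmetric automatically. A minimal sketch (the helper name `fit_tangent_spline` and the toy data are mine):

```python
import numpy as np
from scipy.interpolate import CubicSpline

def fit_tangent_spline(xs, tangent_vecs):
    """Fit an independent cubic spline to every matrix entry.

    xs: (m,) increasing sample locations.
    tangent_vecs: (m, n, n) symmetric tangent matrices v(x_k).
    Returns a callable: spline(x) -> interpolated (n, n) symmetric matrix.
    """
    return CubicSpline(xs, np.asarray(tangent_vecs), axis=0)

# Toy usage with made-up tangent data (here, linear in x):
xs = np.array([0.0, 1.0, 2.0, 3.0])
base_vec = np.array([[1.0, 0.5], [0.5, 2.0]])
vecs = np.stack([t * base_vec for t in xs])
v_interp = fit_tangent_spline(xs, vecs)
v_mid = v_interp(1.5)   # interpolated symmetric tangent matrix at x = 1.5
```

Any other vector-space regressor (Gaussian process, polynomial, etc.) can be swapped in at this step, applied entrywise or to a flattened vectorization of $v(x_k)$.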

Finally, for (3), we take any interpolated $v(x)$ and apply the exponential map at the Fréchet mean, $\mathrm{Exp}_{\widehat{\Sigma}}: \mathcal{T}_{\widehat{\Sigma}}\mathcal{M} \to \mathcal{M}$, to obtain the interpolated covariance matrix $\Sigma(x) = \mathrm{Exp}_{\widehat{\Sigma}}(v(x))$.
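This last step is what guarantees validity: the exponential map sends every symmetric tangent matrix to an SPD matrix, so no matter what the regression returns, the result is a legitimate covariance. A minimal scipy-based sketch (`spd_exp` is my own helper for the affine-invariant metric, and the matrices are made-up examples):

```python
import numpy as np
from scipy.linalg import expm, sqrtm

def spd_exp(base, vec):
    """Exponential map at `base` under the affine-invariant metric."""
    s = np.real(sqrtm(base))
    s_inv = np.linalg.inv(s)
    p = np.real(s @ expm(s_inv @ vec @ s_inv) @ s)
    return (p + p.T) / 2

base = np.array([[2.0, 0.5], [0.5, 1.0]])    # stand-in for the Frechet mean
vec = np.array([[0.3, -1.2], [-1.2, 0.7]])   # arbitrary symmetric "v(x)"
sigma = spd_exp(base, vec)
eigvals = np.linalg.eigvalsh(sigma)
# eigvals are strictly positive: sigma is a valid covariance matrix
```

The guarantee follows because `expm` of a symmetric matrix is SPD, and the congruence by $\widehat{\Sigma}^{1/2}$ preserves positive definiteness.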

Thus, the above delineates a strategy for doing regression on covariance matrices that respects their geometry and guarantees that the interpolated matrices are also covariance matrices. Although possibly intimidating for the non-expert, the manifold-related tools mentioned above are available in the geomstats package linked above and are easy to use, as shown in the tutorial (I'm not affiliated with geomstats whatsoever).

[Edit:] I'm not sure why my previous answer in plain English got taken down, since it seemed to give a clear solution to the problem I had (I'm the poster of the question). I edited the answer to include mathematical notation, in case that was what moderators found lacking.
