Uniquely reconstruct a matrix $M$ from its inverse $M^{-1}$ if $n$ elements of $M^{-1}$ are unknown and $n$ elements of $M$ are given

linear-algebra, matrices, nonnegative-matrices

This question was motivated by a recent MO post. You know $n$ elements of the $N\times N$ matrix $M$, and you do not know $n$ elements of the inverse $M^{-1}$ (but you know the other $N^2-n$ elements of $M^{-1}$). Equating $(M^{-1})^{-1}=M$ gives $n$ nonlinear equations in $n$ unknowns, which in general will have multiple solutions. Under which additional condition can one reconstruct the matrix $M$ uniquely? Does it matter where in the matrix the $n$ elements are located?

Conjecture: A positive definite $N\times N$ matrix is uniquely determined by its diagonal elements and by the off-diagonal elements of its inverse.

For $N=2$ it is true,$^\ast$ and some experimentation$^{\ast\ast}$ for larger $N$ suggests it is true for all $N$.



$^\ast$ For $N=2$ one has $M = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$, $M^{-1} = \frac{1}{ac - b^2} \begin{pmatrix} c & -b \\ -b & a \end{pmatrix}$; we know $a,c$ and we know $\beta = b/(ac - b^2)$. There are two solutions for the unknown $b$, namely $b_\pm = \left(\pm\sqrt{4ac\beta^2+1} - 1\right)/(2\beta)$, and only $b_+$ gives a positive definite $M$.

$^{\ast\ast}$ A Mathematica test for $N=3,4,5$ finds, respectively, up to $5,14,22$ solutions for the unknown matrix elements, of which only one gives a positive definite $M$.
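For readers who want to replicate the $N=2$ computation, here is a minimal Python sketch (the numerical values are arbitrary choices of mine, not from the post; the original experiments were in Mathematica):

```python
import numpy as np

a, c, b_true = 2.0, 3.0, 1.5              # chosen so that a, c > 0 and ac - b^2 > 0
beta = b_true / (a * c - b_true**2)       # the known datum built from the inverse

# The two candidate solutions b_pm of the quadratic beta*b^2 + b - beta*a*c = 0:
disc = np.sqrt(4 * a * c * beta**2 + 1)
for b in [(disc - 1) / (2 * beta), (-disc - 1) / (2 * beta)]:
    M = np.array([[a, b], [b, c]])
    print(f"b = {b:+.4f}, positive definite: {np.all(np.linalg.eigvalsh(M) > 0)}")
# Only b_+ (which recovers b_true = 1.5) yields a positive definite M.
```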

Best Answer

The conjecture is true. More precisely, here is what I will prove:

Theorem Partition $\{ (i,j) : 1 \leq i \leq j \leq n \}$ into two disjoint sets, $A \sqcup B$. Let $X$ and $Y$ be positive definite $n \times n$ matrices and suppose that $X_{ij} = Y_{ij}$ for $(i,j) \in A$ and $(X^{-1})_{ij} = (Y^{-1})_{ij}$ for $(i,j) \in B$. Then $X = Y$.

Lemma 1 Consider the function $f(M) = \log \det(M)$ on the space of $n \times n$ matrices $M$ with positive determinant. Then we can think of the gradient, $\nabla(f)$, as an $n \times n$ matrix as well. In this sense, we have $\nabla(f)|_M = (M^{-1})^T$.

Proof: Let $E_{ij}$ be the matrix with a $1$ in position $(i,j)$ and $0$'s everywhere else. Then $\nabla(f)_{ij} = \tfrac{d}{dt} \log \det(M+t E_{ij}) \big|_{t=0}$. This is $\tfrac{1}{\det(M)} \ \tfrac{d\ \det(M+t E_{ij})}{dt} \big|_{t=0} = \tfrac{1}{\det(M)} \cdot (-1)^{i+j} \det \widehat{M}_{ij}$. Here $\widehat{M}_{ij}$ is the $(n-1) \times (n-1)$ matrix obtained by deleting row $i$ and column $j$ of $M$. Using the formula for the inverse matrix in terms of the adjugate matrix, we see that $\nabla(f)_{ij} = (M^{-1})_{ji}$. $\square$
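Lemma 1 is easy to check numerically. The following Python sketch (my own illustration, with arbitrary test data) compares a finite-difference gradient of $\log \det$ against $(M^{-1})^T$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n)) + n * np.eye(n)   # a generic matrix with positive determinant
f = lambda M: np.log(np.linalg.det(M))

eps = 1e-6
grad = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        E = np.zeros((n, n)); E[i, j] = 1.0
        grad[i, j] = (f(M + eps * E) - f(M - eps * E)) / (2 * eps)   # central difference in direction E_ij

print(np.max(np.abs(grad - np.linalg.inv(M).T)))  # ~1e-9: the gradient matches (M^{-1})^T
```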

Lemma 2 The function $f$ is strictly concave on the cone of positive definite matrices.

Proof Let $P$ be a positive definite matrix and let $Q$ be a nonzero symmetric matrix. We will show that $t \mapsto f(P+tQ)$ is strictly concave for small $t$.

Since $P$ is positive definite, we can factor $P = ZZ^T$ for some invertible $Z$. Then $\log \det(P+ t Q) = \log \det P + \log \det (\mathrm{Id} + t Z^{-1} Q (Z^{-1})^T)$. Since $Z^{-1} Q (Z^{-1})^T$ is symmetric and nonzero, it is diagonalizable, say with eigenvalues $\lambda_1$, $\lambda_2$, ..., $\lambda_n$, not all zero. So $\log \det (\mathrm{Id} + t Z^{-1} Q (Z^{-1})^T) = \sum_i \log (1+t \lambda_i)$. As long as $|t| < \min_{\lambda_i \neq 0} |\lambda_i|^{-1}$, each summand is concave, and the summands with $\lambda_i \neq 0$ are strictly concave, so the sum is strictly concave. $\square$
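The concavity can likewise be observed numerically. This sketch (arbitrary random data of my choosing) estimates the second derivative of $t \mapsto \log \det(P+tQ)$ and finds it negative:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
Z = rng.standard_normal((n, n))
P = Z @ Z.T + np.eye(n)                          # positive definite
S = rng.standard_normal((n, n)); Q = S + S.T     # nonzero symmetric direction

f = lambda t: np.log(np.linalg.det(P + t * Q))
h = 1e-4
for t in np.linspace(-0.05, 0.05, 5):            # small t, so P + tQ stays positive definite
    second = (f(t + h) - 2 * f(t) + f(t - h)) / h**2   # central second difference
    print(f"t = {t:+.3f}, estimated second derivative {second:.4f}")  # always negative
```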

Now, let $A$, $B$, $X$ and $Y$ be as in the statement of the theorem. Let $V = \mathrm{Span}_{(i,j) \in B} (E_{ij}+E_{ji})$.

The condition that $X_{ij} = Y_{ij}$ for $(i,j) \in A$ means that $Y$ is in the affine space $X+V$.

From Lemma 1, the condition that $(X^{-1})_{ij} = (Y^{-1})_{ij}$ for $(i,j) \in B$ means that $f$, when restricted to the affine space $X+V$, has the same gradient at $X$ and at $Y$.

Now, the space of positive definite matrices is convex, so its intersection with the affine space $X+V$ is convex, so we can draw a line segment from $X$ to $Y$ and the line segment will stay in the space of positive definite matrices. By Lemma 2, our function $f$ restricted to this line segment will be strictly concave. But by our observation in the previous paragraph, this function has the same derivative at $X$ and at $Y$. This is only possible if the line segment has length $0$, so $X=Y$. $\square$


This was a uniqueness result. I will now prove an existence result. Let $A$ and $B$ be as above, and let $P$ and $Q$ be positive definite matrices. I will show that there is a positive definite matrix $X$ with $X_{ij} = P_{ij}$ for $(i,j) \in A$ and $(X^{-1})_{ij} = Q_{ij}$ for $(i,j) \in B$.

As above, let $V=\mathrm{Span}_{(i,j) \in B} (E_{ij}+E_{ji})$. Let $K$ be the intersection of the affine space $P+V$ with the cone of positive definite matrices. Let $g(X) = \log \det X - \mathrm{Tr}(QX)$. We will show that there is a point $X$ in $K$ where the gradient of $g$ along $V$ vanishes. Since $X \in K$, the matrix $X$ is positive definite with $X_{ij} = P_{ij}$ for $(i,j) \in A$. By the computations in the first part, $\nabla(g) = X^{-1} - Q$, so the vanishing of its $V$-components shows that $(X^{-1})_{ij} = Q_{ij}$ for $(i,j) \in B$.

So we are reduced to the goal of showing that $g$ has a local maximum in $K$. Fortunately, $P \in K$, so $K$ is nonempty. Unfortunately, $K$ is not compact. So we need to replace it by a compact version. Let $\overline{K}$ be the set of positive semi-definite matrices in $P+V$. So $\overline{K}$ is closed, but not yet compact.

Lemma There is a constant $c>0$ such that, for all positive definite matrices $X$, we have $|X| \leq c \mathrm{Tr}(QX)$, where $|X|$ is the Frobenius norm.

Proof The statement of the Lemma is invariant under conjugating $Q$ and $X$ by an orthogonal matrix, so we may assume that $Q$ is diagonal, with diagonal entries $0 < q_1 \leq q_2 \leq \cdots \leq q_n$. So $$\mathrm{Tr}(QX)= \sum_i q_i X_{ii} \geq q_1 \sum_i X_{ii}.$$ We also have $$|X| = \sqrt{ \sum_{i,j} X_{ij}^2} \leq \sqrt{\sum_{i,j} X_{ii} X_{jj}} = \sum_i X_{ii}.$$ (We have used that $X$ is positive definite to deduce the Cauchy–Schwarz inequality $X_{ij}^2 \leq X_{ii} X_{jj}$.) Combining these, $|X| \leq \tfrac{1}{q_1} \mathrm{Tr}(QX)$, so we can take $c=q_1^{-1}$. $\square$
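A quick randomized check of this bound, with the constant $c = q_1^{-1}$ (a sketch of mine, not from the answer):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
A = rng.standard_normal((n, n))
Q = A @ A.T + 0.1 * np.eye(n)          # a positive definite Q
c = 1.0 / np.linalg.eigvalsh(Q)[0]     # c = 1/q_1, with q_1 the smallest eigenvalue of Q

for _ in range(1000):
    B = rng.standard_normal((n, n))
    X = B @ B.T                        # a random positive (semi)definite X
    assert np.linalg.norm(X, 'fro') <= c * np.trace(Q @ X) + 1e-9
print("the bound |X| <= c Tr(QX) held in all trials")
```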

Now, if the eigenvalues of $X$ are $\lambda_1$, $\lambda_2$, ..., $\lambda_n$, then $\det(X) = \prod \lambda_i$ and $|X| = \sqrt{\sum \lambda_i^2}$, so we have $\det(X) \leq |X|^n$ and thus $\log \det(X) \leq n \log |X| \leq n (\log \mathrm{Tr}(QX)+\log c)$. So $g(X)$ is bounded above by $n \log \mathrm{Tr}(QX) - \mathrm{Tr}(QX) + n \log c$.

We have $\lim_{z \to \infty} (n \log z - z) = -\infty$, so if $\mathrm{Tr}(QX) \to \infty$ then $g(X) \to - \infty$. Thus, we can choose some large $M$ such that the value of $g$ at $P$ is larger than its value anywhere on the half-space $\{ X : \mathrm{Tr}(QX) \geq M \}$.

Now, consider the function $g$ on $\overline{K} \cap \{ X : \mathrm{Tr}(QX) \leq M \}$. This, finally, is a compact set, so $g$ achieves a maximum somewhere on it. The maximum is not at a point which is semidefinite but not definite, since $g$ is $-\infty$ there (the determinant vanishes), and it is not on the hyperplane $\{\mathrm{Tr}(QX) = M \}$ by the choice of $M$. So the maximum is in the interior and we win. $\square$
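The existence proof is effectively an algorithm: maximize $g$ over the affine slice. Here is a minimal Python sketch of that idea for the original conjecture ($A$ = diagonal entries, $B$ = off-diagonal entries); the optimization scheme (projected gradient ascent with backtracking) and the test data are my choices, not part of the answer:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
Zt = rng.standard_normal((n, n))
X_true = Zt @ Zt.T + n * np.eye(n)     # ground truth, used only to generate the data
diag_data = np.diag(X_true)            # known: the diagonal of X (the set A)
Q = np.linalg.inv(X_true)              # known: the off-diagonal of X^{-1} (the set B)

def g(X):                              # the objective from the proof
    vals = np.linalg.eigvalsh(X)
    return np.sum(np.log(vals)) - np.trace(Q @ X) if vals[0] > 0 else -np.inf

X = np.diag(diag_data)                 # a positive definite starting point in P + V
mask = ~np.eye(n, dtype=bool)          # the free (off-diagonal) coordinates, i.e. V
for _ in range(2000):
    D = (np.linalg.inv(X) - Q) * mask  # gradient of g, projected onto the slice
    t = 1.0
    while g(X + t * D) <= g(X) and t > 1e-12:
        t *= 0.5                       # backtrack: the step must ascend and stay PD
    if t <= 1e-12:
        break                          # no ascent direction left: at the maximizer
    X = X + t * D

print(np.max(np.abs(X - X_true)))      # ~0: the maximizer recovers X_true
```

Since $g$ is strictly concave on the slice (Lemma 2) and goes to $-\infty$ at the boundary, the ascent converges to the unique maximizer, which by the uniqueness theorem must be $X_{\mathrm{true}}$.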


UPDATE: I recently talked with Piotr Zwiernik and Thomas Kahle about this question, and they explained to me that this is a special case of a 1978 theorem of Barndorff-Nielsen which goes by the name "Variational Independence". From their perspective, this has nothing to do with matrices or inverses, it is solely a theorem of convex analysis. I'll do my best to explain the Variational Independence theorem. Thanks to Piotr and Thomas for their help; of course, all errors are mine.

Let $V$ be a finite dimensional real vector space, $C$ an open convex subset of $V$ and $f$ a $C^1$ strictly convex function $C \to \mathbb{R}$ with the property that $|\nabla(f)(x)| \to \infty$ as $x$ approaches any point in $\partial C$. Such a pair $(C,f)$ is called "of Legendre type".

In our application, $V$ will be symmetric matrices, $C$ will be positive definite symmetric matrices and $f$ will be $f(X) = - \log \det(X)$, so $\nabla(f)(X) = - X^{-1}$.

Let $V^{\ast}$ be the dual vector space to $V$, so $\nabla(f)$ takes values in $V^{\ast}$. Let $C^{\ast} = \nabla(f)(C)$. Then $C^{\ast}$ is an open convex set and $\nabla(f): C \to C^{\ast}$ is bijective. For $x^{\ast} \in C^{\ast}$, define $f^{\ast}(x^{\ast}) = \max_{x \in C} \big( \langle x^{\ast}, x \rangle - f(x) \big)$; this is known as the Legendre conjugate of $f$. Then the pair $(C^{\ast}, f^{\ast})$ is also of Legendre type, and $\nabla(f^{\ast})$ is the inverse bijection $C^{\ast} \to C$. See Chapter 5, Section 26, of Rockafellar for the technical details here.
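To make this concrete in our application: with $f(X) = -\log \det(X)$ on the positive definite cone, a short computation (my own, easily checked) gives $C^{\ast} = \{$negative definite matrices$\}$, $f^{\ast}(Y) = -n - \log \det(-Y)$, and $\nabla(f^{\ast})(Y) = -Y^{-1}$, which indeed inverts $\nabla(f)(X) = -X^{-1}$. A small numerical sanity check in Python:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4
Z = rng.standard_normal((n, n))
X = Z @ Z.T + np.eye(n)            # a point of C, the positive definite cone

Y = -np.linalg.inv(X)              # grad f(X): a point of C*, the negative definite cone
X_back = -np.linalg.inv(Y)         # grad f*(Y), for f*(Y) = -n - log det(-Y)

print(np.max(np.abs(X_back - X)))  # ~0: grad f* inverts grad f
print(np.trace(X @ Y) + n)         # ~0: f(X) + f*(Y) = <Y, X> = -n at a conjugate pair
```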

Now, suppose that we have a short exact sequence of vector spaces $0 \to B \overset{\beta}{\longrightarrow} V \overset{\alpha}{\longrightarrow} A \to 0$, so that we get a dual sequence $0 \to A^{\ast} \overset{\alpha^{\ast}}{\longrightarrow} V^{\ast} \overset{\beta^{\ast}}{\longrightarrow} B^{\ast} \to 0$. In our setting, $A$ and $B$ will be the symmetric matrices supported on the coordinates indexed by the disjoint sets $A$ and $B$ (which is why I reused the notation), so $\alpha(X)$ and $\beta^{\ast} \nabla(f)(X)$ are the projections of $X$ and $-X^{-1}$ onto the $A$ and $B$ coordinates respectively.

The "variational independence" theorem is that the map $$x \mapsto (\alpha(x), \beta^{\ast} \nabla(f)(x))$$ is always a bijection between $C$ and $\alpha(C) \times \beta(C^{\ast})$.

See Theorem 5.34 in Barndorff-Nielsen for the original proof, and see Section 5.1 in Zwiernik for a very accessible discussion, including the current application and related transformations.
