[Math] Square root of positive definite nonsymmetric matrix

linear algebramatrices

Let $N$ be a nilpotent matrix in $M_n({\mathbb R})$, such that
$(I+N)^2$ is “positive definite” (but not necessarily symmetric) in the
sense that $<X,(I+N)^2X>$ is positive for any nonzero $X\in{{\mathbb R}^n}$
(here $<.,.>$ denotes the usual scalar product on ${\mathbb R}^n$).
Is it true that $I+N$ must always be “positive definite” also ?

UPDATE : To clarify the question, I know about http://mathworld.wolfram.com/PositiveDefiniteMatrix.html which makes many promising related statements, but without any proof. This webpage has a list of references at the bottom, but I’ve checked them one by one and was unable to find that special result.

I think this question is a rather special case of the properties of the "unique positive square root function" described in the link. I'm hoping for an “elementary” proof.

Best Answer

I believe $I+N$ must indeed be "positive definite".

Apply induction on the dimension. We are given a nilpotent matrix $N'$ which by Schur decomposition we can assume is strictly upper triangular. As a block matrix, $N'=\begin{pmatrix}N&x\\0&0\end{pmatrix}$ with $N$ a square strictly upper triangular matrix and $x$ a column vector. By assumption $(I+N')^2+(I+N')^{2t}$ is SPD (SPD=symmetric positive definite; superscript $2t$ means square and transpose). We wish to show that $(I+N')+(I+N')^t$ is SPD.

Let $U=I+N$ and note $$\begin{pmatrix}U&x\\0&1\end{pmatrix}^2+\begin{pmatrix}U&x\\0&1\end{pmatrix}^{2t}=\begin{pmatrix}U^2+U^{2t}&(I+U)x\\x^t(I+U)^t&2\end{pmatrix}.$$

By the Schur decomposition characterization of SPD we have by assumption:

  • $U^2+U^{2t}$ is SPD
  • $2-x^t(I+U)^t(U^2+U^{2t})^{-1}(I+U)x>0$

And we need to show:

  • $U+U^t$ is SPD
  • $2-x^t(U+U^t)^{-1}x>0.$

$U+U^t$ is SPD by induction. The latter inequality would follow from $$x^t(I+U)^t(U^2+U^{2t})^{-1}(I+U)x\geq x^t(U+U^t)^{-1}x$$ Letting $y=(I+U)x,$ this is $$y^t(U^2+U^{2t})^{-1}y\geq y^t(I+U^t)^{-1}(U+U^t)^{-1}(I+U)^{-1}y$$

This would follow from this result and symmetric positive definiteness of $$(I+U)(U+U^t)(I+U^t)-(U^2+U^{2t})=U+U^t+2UU^t+U(U+U^t)U^t$$ Since $U+U^t$ is SPD, so is $U(U+U^t)U^t,$ and $UU^t$ is certainly SPD. So we are done.


See loup blanc's answer for a generalization of the above argument from unipotent matrices to all matrices with positive spectrum.


Here is a reference with a nice proof: Uniqueness of matrix square roots and an application by Charles R. Johnson, Kazuyoshi Okubo, Robert Reams, Theorem 7. It uses the following theorem:

Theorem (Lyapunov). Let $A\in\mathbb C^{n\times n}$ (not necessarily Hermitian) and let $X\in\mathbb C^{n\times n}$ be Hermitian. If the eigenvalues of $A$ all have positive real part and $AX+XA^*$ is positive definite, then $X$ is positive definite.

In particular if $A$ has eigenvalues with positive real part and $A(A+A^*)+(A+A^*)A^*=A^2+(A^*)^2+2AA^*$ is positive definite, then $A+A^*$ is positive definite.

Proof 1 (sketch) following Horn and Johnson's Topics in Matrix Analysis:

Suppose not. We can take a kind of Jordan normal form, but making any above-diagonal $1$'s arbitrarily small, so $S^{-1}AS+(S^{-1}AS)^*$ is positive definite for some non-singular $S.$ Setting $G=SS^*$ we find that $AG+GA^*$ is positive definite. For $0\leq \theta\leq 1$ define $X_{\theta}=\theta G+(1-\theta)X.$ Note:

  • $X_0=X$ is not positive definite
  • $X_1=G$ is positive definite
  • $AX_\theta+X_\theta A^*$ is a convex combination of positive definite matrices so must be positive definite.

All the matrices $X_\theta$ have real eigenvalues, and by continuity some $X_\theta$ must have $0$ as an eigenvalue: $X_\theta v=0$ for some non-zero $v.$ This implies $v^*(AX_\theta+X_\theta A^*)v=0,$ contradicting positive definiteness of $AX_\theta+X_\theta A^*.$

Proof 2.

Consider the function defined by $f(W)=\int_0^\infty e^{-tA}We^{-tA^*}dt.$ We can compute

\begin{align*} W&=-\frac d{d\tau}\Bigr|_{\tau=0}\int_{\tau}^\infty e^{-tA}We^{-tA^*}dt\\ &=-\frac{d}{d\tau}\Bigr|_{\tau=0}\int_0^\infty e^{-\tau A}e^{-tA}We^{-tA^*}e^{-\tau A^*}dt\\ &=Af(W)+f(W)A^*. \end{align*}

This means $f$ is a left inverse of the map $X\mapsto AX+XA^*.$ Since $f$ is an $\mathbb R$-linear map from the space of Hermitian matrices to itself, and has a left inverse, it must be an isomorphism. So the solution $X$ of $AX+XA^*=W$ is unique. And if $W$ is positive definite, then so is $e^{-tA}W^{1/2}W^{1/2}e^{-tA^*},$ so $f(W)$ is positive definite by construction. This proves that if $AX+XA^*$ is positive definite then so is $X.$