[Math] Matrix algebra: The “magical inverse” trick

Tags: inverse, linear-algebra, matrices, matrix-equations

In relation to my Master's thesis, I have made an observation that has bothered me for some time now, and I come here hoping that some of you can shed some light on it.

In my project I work with a complex block matrix of the form

$$\begin{equation}
B = \begin{bmatrix}
u & v \\
w & x
\end{bmatrix},
\end{equation}$$

where the four submatrices $u$, $v$, $w$, and $x$ are complex square matrices of the same size (and thus $B$ itself is also a square matrix). These submatrices satisfy the following three relations:

$$\begin{align}
& uu^\dagger - vv^\dagger = I \tag{1}\\
& xx^\dagger-ww^\dagger = I \tag{2}\\
& uw^\dagger = vx^\dagger \tag{3}
\end{align}$$

Here, I use the dagger to denote the conjugate transpose.
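To make these relations concrete, here is a minimal numerical sketch (my own addition, not part of the original question). Conditions (1)–(3), together with the conjugate transpose of (3), say exactly that $B \eta B^\dagger = \eta$ for $\eta = \operatorname{diag}(I, -I)$, i.e. that $B$ is pseudo-unitary, so exponentiating a generator $K$ with $\eta K^\dagger \eta = -K$ produces example blocks:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
n = 3
rnd = lambda: rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
dag = lambda m: m.conj().T

# Generator K satisfying eta K^dagger eta = -K for eta = diag(I, -I):
# anti-Hermitian diagonal blocks, conjugate-transposed off-diagonal blocks.
a, d, b = rnd(), rnd(), rnd()
K = np.block([[(a - dag(a)) / 2, b],
              [dag(b), (d - dag(d)) / 2]])

B = expm(K)  # then B eta B^dagger = eta, which encodes (1)-(3)
u, v = B[:n, :n], B[:n, n:]
w, x = B[n:, :n], B[n:, n:]

I = np.eye(n)
print(np.allclose(u @ dag(u) - v @ dag(v), I))  # relation (1)
print(np.allclose(x @ dag(x) - w @ dag(w), I))  # relation (2)
print(np.allclose(u @ dag(w), v @ dag(x)))      # relation (3)
```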

Now using just the three equations above, we can construct the following block matrix equation:
$$\begin{equation}
\begin{bmatrix}
u & v\\
w & x
\end{bmatrix}
\begin{bmatrix}
u^\dagger & -w^\dagger\\
-v^\dagger & x^\dagger
\end{bmatrix}
=
\begin{bmatrix}
I & 0\\
0 & I
\end{bmatrix}
\tag{a}
\end{equation}$$

The only slightly non-trivial entry here is the lower-left block, which reads
$$wu^\dagger = xv^\dagger \tag{4},$$
but this is quickly seen to be the conjugate transpose of equation (3) above.

Equation (a) implies that our $B$ has an inverse, namely
$$B^{-1} =
\begin{bmatrix}
u^\dagger & -w^\dagger\\
-v^\dagger & x^\dagger
\end{bmatrix}.$$
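Before moving on, a quick numerical sanity check (my addition, reusing the hedged pseudo-unitary construction sketched above) confirms that this matrix is a two-sided inverse of $B$:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
n = 3
rnd = lambda: rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
dag = lambda m: m.conj().T

a, d, b = rnd(), rnd(), rnd()
K = np.block([[(a - dag(a)) / 2, b], [dag(b), (d - dag(d)) / 2]])
B = expm(K)
u, v, w, x = B[:n, :n], B[:n, n:], B[n:, :n], B[n:, n:]

# Candidate inverse assembled from the daggered blocks.
Binv = np.block([[dag(u), -dag(w)],
                 [-dag(v), dag(x)]])
print(np.allclose(B @ Binv, np.eye(2 * n)))  # right inverse: equation (a)
print(np.allclose(Binv @ B, np.eye(2 * n)))  # left inverse: equation (b) below
```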

And because a right inverse of a square matrix is automatically a left inverse as well, we then get the following additional block matrix equation:
$$\begin{equation}
\begin{bmatrix}
u^\dagger & -w^\dagger\\
-v^\dagger & x^\dagger
\end{bmatrix}
\begin{bmatrix}
u & v\\
w & x
\end{bmatrix}
=
\begin{bmatrix}
I & 0\\
0 & I
\end{bmatrix}
\tag{b}
\end{equation}$$

Deconstructing this block by block, we suddenly end up with four new equations:
$$\begin{align}
& u^\dagger u - w^\dagger w = I \tag{5}\\
& x^\dagger x – v^\dagger v = I \tag{6}\\
& u^\dagger v = w^\dagger x \tag{7}\\
& v^\dagger u = x^\dagger w \tag{8}
\end{align}$$
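As a sanity check (again my addition), the example blocks constructed earlier do satisfy all four new equations numerically:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
n = 3
rnd = lambda: rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
dag = lambda m: m.conj().T

a, d, b = rnd(), rnd(), rnd()
K = np.block([[(a - dag(a)) / 2, b], [dag(b), (d - dag(d)) / 2]])
B = expm(K)
u, v, w, x = B[:n, :n], B[:n, n:], B[n:, :n], B[n:, n:]

I = np.eye(n)
print(np.allclose(dag(u) @ u - dag(w) @ w, I))  # (5)
print(np.allclose(dag(x) @ x - dag(v) @ v, I))  # (6)
print(np.allclose(dag(u) @ v, dag(w) @ x))      # (7)
print(np.allclose(dag(v) @ u, dag(x) @ w))      # (8)
```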

The equations (5) to (8) followed directly from the equations (1) to (4), through the trick with the inverse of $B$. So far, so good.

My problem here is that I can't find any way through normal operations to get the equations (5) to (8) from the equations (1) to (4)!

It seems very odd to me (and to people I've spoken to) that we need to introduce these large block matrices $B$ and $B^{-1}$ to transfer between the two equivalent sets of matrix equations. Shouldn't I also be able to do this just by performing "normal" algebraic operations such as addition, subtraction and matrix multiplication, as well as maybe complex conjugation?

Or should this "magical inverse trick" be considered a natural part of your toolbox when handling matrix equations? I understand that matrix algebra behaves differently from normal scalar algebra, so it might very well be the case; I have just never heard of anything like it before.

If anyone could give me some insight on this, it would save the day for both me and many of my fellow students! Thanks.

EDIT: Thanks for your replies! It is now clear to me that the inverse trick $AB = I \Longleftrightarrow BA = I$ is a natural part of the matrix algebra toolbox, because, as Omnomnomnom points out below, there is no way of proving this equivalence through ordinary ring-style algebraic manipulations. (Indeed, the equivalence fails in general rings: the one-sided shift operator $S$ on $\ell^2$ satisfies $S^\dagger S = I$ but $S S^\dagger \neq I$.)

Still, to me, there is something else going on here as well. In order to get from equations (1)–(4) to equations (5)–(8), we actually construct the larger block matrices $B$ and $B^{-1}$ and apply the inverse trick to them, so in some sense we use the inverse trick "in a higher dimension" than the dimension of the original equations. I find it very strange that the equations cannot be rewritten just through matrix operations in the same dimension.

Or do you think there is a way of getting from equations (1)–(4) to equations (5)–(8) without constructing $B$ and $B^{-1}$, but just by somehow using the inverse trick on the original equations?

Best Answer

There is a way to get from (1)-(3) to (5)-(7) without the trick of assembling the matrices $u,v,w,x$ into a block matrix.

The special ingredient is the identity
$$ (1+rs)^{-1} = 1 - r(1+sr)^{-1}s, \tag{*} $$
which is easy to verify. Admittedly, this too may seem "magical" if you're unfamiliar with it. At least it is a universal kind of magic: it is an identity in every ring, not just for matrices. (For a reference about this identity, see mathoverflow.net/questions/31595, where it appears with a slightly different sign convention.)
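For completeness, the verification is a single line of ring algebra:
$$ (1+rs)\bigl(1 - r(1+sr)^{-1}s\bigr) = 1 + rs - (r + rsr)(1+sr)^{-1}s = 1 + rs - r(1+sr)(1+sr)^{-1}s = 1, $$
and multiplying in the other order works the same way. A quick numerical spot-check with random complex matrices (my addition; any $r$, $s$ for which both inverses exist will do):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
r = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
s = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
I = np.eye(n)

# (I + rs)^(-1) == I - r (I + sr)^(-1) s, the identity (*)
lhs = np.linalg.inv(I + r @ s)
rhs = I - r @ np.linalg.inv(I + s @ r) @ s
print(np.allclose(lhs, rhs))  # True for generic r, s
```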

Using this identity, we will derive equation (5) from (1)-(3).

First, note that $u$ and $x$ are invertible: by (1) and (2), $uu^\dagger = I + vv^\dagger$ and $xx^\dagger = I + ww^\dagger$ are positive definite, hence invertible, and so the square matrices $u$ and $x$ are invertible too.

Now write (3) as $w^\dagger (x^{-1})^\dagger = u^{-1} v$. Multiply that equation by its conjugate transpose $x^{-1} w = v^\dagger (u^{-1})^\dagger$ to get
$$ w^\dagger (x^{-1})^\dagger x^{-1} w = u^{-1} v v^\dagger (u^{-1})^\dagger. $$

The right-hand side, applying (1), is
\begin{align} u^{-1} v v^\dagger (u^{-1})^\dagger &= u^{-1} (u u^\dagger - I) (u^{-1})^\dagger \\ &= I - u^{-1} (u^{-1})^\dagger \\ &= I - (u^\dagger u)^{-1}, \end{align}
while the left-hand side, applying (2) and the identity (*), is
\begin{align} w^\dagger (x^{-1})^\dagger x^{-1} w &= w^\dagger (x x^\dagger)^{-1} w \\ &= w^\dagger (I + w w^\dagger)^{-1} w \\ &= I - (I + w^\dagger w)^{-1}. \end{align}

Equating the two sides gives $(u^\dagger u)^{-1} = (I + w^\dagger w)^{-1}$, i.e. $u^\dagger u = I + w^\dagger w$, which is equation (5).
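To see this chain of reductions in action, here is a numerical trace (my addition, reusing the hedged pseudo-unitary example constructed earlier in this post):

```python
import numpy as np
from scipy.linalg import expm, inv

rng = np.random.default_rng(0)
n = 3
rnd = lambda: rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
dag = lambda m: m.conj().T

# Example blocks satisfying (1)-(3), as sketched earlier.
a, d, b = rnd(), rnd(), rnd()
K = np.block([[(a - dag(a)) / 2, b], [dag(b), (d - dag(d)) / 2]])
B = expm(K)
u, v, w, x = B[:n, :n], B[:n, n:], B[n:, :n], B[n:, n:]
I = np.eye(n)

lhs = dag(w) @ dag(inv(x)) @ inv(x) @ w   # w^+ (x^-1)^+ x^-1 w
rhs = inv(u) @ v @ dag(v) @ dag(inv(u))   # u^-1 v v^+ (u^-1)^+
print(np.allclose(lhs, rhs))                      # the two products agree
print(np.allclose(rhs, I - inv(dag(u) @ u)))      # right-hand side via (1)
print(np.allclose(lhs, I - inv(I + dag(w) @ w)))  # left-hand side via (2) and (*)
print(np.allclose(dag(u) @ u, I + dag(w) @ w))    # conclusion: equation (5)
```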

Equation (6) can be derived in the same way.

Equation (7) is easy to derive once (1)-(6) are all available (the steps below use (5) and (3), a push-through rearrangement, (2), and $x^\dagger (x^{-1})^\dagger = I$): \begin{align} u^\dagger v &= u^\dagger u u^{-1} v \\ &= (I + w^\dagger w) w^\dagger (x^{-1})^\dagger \\ &= w^\dagger (I + w w^\dagger) (x^{-1})^\dagger \\ &= w^\dagger x x^\dagger (x^{-1})^\dagger \\ &= w^\dagger x. \end{align}

This way, we've proven (5)-(7) from (1)-(3), without going up to the higher-dimensional "big picture" of the generalized unitary matrix $B$.
