Let A be a real $m \times n$ matrix. Prove that there are orthogonal matrices $P,Q$ such that $PAQ$ is diagonal, with non-negative diagonal entries.


Let A be a real $m \times n$ matrix. Prove that there are orthogonal matrices $P$ in $O_m$, and $Q$ in $O_n$ such that $PAQ$ is diagonal, with non-negative diagonal entries.

If we prove the statement for $n \times n$ invertible matrices, the statement for arbitrary $m \times n$ matrices follows easily: restrict $T_A$ to $(\ker A)^\perp$, on which it maps invertibly onto $\operatorname{im} A$, apply the square case there, and extend the resulting orthonormal bases to all of $\mathbf{R}^n$ and $\mathbf{R}^m$ (the extra diagonal entries are $0$).

The $n \times n $ case can be restated as:

Pseudo-orthogonal Lemma: Given an invertible linear transformation $T$ from $\mathbf{R}^a$ to $\mathbf{R}^a$, there exists $u\ne \mathbf{0}$ in $\mathbf{R}^a$ such that $\langle u,w\rangle=c\langle T(u),T(w)\rangle$ for all $w \in \mathbf{R}^a$, where $c = \frac{\langle u,u\rangle}{\langle T(u),T(u)\rangle}$.

Or, stated differently: there exists $u\ne \mathbf{0}$ such that $T(W^{\bot})\perp T(W)$, where $W=\operatorname{span}\{u\}$.
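As a numerical sanity check of the lemma (not part of the proof), one can scan directions on the unit circle for a sample invertible $T$ in $\mathbf{R}^2$ and look for a unit vector $u$ with $\langle T(u),T(w)\rangle=0$ for $w\perp u$; the matrix and the grid size below are arbitrary illustrative choices:

```python
import math

# Sample invertible T (an arbitrary choice for illustration).
T = [[2.0, 1.0], [0.0, 1.0]]

def apply(M, v):
    """Matrix-vector product for a 2x2 matrix."""
    return [M[0][0]*v[0] + M[0][1]*v[1],
            M[1][0]*v[0] + M[1][1]*v[1]]

def dot(a, b):
    return a[0]*b[0] + a[1]*b[1]

# For u = (cos t, sin t), the line u^perp is spanned by w = (-sin t, cos t).
# The lemma predicts some direction u with <T(u), T(w)> = 0.
residual, t = min(
    (abs(dot(apply(T, [math.cos(t), math.sin(t)]),
             apply(T, [-math.sin(t), math.cos(t)]))), t)
    for t in (k * math.pi / 100000 for k in range(100000))
)
print(residual)  # close to 0: such a u exists, up to grid resolution
```

The scan covers $[0,\pi)$, which is enough since $u$ and $-u$ span the same line.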

We show how the problem can be solved assuming the Pseudo-orthogonal Lemma.

Sub-Lemma: Given an invertible $n\times n$ matrix $B$, there exist $P$ and $Q$ in $O_n$ such that $PBQ$ is diagonal, with positive entries.

Proof: We prove this by induction on the order of the matrix.

Base case: $n=1$, the sub-lemma is trivially true.

Induction hypothesis: Given an invertible $(k-1)\times(k-1)$ matrix $B$, there exist $P^{(k-1)}$ and $Q^{(k-1)}$ in $O_{k-1}$ such that $P^{(k-1)} B Q^{(k-1)}$ is diagonal, with positive diagonal entries.

Inductive step: By the Pseudo-orthogonal Lemma there exists $w\ne \mathbf{0}$ (which we may take to be a unit vector) such that $T_B(W^{\bot})\perp T_B(W)$, where $W=\operatorname{span}\{w\}$ and $T_B$ is the linear transformation whose matrix with respect to the standard basis is $B$. Let $\{w,v_1,\dots,v_{k-1}\}$ and $\{\frac{T_B(w)}{||T_B(w)||},z_1,\dots,z_{k-1}\}$ be orthonormal bases of $\mathbf{R}^k$.

Now let $T_B^{(k-1)}$ be the restriction of $T_B$ to $W^{\bot}$, viewed as a map from $W^{\bot}$ to its image, so $T_B^{(k-1)}(x)=T_B(x)$.
Since $T_B(W^{\bot})\perp T_B(W)$, we have $\langle T_B(x), T_B(w)\rangle=0$ for all $x \in W^\bot$,

hence $\operatorname{Im}(T_B^{(k-1)}) \subseteq T_B(W)^\bot$; as $T_B$ is invertible, a dimension count gives $\operatorname{Im}(T_B^{(k-1)})=T_B(W)^\bot$.

As $\{z_1,\dots,z_{k-1}\}$ is an orthonormal basis of $T_B(W)^\bot$, we can write $T_B(v_i)=\sum_j d_{ij}z_j$.

Thus $
\begin{bmatrix}
-&-&T_B(v_1)&-&- \\
-&-&\vdots&-&- \\
-&-&T_B(v_{k-1})&-&- \\
\end{bmatrix}
=
\begin{bmatrix}
d_{11}&\cdots&d_{1,k-1} \\
\vdots&&\vdots \\
d_{k-1,1}&\cdots&d_{k-1,k-1} \\
\end{bmatrix}
\begin{bmatrix}
-&-&z_1&-&- \\
-&-&\vdots&-&- \\
-&-&z_{k-1}&-&- \\
\end{bmatrix}
$

Let $D=\begin{bmatrix}
d_{11}&\cdots&d_{1,k-1} \\
\vdots&&\vdots \\
d_{k-1,1}&\cdots&d_{k-1,k-1} \\
\end{bmatrix}$

Now by applying the induction hypothesis to $D$ we have $Q_1,Q_2$, real $(k-1)\times(k-1)$ orthogonal matrices, such that $\Lambda^{(k-1)}=Q_1DQ_2$ is diagonal with positive entries.

Writing $C$ for the matrix with rows $w,v_1,\dots,v_{k-1}$, row $i$ of $CB^t$ is $(Bc_i)^t=T_B(c_i)^t$, so

$B^t=
\begin{bmatrix}
-&-&T_B(e_1)&-&- \\
-&-&T_B(e_2)&-&- \\
-&-&\vdots&-&- \\
-&-&T_B(e_k)&-&- \\
\end{bmatrix}
=
\begin{bmatrix}
-&-&w&-&- \\
-&-&v_1&-&- \\
-&-&\vdots&-&- \\
-&-&v_{k-1}&-&- \\
\end{bmatrix}^{-1}
\begin{bmatrix}
-&-&T_B(w)&-&- \\
-&-&T_B(v_1)&-&- \\
-&-&\vdots&-&- \\
-&-&T_B(v_{k-1})&-&- \\
\end{bmatrix}
$

We note
$Q_3=\begin{bmatrix}
-&-&w&-&- \\
-&-&v_1&-&- \\
-&-&\vdots&-&- \\
-&-&v_{k-1}&-&- \\
\end{bmatrix}^{-1}$

is a $k \times k$ real orthogonal matrix, as $\{w,v_1,\dots,v_{k-1}\}$ is an orthonormal basis.
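The fact being used here is that a square matrix $C$ whose rows form an orthonormal set satisfies $CC^t=I$, so both $C$ and its inverse $C^t$ are orthogonal. A small numeric illustration in dimension two (the rotation angle is an arbitrary choice of mine):

```python
import math

t = 0.7                                   # arbitrary angle
C = [[math.cos(t), math.sin(t)],          # rows form an orthonormal basis
     [-math.sin(t), math.cos(t)]]

# C C^t: entry (i, j) is the inner product of rows i and j,
# so orthonormal rows force the identity matrix.
CCt = [[sum(C[i][k] * C[j][k] for k in range(2)) for j in range(2)]
       for i in range(2)]
```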

$
\begin{bmatrix}
-&-&T_B(w)&-&- \\
-&-&T_B(v_1)&-&- \\
-&-&\vdots&-&- \\
-&-&T_B(v_{k-1})&-&- \\
\end{bmatrix}
=M=
\begin{bmatrix}
||T_B(w)||&0&\cdots&0\\
0&d_{11}&\cdots&d_{1,k-1}\\
\vdots&\vdots&&\vdots \\
0&d_{k-1,1}&\cdots&d_{k-1,k-1} \\
\end{bmatrix}
\begin{bmatrix}
-&-&\frac{T_B(w)}{||T_B(w)||}&-&- \\
-&-&z_1&-&- \\
-&-&\vdots&-&- \\
-&-&z_{k-1}&-&- \\
\end{bmatrix}
$

We note
$Q_4=\begin{bmatrix}
-&-&\frac{T_B(w)}{||T_B(w)||}&-&- \\
-&-&z_1&-&- \\
-&-&\vdots&-&- \\
-&-&z_{k-1}&-&- \\
\end{bmatrix}$

is a $k \times k$ real orthogonal matrix, as $\{\frac{T_B(w)}{||T_B(w)||},z_1,\dots,z_{k-1}\}$ is an orthonormal basis.

Next we note $Q_5=\begin{bmatrix}
1&0\\
0&Q_1^{-1}\\
\end{bmatrix}$

and $Q_6=\begin{bmatrix}
1&0\\
0&Q_2^{-1}\\
\end{bmatrix}$

are also orthogonal matrices and, since $D=Q_1^{-1}\Lambda^{(k-1)}Q_2^{-1}$,
$\begin{bmatrix}
||T_B(w)||&0&\cdots&0\\
0&d_{11}&\cdots&d_{1,k-1}\\
\vdots&\vdots&&\vdots \\
0&d_{k-1,1}&\cdots&d_{k-1,k-1} \\
\end{bmatrix}
=
Q_5
\begin{bmatrix}
||T_B(w)||&0\\
0&\Lambda^{(k-1)} \\
\end{bmatrix}
Q_6
$

Thus $B^t=Q_3M = Q_3Q_5
\begin{bmatrix}
||T_B(w)||&0\\
0&\Lambda^{(k-1)} \\
\end{bmatrix}
Q_6Q_4$.

Since $\Lambda =
\begin{bmatrix}
||T_B(w)||&0\\
0&\Lambda^{(k-1)} \\
\end{bmatrix}$
is a diagonal matrix with positive entries ($||T_B(w)||>0$ because $B$ is invertible and $w\neq\mathbf{0}$), and since $Q_3Q_5$ and $Q_7=Q_6Q_4$ are orthogonal, transposing $B^t=Q_3Q_5\,\Lambda\,Q_7$ gives $Q_7\,B\,(Q_3Q_5)=\Lambda$, and we have proved the Sub-Lemma.
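In the $2\times 2$ case one step of the construction already diagonalizes, and it can be mimicked numerically: find the direction $u$ maximizing $||Bu||$ by grid search, complete $u$ and $Bu/||Bu||$ to orthonormal bases on both sides, and form $PBQ$. The sample matrix and the grid search below are an illustrative sketch of mine, not part of the proof:

```python
import math

B = [[3.0, 1.0], [1.0, 2.0]]   # sample invertible matrix (arbitrary choice)

def mat_vec(M, v):
    return [M[0][0]*v[0] + M[0][1]*v[1], M[1][0]*v[0] + M[1][1]*v[1]]

def norm(v):
    return math.hypot(v[0], v[1])

# Grid search for the unit vector u maximizing ||Bu|| over directions in [0, pi).
steps = 100000
k = max(range(steps),
        key=lambda k: norm(mat_vec(B, [math.cos(k*math.pi/steps),
                                       math.sin(k*math.pi/steps)])))
t = k * math.pi / steps
u = [math.cos(t), math.sin(t)]
v = [-math.sin(t), math.cos(t)]           # completes u to an orthonormal basis
Bu, Bv = mat_vec(B, u), mat_vec(B, v)
z1 = [Bu[0]/norm(Bu), Bu[1]/norm(Bu)]     # normalized image of u
z2 = [Bv[0]/norm(Bv), Bv[1]/norm(Bv)]     # normalized image of v
# P has rows z1, z2 and Q has columns u, v, so (PBQ)_{ij} = <z_i, B q_j>.
PBQ = [[z1[0]*Bu[0] + z1[1]*Bu[1], z1[0]*Bv[0] + z1[1]*Bv[1]],
       [z2[0]*Bu[0] + z2[1]*Bu[1], z2[0]*Bv[0] + z2[1]*Bv[1]]]
```

Up to the grid resolution, the off-diagonal entries of `PBQ` vanish and the diagonal entries are the positive values $||Bu||$ and $||Bv||$.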

Best Answer

Let $T^*$ denote the adjoint to $T$ (i.e. the transformation corresponding to the transpose of the matrix of $T$). We find that $\langle Tu,w \rangle = \langle u,T^*w \rangle$ holds for all $u,w \in \Bbb R^a$. Note that in the usual approach, the SVD is proved as a consequence of the spectral theorem and the fact that $(T^*T)^* = T^*T$ (i.e. $T^*T$ is self-adjoint). However, given your unconventional approach to this problem (i.e. your decision not to simply read a textbook), I assume that you want to avoid this.

With that in mind, I begin with the following claim.

Claim: There exists a unit-vector $u$ for which $\|Tu\| = \max_{x \in \Bbb R^a,\|x\| = 1} \|Tx\|$.

This is a consequence of the fact that the unit sphere $\{x: \|x\| = 1\}$ is compact and the function $f(x) = \|Tx\|$ is continuous. I now claim that, as a consequence, for any $w \perp u$ it holds that $Tw \perp Tu$. In other words, $u$ satisfies the condition of the pseudo-orthogonal lemma.

Indeed, suppose for the purpose of contradiction that $w$ is a unit vector with $w \perp u$, but $\langle Tu,Tw\rangle \neq 0$. It follows that \begin{align} \| T(\cos \theta u + \sin \theta w)\|^2 &= \langle T(\cos \theta u + \sin \theta w), T(\cos \theta u + \sin \theta w)\rangle\\ &= \|Tu\|^2\cos^2\theta + \|Tw\|^2\sin^2 \theta + 2 \langle Tu, Tw\rangle \sin \theta \cos \theta \\ & = \|Tw\|^2 + (\|Tu\|^2 - \|Tw\|^2)\cos^2 \theta + 2 \langle Tu, Tw\rangle \sin \theta \cos \theta \\ & = a + b\cos^2 \theta + c \sin \theta \cos \theta \\ & = a_0 + b_0 \cos(2\theta) + c_0 \sin(2 \theta), \end{align} where $a_0 = a + \frac{b}{2}$, $b_0 = \frac{b}{2}$, and $c_0 = \frac{c}{2} = \langle Tu, Tw\rangle \neq 0$. By the maximality of $\|Tu\|$, it should hold that the function $$ f(\theta) = a_0 + b_0 \cos(2\theta) + c_0 \sin(2 \theta) $$ attains a maximum at $\theta = 0$. However, we compute $$ f'(\theta) = -2b_0\sin(2 \theta) + 2c_0 \cos(2\theta) \implies f'(0) = 2c_0 \neq 0, $$ which means that $f$ does not attain a maximum at $\theta = 0$, which is a contradiction.
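The half-angle rewriting in the display above ($\cos^2\theta = \frac{1+\cos 2\theta}{2}$, $\sin\theta\cos\theta = \frac{\sin 2\theta}{2}$) can be checked numerically; the block below compares both sides over a grid of angles, with arbitrary sample coefficients of my choosing:

```python
import math

a, b, c = 1.3, -0.7, 2.1                 # arbitrary sample coefficients
a0, b0, c0 = a + b/2, b/2, c/2           # from the half-angle formulas

# Compare a + b*cos^2(t) + c*sin(t)*cos(t) with a0 + b0*cos(2t) + c0*sin(2t).
max_err = max(
    abs((a + b*math.cos(t)**2 + c*math.sin(t)*math.cos(t))
        - (a0 + b0*math.cos(2*t) + c0*math.sin(2*t)))
    for t in (k / 100 for k in range(629))   # covers [0, 2*pi]
)
```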
