I understand why $B$ is symmetric, but I cannot see why $B$ is positive definite.

I am reading "Multivariable Mathematics" by Theodore Shifrin.

Suppose $A$ is a symmetric matrix with associated quadratic form $\mathcal{Q}$.

Suppose $\mathcal{Q}$ is positive definite. Then, in particular, $\mathcal{Q}(\mathbf{e}_1)=a_{11}>0$, so we can write
$$A=\begin{bmatrix}
1\\
\frac{a_{12}}{a_{11}}\\
\vdots\\
\frac{a_{1n}}{a_{11}}\\
\end{bmatrix}
\begin{bmatrix}
a_{11}\\
\end{bmatrix}
\begin{bmatrix}
1&\frac{a_{12}}{a_{11}}&\cdots&\frac{a_{1n}}{a_{11}}
\end{bmatrix}
+\begin{bmatrix}
0&0&\cdots&0 \\
0&b_{11}&\cdots&b_{1\,n-1}\\
\vdots&\vdots&\ddots&\vdots \\
0&b_{n-1\,1}&\cdots&b_{n-1\,n-1} \\
\end{bmatrix},$$
where $\begin{bmatrix}
b_{11}&\cdots&b_{1\,n-1}\\
\vdots&\ddots&\vdots \\
b_{n-1\,1}&\cdots&b_{n-1\,n-1} \\
\end{bmatrix}$ is also symmetric and the quadratic form on $\mathbb{R}^{n-1}$ associated to $\begin{bmatrix}
b_{11}&\cdots&b_{1\,n-1}\\
\vdots&\ddots&\vdots \\
b_{n-1\,1}&\cdots&b_{n-1\,n-1} \\
\end{bmatrix}$ is likewise positive definite.

I understand that $\begin{bmatrix}
b_{11}&\cdots&b_{1\,n-1}\\
\vdots&\ddots&\vdots \\
b_{n-1\,1}&\cdots&b_{n-1\,n-1} \\
\end{bmatrix}$ is also symmetric because both $A$ and $\begin{bmatrix}
1\\
\frac{a_{12}}{a_{11}}\\
\vdots\\
\frac{a_{1n}}{a_{11}}\\
\end{bmatrix}
\begin{bmatrix}
a_{11}\\
\end{bmatrix}
\begin{bmatrix}
1&\frac{a_{12}}{a_{11}}&\cdots&\frac{a_{1n}}{a_{11}}
\end{bmatrix}$ are symmetric.
But I cannot understand why $\begin{bmatrix}
b_{11}&\cdots&b_{1\,n-1}\\
\vdots&\ddots&\vdots \\
b_{n-1\,1}&\cdots&b_{n-1\,n-1} \\
\end{bmatrix}$ is also a positive definite matrix.

I verified the following equality in coordinates (please see Professor Shifrin's answer below):

$a_{11}^2 B\mathbf{x}\cdot\mathbf{x}=\mathcal Q\big(a_{11}\tilde{\mathbf x} - (\tilde{\mathbf x}\cdot\mathbf a_1)\mathbf e_1\big)$

Let $\mathbf{x}=\begin{bmatrix}x_1\\\vdots\\x_{n-1}\\\end{bmatrix}$.

Let $\tilde{\mathbf{x}}=\begin{bmatrix} 0\\\mathbf{x}\\\end{bmatrix}$.
$b_{ij}=a_{i+1\,j+1}-\frac{a_{i+1\,1}}{a_{11}}a_{1\,j+1}$ for $i,j$ such that $1\leq i\leq n-1,1\leq j\leq n-1$.
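
As a sanity check on this formula for $b_{ij}$, here is a minimal Python sketch that builds $B$ from $A$ and confirms that $\frac{1}{a_{11}}\mathbf a_1\mathbf a_1^T$ plus the zero-padded $B$ gives back $A$. The function names `reduced_matrix` and `rebuild` and the $3\times 3$ matrix are illustrative choices, not taken from the text; exact rational arithmetic is used so the comparison is not affected by rounding.

```python
from fractions import Fraction

def reduced_matrix(A):
    """Return B with b_ij = a_{i+1,j+1} - a_{i+1,1} a_{1,j+1} / a_11.

    The formula uses 1-based indices; the Python lists are 0-based.
    """
    n = len(A)
    a11 = Fraction(A[0][0])
    return [[A[i][j] - A[i][0] * A[0][j] / a11 for j in range(1, n)]
            for i in range(1, n)]

def rebuild(A, B):
    """Reassemble (1/a_11) a_1 a_1^T + [[0, 0], [0, B]] for comparison with A."""
    n = len(A)
    a11 = Fraction(A[0][0])
    rank_one = [[A[i][0] * A[j][0] / a11 for j in range(n)] for i in range(n)]
    padded = [[0] * n] + [[0] + row for row in B]
    return [[rank_one[i][j] + padded[i][j] for j in range(n)] for i in range(n)]

# Illustrative symmetric matrix (an assumption, not from the text).
A = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 2]]
B = reduced_matrix(A)
print(B)                   # entries 5/2, 1, 1, 2 as Fractions
print(rebuild(A, B) == A)  # True
```

For this matrix, $B=\begin{bmatrix}5/2&1\\1&2\end{bmatrix}$, which is visibly symmetric.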

$$B\mathbf{x}\cdot\mathbf{x}=\sum_{i=1}^{n-1} b_{ii}x_i^2+\sum_{1\leq i<j\leq n-1} 2b_{ij}x_i x_j\\=\sum_{i=1}^{n-1}(a_{i+1\,i+1}-\frac{a_{1\,i+1}^2}{a_{11}})x_i^2+\sum_{1\leq i<j\leq n-1} 2(a_{i+1\,j+1}-\frac{a_{1\,i+1}a_{1\,j+1}}{a_{11}})x_i x_j.$$
$$a_{11}^2 B\mathbf{x}\cdot\mathbf{x}=\sum_{i=1}^{n-1}(a_{11}^2 a_{i+1\,i+1}-a_{11} a_{1\,i+1}^2)x_i^2+\sum_{1\leq i<j\leq n-1} 2(a_{11}^2 a_{i+1\,j+1}-a_{11} a_{1\,i+1}a_{1\,j+1})x_i x_j.$$

On the left-hand side, the coefficient of $x_i^2$ for each $i\in\{1,\dots,n-1\}$ is $a_{11}^2 a_{i+1\,i+1}-a_{11} a_{1\,i+1}^2$.
The coefficient of $x_i x_j$ for each $i,j$ such that $1\leq i<j\leq n-1$ is $2(a_{11}^2 a_{i+1\,j+1}-a_{11} a_{1\,i+1}a_{1\,j+1})$.

$$a_{11}\tilde{\mathbf x} - (\tilde{\mathbf x}\cdot\mathbf a_1)\mathbf e_1=\begin{bmatrix}0\\a_{11} x_1\\\vdots\\a_{11} x_{n-1}\\\end{bmatrix}-\begin{bmatrix}a_{21}x_1+a_{31}x_2+\dots+a_{n1}x_{n-1}\\0\\\vdots\\0\\\end{bmatrix}\\=\begin{bmatrix}-(a_{21}x_1+a_{31}x_2+\dots+a_{n1}x_{n-1})\\a_{11} x_1\\\vdots\\a_{11} x_{n-1}\\\end{bmatrix}.$$
$$\mathcal Q\big(a_{11}\tilde{\mathbf x} - (\tilde{\mathbf x}\cdot\mathbf a_1)\mathbf e_1\big)=a_{11}(a_{21}x_1+a_{31}x_2+\dots+a_{n1}x_{n-1})^2+\sum_{i=2}^n a_{ii} (a_{11} x_{i-1})^2\\
+\sum_{2\leq j\leq n} 2 a_{1j}(-a_{21}x_1-a_{31}x_2-\dots-a_{n1}x_{n-1})(a_{11}x_{j-1})\\
+\sum_{2\leq i<j\leq n} 2 a_{ij} (a_{11}^2 x_{i-1} x_{j-1}).$$

On the right-hand side, the coefficient of $x_i^2$ for each $i\in\{1,\dots,n-1\}$ is $a_{11}a_{i+1\,1}^2+a_{i+1\,i+1}a_{11}^2+2a_{1\,i+1}(-a_{i+1\,1})a_{11}=a_{11}a_{1\,i+1}^2+a_{11}^2a_{i+1\,i+1}-2a_{11}a_{1\,i+1}^2=a_{11}^2 a_{i+1\,i+1}-a_{11} a_{1\,i+1}^2.$
The coefficient of $x_i x_j$ for each $i,j$ such that $1\leq i<j\leq n-1$ is $a_{11}(2a_{i+1\,1}a_{j+1\,1})+2a_{1\,j+1}(-a_{i+1\,1})a_{11}+2a_{1\,i+1}(-a_{j+1\,1})a_{11}+2a_{i+1\,j+1}a_{11}^2=2a_{11}a_{1\,i+1}a_{1\,j+1}-2a_{11}a_{1\,i+1}a_{1\,j+1}-2a_{11}a_{1\,i+1}a_{1\,j+1}+2a_{11}^2a_{i+1\,j+1}= 2(a_{11}^2 a_{i+1\,j+1}-a_{11} a_{1\,i+1}a_{1\,j+1}).$
These coefficients agree with those of $a_{11}^2 B\mathbf{x}\cdot\mathbf{x}$, so the equality holds.

I verified the following equality without coordinates (Please see Professor Shifrin's comment below):

$a_{11}^2 B\mathbf{x}\cdot\mathbf{x}=\mathcal Q\big(a_{11}\tilde{\mathbf x} - (\tilde{\mathbf x}\cdot\mathbf a_1)\mathbf e_1\big)$

$$A=\begin{bmatrix}
1\\
\frac{a_{12}}{a_{11}}\\
\vdots\\
\frac{a_{1n}}{a_{11}}\\
\end{bmatrix}
\begin{bmatrix}
a_{11}\\
\end{bmatrix}
\begin{bmatrix}
1&\frac{a_{12}}{a_{11}}&\cdots&\frac{a_{1n}}{a_{11}}
\end{bmatrix}
+\begin{bmatrix}
0&0&\cdots&0 \\
0&b_{11}&\cdots&b_{1\,n-1}\\
\vdots&\vdots&\ddots&\vdots \\
0&b_{n-1\,1}&\cdots&b_{n-1\,n-1} \\
\end{bmatrix}\\=
\frac{1}{a_{11}}\mathbf a_1\mathbf a_1^T
+\begin{bmatrix}
0&0&\cdots&0 \\
0&b_{11}&\cdots&b_{1\,n-1}\\
\vdots&\vdots&\ddots&\vdots \\
0&b_{n-1\,1}&\cdots&b_{n-1\,n-1} \\
\end{bmatrix}.
$$
$$
\mathcal Q\big(\tilde{\mathbf x}\big)=A\tilde{\mathbf x}\cdot\tilde{\mathbf x}=\tilde{\mathbf x}^T A\tilde{\mathbf x}=\frac{1}{a_{11}}\tilde{\mathbf x}^T\mathbf a_1\mathbf a_1^T\tilde{\mathbf x}+\mathbf x^T B\mathbf x=\frac{1}{a_{11}}(\mathbf a_1\cdot\tilde{\mathbf x})^2+B\mathbf x\cdot\mathbf x.
$$
$$
\mathcal Q\big(a_{11}\tilde{\mathbf x} - (\tilde{\mathbf x}\cdot\mathbf a_1)\mathbf e_1\big)=
a_{11}^2\mathcal Q\big(\tilde{\mathbf x}\big)-2a_{11}(\tilde{\mathbf x}\cdot\mathbf a_1)A\tilde{\mathbf x}\cdot\mathbf e_1+
(\tilde{\mathbf x}\cdot\mathbf a_1)^2\mathcal Q\big(\mathbf e_1\big)=\\
a_{11}^2\mathcal Q\big(\tilde{\mathbf x}\big)-2a_{11}(\mathbf a_1\cdot\tilde{\mathbf x})(\mathbf a_1\cdot\tilde{\mathbf x})+
(\mathbf a_1\cdot\tilde{\mathbf x})^2a_{11}=\\
a_{11}^2\mathcal Q\big(\tilde{\mathbf x}\big)-a_{11}(\mathbf a_1\cdot\tilde{\mathbf x})^2=
a_{11}^2 B\mathbf{x}\cdot\mathbf{x}.
$$
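
The identity can also be checked numerically. The following is a minimal Python sketch under stated assumptions: the helper names `quad` and `check_identity`, the matrix $A$, and the vector $\mathbf x=(1,-2)$ are illustrative and not from the text. It evaluates both sides of $a_{11}^2 B\mathbf{x}\cdot\mathbf{x}=\mathcal Q\big(a_{11}\tilde{\mathbf x} - (\tilde{\mathbf x}\cdot\mathbf a_1)\mathbf e_1\big)$ in exact rational arithmetic.

```python
from fractions import Fraction

def quad(M, v):
    """Quadratic form v^T M v."""
    n = len(v)
    return sum(M[i][j] * v[i] * v[j] for i in range(n) for j in range(n))

def check_identity(A, x):
    """Return (a_11^2 Bx.x, Q(a_11 x~ - (x~ . a_1) e_1)) for x in R^(n-1)."""
    n = len(A)
    a11 = Fraction(A[0][0])
    # B as defined by b_ij = a_{i+1,j+1} - a_{i+1,1} a_{1,j+1} / a_11
    B = [[A[i][j] - A[i][0] * A[0][j] / a11 for j in range(1, n)]
         for i in range(1, n)]
    x_tilde = [0] + list(x)                   # prepend a zero coordinate
    a1 = [A[i][0] for i in range(n)]          # first column of A
    dot = sum(s * t for s, t in zip(x_tilde, a1))
    v = [a11 * t for t in x_tilde]
    v[0] -= dot                               # subtract (x~ . a_1) e_1
    return a11 ** 2 * quad(B, x), quad(A, v)

# Illustrative data (assumptions, not from the text).
A = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 2]]
lhs, rhs = check_identity(A, [1, -2])
print(lhs, rhs)  # 26 26
```

A single numerical agreement is of course not a proof; it only guards against sign or index slips in the derivation above.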

Best Answer

You ask a good question. Because it was not central to the text, I provided only a "sketch of a proof," but you did omit the hint I gave that will answer your question. In the text, I suggested looking at $\mathcal Q(a_{11}\mathbf e_2-a_{12}\mathbf e_1)$ to see why the first entry of $B$ must be positive. Indeed, $$a_{11}^2 b_{11} = a_{11}(a_{11}a_{22}-a_{12}^2) = \mathcal Q(a_{11}\mathbf e_2-a_{12}\mathbf e_1)>0,$$ so $b_{11}>0$. You can generalize this immediately to get all the diagonal entries of $B$. And it gives the clue to the general situation.
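
As a concrete illustration of this hint (the matrix below is an arbitrary example, not one from the text), take
$$A=\begin{bmatrix}2&1\\1&3\end{bmatrix},\qquad b_{11}=a_{22}-\frac{a_{12}^2}{a_{11}}=3-\frac12=\frac52,\qquad a_{11}^2b_{11}=10.$$
On the other hand, $a_{11}\mathbf e_2-a_{12}\mathbf e_1=\begin{bmatrix}-1\\2\end{bmatrix}$, and
$$\mathcal Q\left(\begin{bmatrix}-1\\2\end{bmatrix}\right)=2(-1)^2+2(1)(-1)(2)+3(2)^2=2-4+12=10,$$
which matches $a_{11}(a_{11}a_{22}-a_{12}^2)=2(6-1)=10$.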

Given $\mathbf x\in\Bbb R^{n-1}$, let $\tilde{\mathbf x} = \left[\begin{matrix} 0 \\ \mathbf x\end{matrix}\right]$. I will leave it to you to verify that $$a_{11}^2 B\mathbf x\cdot\mathbf x = \mathcal Q\big(a_{11}\tilde{\mathbf x} - (\tilde{\mathbf x}\cdot\mathbf a_1)\mathbf e_1\big) > 0$$ whenever $\tilde{\mathbf x}\ne \mathbf 0$. Here $\mathbf a_1$, as usual in the text, is the first column vector of $A$.
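
One way to see this reduction in action is the following minimal Python sketch, which tests positive definiteness by repeatedly checking $a_{11}>0$ and passing to $B$. It is only an illustration under assumptions: the function name `is_positive_definite` and the test matrices are hypothetical, and the sketch also relies on the converse direction (if $a_{11}>0$ and $B$ is positive definite, then so is $A$), which follows from writing $A\mathbf y\cdot\mathbf y=\frac{1}{a_{11}}(\mathbf a_1\cdot\mathbf y)^2+B\mathbf y'\cdot\mathbf y'$, where $\mathbf y'$ denotes the last $n-1$ coordinates of $\mathbf y$.

```python
from fractions import Fraction

def is_positive_definite(A):
    """Recursive test for a symmetric matrix given as a list of rows.

    A is positive definite iff a_11 > 0 and the reduced matrix B is
    positive definite; the empty matrix ends the recursion.
    """
    if not A:
        return True
    a11 = Fraction(A[0][0])
    if a11 <= 0:
        return False
    n = len(A)
    # b_ij = a_{i+1,j+1} - a_{i+1,1} a_{1,j+1} / a_11 (1-based in the formula)
    B = [[A[i][j] - A[i][0] * A[0][j] / a11 for j in range(1, n)]
         for i in range(1, n)]
    return is_positive_definite(B)

# Illustrative inputs (assumptions, not from the text).
print(is_positive_definite([[2, 1, 0], [1, 3, 1], [0, 1, 2]]))  # True
print(is_positive_definite([[1, 2], [2, 1]]))                   # False
```

In the second example $a_{11}=1>0$ but $B=[-3]$, so the form takes a negative value, for instance at $(-2,1)$.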

The criterion for positive (negative) definiteness in terms of determinants of principal minors is not covered in this text, since it was intended to be primarily a multivariable analysis text. But the criterion in terms of positive (negative) eigenvalues is treated in the exercises in Section 4 of Chapter 9. Sylvester's law of inertia appears in the final exercise (and was also proved in the final lecture posted in my YouTube lectures).
