Normed Spaces – Understanding Weighted Inner Product and Weighted Norms

normed-spaces

I am reading a book in which, on page 27, the following definitions of weighted inner products and weighted norms are given.

Let $M$ and $N$ be Hermitian positive definite matrices of order $m$ and $n$ respectively. The weighted inner products in $\mathbb{C}^{m}$ and $\mathbb{C}^{n}$ are

$(x, y)_{M} = y^{*}Mx$ , $x, y \in \mathbb{C}^{m}$ and $(x, y)_{N} = y^{*}Nx$ , $x, y \in \mathbb{C}^{n}$ ….$(1)$

The definitions of weighted vector norms are

$\|x\|_{M} = (x, x)^\frac{1}{2}_{M} = (x^{*}Mx)^\frac{1}{2} = \|M^\frac{1}{2} x\|_{2}$, $x\in\mathbb{C}^{m}$ ….$(2)$

$\|x\|_{N} = (x, x)^\frac{1}{2}_{N} = (x^{*}Nx)^\frac{1}{2} = \|N^\frac{1}{2} x\|_{2}$, $x\in\mathbb{C}^{n}$ ….$(3)$

The definitions of the weighted matrix norms are

$\|A\|_{MN} = \max_{\|x\|_{N} = 1}{\|Ax\|_{M}}, \quad x \in\mathbb{C}^n, \; A\in \mathbb{C}^{m\times n}$

$\|B\|_{NM} = \max_{\|x\|_{M} = 1}{\|Bx\|_{N}}, \quad x \in\mathbb{C}^m, \; B\in \mathbb{C}^{n\times m}$

Such a norm is sometimes called an operator norm subordinate to a vector norm. It is easy to verify that

$\|A\|_{MN} = \|M^{1/2} A N^{-1/2} \|_{2}$ ….$(4)$

$\|B\|_{NM} = \|N^{1/2} B M^{-1/2} \|_{2}$ ….$(5)$

Could anybody explain the significance of weighted norms? Why do we need them? In $(2)$, how do we get $\|M^{1/2} x\|_{2}$? How can we find the square root of the matrix $M$? And how do we obtain equations $(4)$ and $(5)$?

I would be very thankful for any help and suggestions.

Best Answer

Weighted norms have a variety of uses. Suppose you're measuring the size of vectors that are coming out of some random or physical process, and they look like this: $$ \begin{bmatrix} +5.4\times 10^{-10} \\ -1.3\times 10^{+6} \\ \end{bmatrix} \begin{bmatrix} +1.8\times 10^{-9} \\ -4.3\times 10^{+5} \\ \end{bmatrix} \begin{bmatrix} -2.3\times 10^{-9} \\ +3.4\times 10^{+5} \\ \end{bmatrix} \begin{bmatrix} +8.6\times 10^{-10} \\ +3.6\times 10^{+6} \\ \end{bmatrix} \begin{bmatrix} -3.2\times 10^{-10} \\ +2.7\times 10^{+6} \\ \end{bmatrix} $$ Would it make sense to use the standard Euclidean norm $\|\cdot\|_2$ to measure the size of these vectors? I say no. The values of $x_1$ hover around $10^{-9}$, $x_2$ around $10^6$. Since $x_1$ is so much smaller than $x_2$, $\|x\|_2\approx |x_2|$. You're losing information about $x_1$ with this measurement.

What you might choose to do in this circumstance is select a diagonally weighted norm $\|x\|_D\triangleq\sqrt{x^*Dx}$, with the values of $D_{ii}>0$ chosen to "normalize" each entry. For instance, I might choose $D_{11}=10^{18}$ and $D_{22}=10^{-12}$. The values of $D^{1/2} x$ are $$ \begin{bmatrix} +0.54 \\ -1.3 \end{bmatrix} \begin{bmatrix} +1.8 \\ -0.43 \end{bmatrix} \begin{bmatrix} -2.3 \\ +0.34 \end{bmatrix} \begin{bmatrix} +0.86 \\ +3.6 \end{bmatrix} \begin{bmatrix} -0.32 \\ +2.7 \end{bmatrix} $$ Now small relative changes in $x_1$ will have approximately the same impact on the norm $\|x\|_D=\sqrt{x^*Dx}=\|D^{1/2}x\|_2$ as small relative changes in $x_2$. This is probably a more informative norm for this set of vectors than a standard Euclidean norm.
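To see this concretely, here is a small NumPy sketch using the sample vectors and the hypothetical weights $D_{11}=10^{18}$, $D_{22}=10^{-12}$ from above; the Euclidean norm of every sample is essentially $|x_2|$, while the weighted norm responds to both entries:

```python
import numpy as np

# Sample vectors from the example above: x1 ~ 1e-9, x2 ~ 1e6.
xs = np.array([
    [+5.4e-10, -1.3e+6],
    [+1.8e-09, -4.3e+5],
    [-2.3e-09, +3.4e+5],
    [+8.6e-10, +3.6e+6],
    [-3.2e-10, +2.7e+6],
])

# Diagonal weights chosen to "normalize" each entry.
D = np.diag([1e18, 1e-12])
D_half = np.sqrt(D)  # the square root of a diagonal matrix is entrywise

for x in xs:
    euclidean = np.linalg.norm(x)          # ||x||_2, dominated by |x2|
    weighted = np.linalg.norm(D_half @ x)  # ||x||_D = ||D^{1/2} x||_2
    print(f"||x||_2 = {euclidean:.3e}   ||x||_D = {weighted:.3f}")
```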

Diagonally weighted norms are probably the easiest to justify intuitively, but in fact more general weighted norms have their uses. For instance, they come up often in proofs about Newton's method.

For information about matrix square roots, Wikipedia really is not a bad place to start, or any reasonably good linear algebra text. Square roots exist for any Hermitian positive semidefinite matrix---that is, any Hermitian matrix with nonnegative real eigenvalues.

Two types of square roots are typically considered for a real symmetric/complex Hermitian PSD matrix $M$. The lower triangular Cholesky factor $L$ satisfying $M=LL^*$ is simpler to compute in practice. But the symmetric/Hermitian square root $Q=M^{1/2}$ satisfying $M=Q^2$ is often preferred in proofs, because then you don't have to keep track of transposes, and because sometimes it is helpful for $Q$ and $M$ to share eigenvectors.
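As a sketch (not from the book), here is how one could compute both factorizations in NumPy for a randomly generated Hermitian positive definite $M$, and check that either one reproduces the weighted norm:

```python
import numpy as np

rng = np.random.default_rng(0)

# A random Hermitian positive definite M, purely for illustration.
G = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
M = G.conj().T @ G + 4 * np.eye(4)

# Lower-triangular Cholesky factor: M = L L^*
L = np.linalg.cholesky(M)

# Hermitian square root Q = M^{1/2}: diagonalize, take square roots of the eigenvalues.
w, V = np.linalg.eigh(M)
Q = V @ np.diag(np.sqrt(w)) @ V.conj().T

print(np.allclose(L @ L.conj().T, M))  # True: M = L L^*
print(np.allclose(Q @ Q, M))           # True: M = Q^2

# Both factors induce the same weighted norm: x^* M x = ||L^* x||_2^2 = ||Q x||_2^2.
x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
print(np.allclose(x.conj() @ M @ x, np.linalg.norm(Q @ x) ** 2))
print(np.allclose(np.linalg.norm(L.conj().T @ x), np.linalg.norm(Q @ x)))
```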

With the symmetric square root defined, the derivation for (2) is straightforward: $$\|M^{1/2}x\|_2 = \left(x^*M^{*/2}M^{1/2}x\right)^{1/2} = \left(x^*M^{1/2}M^{1/2}x\right)^{1/2} = \left(x^*Mx\right)^{1/2} = \|x\|_M.$$ Here is a derivation for (4). First, we write the spectral norm as a maximization and absorb the factor $M^{1/2}$ into the weighted norm: $$\|M^{1/2}AN^{-1/2}\|_2 = \max_{\|x\|_2=1} \|M^{1/2}(AN^{-1/2}x)\|_2 = \max_{\|x\|_2=1} \|AN^{-1/2}x\|_M.$$ Now we define $y=N^{-1/2} x$, or $x=N^{1/2} y$: $$\max_{\|x\|_2=1} \|AN^{-1/2}x\|_M = \max_{\|N^{1/2} y\|_2=1} \|Ay\|_M = \max_{\|y\|_N=1}\|Ay\|_M = \|A\|_{MN}.$$ The derivation for (5) is identical, with the roles of $M$ and $N$ interchanged.
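If you want a numerical sanity check on (4), here is a hedged NumPy sketch with randomly generated $M$, $N$, and $A$ (the helpers `random_hpd` and `sqrtm_hermitian` are illustrative names, not from the book). It also exhibits the maximizer: $y = N^{-1/2}v$, where $v$ is the leading right singular vector of $M^{1/2}AN^{-1/2}$, satisfies $\|y\|_N = 1$ and $\|Ay\|_M = \|M^{1/2}AN^{-1/2}\|_2$:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 5

def random_hpd(k):
    """A random Hermitian positive definite matrix of order k (illustrative only)."""
    G = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
    return G.conj().T @ G + k * np.eye(k)

def sqrtm_hermitian(S):
    """Hermitian square root via the eigendecomposition S = V diag(w) V^*."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.sqrt(w)) @ V.conj().T

M, N = random_hpd(m), random_hpd(n)
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

M_half = sqrtm_hermitian(M)
N_half = sqrtm_hermitian(N)
N_half_inv = np.linalg.inv(N_half)

# Right-hand side of (4): the spectral norm of M^{1/2} A N^{-1/2}.
B = M_half @ A @ N_half_inv
rhs = np.linalg.norm(B, 2)

# Left-hand side: maximize ||A y||_M over ||y||_N = 1.  The maximizer is
# y = N^{-1/2} v, with v the leading right singular vector of B.
_, _, Vh = np.linalg.svd(B)
v = Vh[0].conj()
y = N_half_inv @ v

print(np.linalg.norm(N_half @ y))           # ||y||_N = 1
print(np.linalg.norm(M_half @ A @ y), rhs)  # equal: ||A y||_M attains ||A||_{MN}
```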
