Why do we need an orthonormal basis to represent the adjoint of an operator?

adjoint-operators, linear-algebra, matrices

For any linear operator on a finite-dimensional inner product space, we can obtain an orthonormal basis via the Gram–Schmidt process.

But why is it necessary to define the adjoint of the operator using an orthonormal basis?

Presumably it helps with computation, but why do we define it that way?

Best Answer

We are free to define what is meant by the adjoint of an operator and the adjoint of a matrix without any mention of a basis, orthonormal or otherwise. Indeed, we usually don't mention bases in either definition. Taking $\mathbb{F}$ to be either $\mathbb{R}$ or $\mathbb{C}$, the definitions may be stated as:

If $V$ and $W$ are finite-dimensional inner product spaces over $\mathbb{F}$, and $T:V\to W$ is linear, then the adjoint operator $T^{*}:W\to V$ is the unique operator with the property that
$$\left<Tv,w\right>=\left<v,T^{*}w\right>$$
for all $v\in V$ and for all $w\in W$.
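As a quick numerical check of this defining property, here is a minimal sketch, assuming the standard inner product on $\mathbb{C}^{k}$ (linear in the first slot), under which the adjoint acts by the conjugate transpose, as the matrix definition below makes explicit. The matrix `M` and the vectors are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative operator T: C^3 -> C^2, given by its matrix M in the
# standard bases; with the standard inner product, T* acts by M.conj().T.
M = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
w = rng.standard_normal(2) + 1j * rng.standard_normal(2)

def inner(x, y):
    # Standard inner product <x, y> = sum_k x_k * conj(y_k): linear in
    # the first slot, conjugate-linear in the second.
    return np.sum(x * y.conj())

# Defining property: <Tv, w> == <v, T*w>, up to floating-point error.
print(np.isclose(inner(M @ v, w), inner(v, M.conj().T @ w)))  # True
```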

If $\mathbf{A}$ is a matrix with entries in $\mathbb{F}$, then the adjoint of $\mathbf{A}$ is
$$\mathbf{A}^{*}=\overline{\mathbf{A}^{\top}}.$$

But when we define two meanings for the same word, we'd like the two meanings to be somehow related. In the case of the word adjoint, if the matrix $\mathbf{A}$ represents the operator $T$ with respect to bases $\alpha$ and $\beta$ of $V$ and $W$, respectively, then we'd like the adjoint of $\mathbf{A}$ to coincide with the matrix of the adjoint of $T$ with respect to $\beta$ and $\alpha$. That is, we want
$$\left(\left[T\right]_{\alpha}^{\beta}\right)^{*}=\left[T^{*}\right]_{\beta}^{\alpha}.$$

The last equation is NOT true in general, but it is true when both $\alpha$ and $\beta$ are orthonormal. So that's where orthonormality becomes "necessary" in a sense. This is a result, however, not a definition.

And even with the same definitions above, we can still write $\left[T^{*}\right]_{\beta}^{\alpha}$ in terms of $\left[T\right]_{\alpha}^{\beta}$ without assuming $\alpha$ and $\beta$ are orthonormal. Letting $\alpha=\left\{ \alpha_{1},\ldots,\alpha_{m}\right\}$ and $\beta=\left\{\beta_{1},\ldots,\beta_{n}\right\}$, the general formula is
$$\left[T^{*}\right]_{\beta}^{\alpha}=\mathbf{C}^{-1}\left(\left[T\right]_{\alpha}^{\beta}\right)^{*}\mathbf{B},$$
where $\mathbf{C}$ and $\mathbf{B}$ are the Gram matrices
$$\mathbf{C}=\left(\begin{array}{cccc} \left<\alpha_{1},\alpha_{1}\right> & \left<\alpha_{2},\alpha_{1}\right> & \cdots & \left<\alpha_{m},\alpha_{1}\right>\\ \left<\alpha_{1},\alpha_{2}\right> & \left<\alpha_{2},\alpha_{2}\right> & \cdots & \left<\alpha_{m},\alpha_{2}\right>\\ \vdots & & & \vdots\\ \left<\alpha_{1},\alpha_{m}\right> & \left<\alpha_{2},\alpha_{m}\right> & \cdots & \left<\alpha_{m},\alpha_{m}\right> \end{array}\right)$$
and
$$\mathbf{B}=\left(\begin{array}{cccc} \left<\beta_{1},\beta_{1}\right> & \left<\beta_{2},\beta_{1}\right> & \cdots & \left<\beta_{n},\beta_{1}\right>\\ \left<\beta_{1},\beta_{2}\right> & \left<\beta_{2},\beta_{2}\right> & \cdots & \left<\beta_{n},\beta_{2}\right>\\ \vdots & & & \vdots\\ \left<\beta_{1},\beta_{n}\right> & \left<\beta_{2},\beta_{n}\right> & \cdots & \left<\beta_{n},\beta_{n}\right> \end{array}\right).$$

When orthonormal bases of $V$ and $W$ are not readily available, this formula for $\left[T^{*}\right]_{\beta}^{\alpha}$ is usually more computationally efficient than applying Gram–Schmidt and change-of-basis matrices. Note, however, that the formula assumes inner products are linear in the first slot. If one prefers the definition of inner product which requires linearity in the second slot, then replace $\mathbf{C}$ and $\mathbf{B}$ above by their transposes.
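To make both points concrete, here is a minimal numerical sketch (the matrices `M`, `A`, and `Bm` are arbitrary illustrative choices, and the standard inner product on $\mathbb{C}^{k}$ is assumed): it checks that the naive conjugate transpose of $\left[T\right]_{\alpha}^{\beta}$ fails to equal $\left[T^{*}\right]_{\beta}^{\alpha}$ for generic bases but succeeds for orthonormal ones, and that the Gram-matrix formula recovers $\left[T^{*}\right]_{\beta}^{\alpha}$ in general.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 4

# Operator T: C^m -> C^n via its standard matrix M; with the standard
# inner product (linear in the first slot), T* has standard matrix M*.
M = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))

# Generic (non-orthonormal) bases: columns of A for V, columns of Bm for W.
A = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
Bm = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

T_ab = np.linalg.solve(Bm, M @ A)               # [T]_alpha^beta
Tstar_ba = np.linalg.solve(A, M.conj().T @ Bm)  # [T*]_beta^alpha

# Naive conjugate transpose fails for generic bases:
print(np.allclose(T_ab.conj().T, Tstar_ba))  # False

# Gram matrices: C[i, j] = <alpha_j, alpha_i>, B[i, j] = <beta_j, beta_i>.
C = A.conj().T @ A
B = Bm.conj().T @ Bm
print(np.allclose(np.linalg.solve(C, T_ab.conj().T @ B), Tstar_ba))  # True

# With orthonormal bases (unitary basis matrices from QR), the naive
# conjugate transpose does agree, as the answer states:
Qa, _ = np.linalg.qr(A)
Qb, _ = np.linalg.qr(Bm)
T_on = np.linalg.solve(Qb, M @ Qa)
print(np.allclose(T_on.conj().T, np.linalg.solve(Qa, M.conj().T @ Qb)))  # True
```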