Adjoint transformation intuition

adjoint-operators, dual-spaces, inner-products, riesz-representation-theorem

I can't find the connection between the Riesz Representation Theorem, inner product spaces, and the adjoint transformation.
What I understand is that dual spaces enable us to define a transpose operator, but I can't understand why the adjoint operator is "adjoint", or how it connects to the conjugation in the inner product over the complex field.
How can I also develop an intuition for how the adjoint operator behaves on an inner product space over the field $\mathbb{C}$ or $\mathbb{R}$?
Can someone please help me connect the dots and develop an intuition about how the adjoint behaves (in the context of finite-dimensional spaces)?
Thank you.

Best Answer

First let's start with the transpose operator. The transpose can be defined without reference to an inner product and only requires the vector space structure. The way to think about the transpose is that it turns a linear map $f$ from $V$ to $W$ into a linear map $f^T$ from $W^*$ to $V^*$, and it does this in the most "natural" way possible.

In words: to define a map from $W^*$ to $V^*$, we start with a functional $g \in W^*$ and want to turn it into a functional that acts on vectors in $V$ and returns a scalar. We know that $g$ can act on vectors in $W$, and we know that $f$ turns vectors in $V$ into vectors in $W$, so chaining these together gives us a way to let $g$ act on vectors in $V$!

In symbols, what we just did above is define $f^T(g)(v) = g(f(v))$. We first use $f$ to map $v \in V$ to a vector in $W$, and then we use our element $g \in W^*$ to map that vector to a scalar.
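To make the formula $f^T(g)(v) = g(f(v))$ concrete, here is a minimal sketch in Python. The particular maps `f` and `g`, and the choice $V = \mathbb{R}^2$, $W = \mathbb{R}^3$, are assumptions made up for illustration; the only point is that the transpose is nothing more than "precompose with $f$".

```python
# Hypothetical illustration: V = R^2, W = R^3, vectors stored as tuples.

def f(v):
    """A linear map f: V -> W (chosen arbitrarily for this example)."""
    x, y = v
    return (x + 2 * y, 3 * x, -y)

def g(w):
    """A linear functional g in W*: takes a vector of W, returns a scalar."""
    a, b, c = w
    return 2 * a - b + 4 * c

def transpose(f):
    """Return f^T, which sends a functional g on W to the functional g∘f on V."""
    def f_T(g):
        return lambda v: g(f(v))
    return f_T

f_T = transpose(f)
pulled_back = f_T(g)      # an element of V*: it eats vectors of V

v = (1, 5)
result = pulled_back(v)   # same number as g(f(v))
print(result, g(f(v)))
```

Notice that no inner product appears anywhere: only function composition is used.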

Now the Hermitian adjoint operator is a way of turning a function $f: V \to W$ into a function $f^*: W \to V$. The best way of thinking about this operator is through the transpose. The transpose gives us an operator $f^T: W^* \to V^*$. To turn this into a map from $W$ to $V$, we need to fix isomorphisms between $W$ and $W^*$ and between $V$ and $V^*$. To map a vector $w$ from $W$ to $V$, you use the isomorphism between $W$ and $W^*$ to map $w$ to $w' \in W^*$, then you use the transpose to map it to an element $v' \in V^*$, and then you use the isomorphism between $V$ and $V^*$ to map it to a vector $v \in V$.

You could describe the above construction with the following diagram: $\require{AMScd}$ \begin{CD} W^* @>{f^T}>> V^*\\ @VV{\cong}V @VV{\cong}V\\ W @>{f^*}>> V \end{CD}

Now the connection with inner product spaces and the Riesz Representation Theorem is that an inner product on a vector space $V$ gives you a "natural" way to define an isomorphism between $V$ and $V^*$.

For every vector $v \in V$, we can define the map $\varphi_v: V \to K$ (where $K$ is the base field) by $\varphi_v(\cdot) = \langle \cdot,v \rangle$. By the linearity of the inner product in its first argument, $\varphi_v$ is an element of the dual space $V^*$. In addition, the map from $V$ to $V^*$ given by $v \mapsto \varphi_v$ is (anti-)linear, meaning $\varphi_{cv} = \overline{c}\,\varphi_v$, and by the non-degeneracy of the inner product it is injective. The Riesz Representation Theorem says that this injection is also a surjection and therefore a bijection. This means that whenever we have an inner product we can define $\varphi_v$ for every $v \in V$, and the map sending $v$ to $\varphi_v$ is a bijection between $V$ and $V^*$.
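As a small sanity check of the anti-linearity of $v \mapsto \varphi_v$, here is a sketch assuming $V = \mathbb{C}^2$ with the standard inner product, taken to be linear in the first slot and conjugate-linear in the second (the convention that makes $\varphi_v = \langle \cdot, v \rangle$ linear). The specific vectors and the scalar `c` are arbitrary.

```python
def inner(x, y):
    """Standard inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def phi(v):
    """The map v -> phi_v, where phi_v is the functional <., v>."""
    return lambda x: inner(x, v)

v = (1 + 2j, 3 - 1j)
x = (2j, 1 + 1j)
c = 2 - 3j

scaled_v = tuple(c * a for a in v)

lhs = phi(scaled_v)(x)              # phi_{c v}(x)
rhs = c.conjugate() * phi(v)(x)     # conj(c) * phi_v(x)
print(lhs, rhs)                     # the two values agree
```

Over $\mathbb{R}$ the conjugation does nothing, so the same map is simply linear.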

Now putting these two concepts together (the transpose and the connection between inner product spaces and dual spaces), we can define the transpose $f^T$ without the inner product structure as a map from $W^* \to V^*$. The inner product structure gives us isomorphisms between $V$ and $V^*$ and between $W$ and $W^*$, and chaining these isomorphisms together with the transpose we get a Hermitian adjoint map $f^*$ which maps $W$ to $V$.

EDIT: Added connection with $\mathbb{C}$.

Whenever we start writing linear transformations as matrices we first have to fix a basis. This also gives an alternate way of defining an isomorphism between $V$ and $V^*$ (search for "dual basis"), and it is a useful exercise to verify that, when the base field is $\mathbb{R}$, these two isomorphisms between $V$ and $V^*$ agree if and only if the basis that you started with is orthonormal. So let $v_1,\ldots,v_n$ be an orthonormal basis for $V$ and let $v^1,\ldots,v^n$ be its dual basis. Similarly, let $w_1,\ldots,w_m$ be an orthonormal basis for $W$ and let $w^1,\ldots,w^m$ be its dual basis.

Now if we let the matrix representation of $f$ be

$$\begin{bmatrix} f_{11} & \ldots & f_{1n} \\ \vdots & \ddots & \vdots \\ f_{m1} & \ldots & f_{mn} \end{bmatrix}$$

then we have $f(v_i) = \sum_j f_{ji} w_j$. Now if we consider the transpose we have

$$f^T(w^i)(v_k) = w^i(f(v_k)) = w^i\left( \sum_j f_{jk} w_j \right) = f_{ik} = \left( \sum_j f_{ij} v^j \right)(v_k)$$

so

$$f^T(w^i) = \sum_j f_{ij} v^j$$

so the matrix representation of $f^T$ is exactly the traditional transpose of the matrix representation of $f$.
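The index bookkeeping above can be checked mechanically. The following is a rough sketch, assuming an arbitrary $3 \times 2$ real matrix `F` for $f$ and representing both vectors and functionals by their coordinates in the chosen basis and dual basis; the helper names are made up for this example.

```python
# Hypothetical example: f maps V (dimension n = 2) to W (dimension m = 3).
F = [[1, 2],
     [3, 4],
     [5, 6]]

def apply(M, x):
    """Multiply the matrix M by the coordinate vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def pair(g, x):
    """Evaluate a functional (dual-basis coordinates g) on a vector (coordinates x)."""
    return sum(a * b for a, b in zip(g, x))

def unit(size, index):
    """Coordinates of the index-th basis vector (or dual basis functional)."""
    return [1 if r == index else 0 for r in range(size)]

m, n = len(F), len(F[0])
F_T = [list(col) for col in zip(*F)]   # ordinary matrix transpose, n x m

checks = []
for i in range(m):
    for k in range(n):
        w_i, v_k = unit(m, i), unit(n, k)
        via_transpose = pair(apply(F_T, w_i), v_k)   # f^T(w^i)(v_k)
        via_definition = pair(w_i, apply(F, v_k))    # w^i(f(v_k))
        checks.append(via_transpose == via_definition == F[i][k])

print(all(checks))
```

Every pair $(i, k)$ gives the same number $f_{ik}$ both ways, which is exactly the statement that the matrix of $f^T$ in the dual bases is the transposed matrix.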

Now let us consider the adjoint in the case of $K = \mathbb{C}$. In this case we have that the map from $v$ to $\varphi_v$ is anti-linear.

Let $w = \sum_i c_i w_i$. Then $\varphi_w = \sum_i \overline{c_i} w^i$. Now if we apply $f^T$ to this we get

$$f^T(\varphi_w) = f^T\left(\sum_i \overline{c_i} w^i\right) = \sum_j \left(\sum_i \overline{c_i}f_{ij}\right) v^j$$

Finally, when we apply the anti-linear isomorphism between $V$ and $V^*$ we get that

$$ f^*(w) = \sum_j \left(\sum_i c_i\overline{f_{ij}}\right) v_j $$

so the matrix representation of $f^*$ is

$$\begin{bmatrix} \overline{f_{11}} & \ldots & \overline{f_{m1}} \\ \vdots & \ddots & \vdots \\ \overline{f_{1n}} & \ldots & \overline{f_{mn}} \end{bmatrix}$$

which is the conjugate transpose. The basic intuition is that the isomorphism between $W$ and $W^*$ conjugates all of the coefficients of $w$; the result is fed through the transposed matrix $f^T$; and then all of the coefficients are conjugated once again, which flips the coefficients of $w$ back and conjugates the entries of $f^T$.
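The "conjugate, transpose, conjugate again" picture can be tried out numerically. This is only a sketch under the assumption that orthonormal bases have been fixed, so that both Riesz isomorphisms act on coordinates by complex conjugation; the matrix `F` and the vector `w` are arbitrary choices.

```python
# Hypothetical example: f maps C^2 to C^3, written in orthonormal bases.
F = [[1 + 1j, 2],
     [3j, 4 - 2j],
     [5, 6 + 1j]]

def apply(M, x):
    """Multiply the matrix M by the coordinate vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def conj_vec(x):
    """Conjugate every coordinate (the Riesz map in an orthonormal basis)."""
    return [a.conjugate() for a in x]

F_T = [list(col) for col in zip(*F)]                  # transpose
F_star = [[a.conjugate() for a in row] for row in F_T]  # conjugate transpose

w = [1 - 1j, 2j, 3]

# Route 1: W -> W* (conjugate), W* -> V* (transpose), V* -> V (conjugate).
three_steps = conj_vec(apply(F_T, conj_vec(w)))

# Route 2: apply the conjugate transpose directly.
direct = apply(F_star, w)

print(three_steps)
print(direct)
```

The two routes give the same coordinates, because conjugating the output of $F^T \overline{w}$ conjugates the entries of $F^T$ and undoes the conjugation on $w$.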

EDIT 2: For completeness, I figured I would add why the adjoint as I defined it above is equivalent to the traditional definition of the adjoint as the unique operator $f^*$ such that for all $v,w$ we have

$$\langle f(v), w \rangle = \langle v, f^*(w) \rangle.$$

If you look at the commutative diagram above, we will start with an element $w \in W$ in the bottom left and trace it using the two paths to the top right. First, we can go right to $f^*(w) \in V$. Then we can use the isomorphism between $V$ and $V^*$ to go up to $\varphi_{f^*(w)}$.

Alternatively, we could first go up using the isomorphism between $W$ and $W^*$ to get to $\varphi_w$ and then move right to get to $f^T(\varphi_w)$ so we have that

$$ f^T(\varphi_w) = \varphi_{f^*(w)} $$

as elements of $V^*$. Now evaluating both sides on $v \in V$ gives

$$ \langle f(v), w \rangle = \varphi_w(f(v)) = f^T(\varphi_w)(v) = \varphi_{f^*(w)}(v) = \langle v, f^*(w) \rangle $$

as desired.
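For a quick numerical check of the defining identity $\langle f(v), w \rangle = \langle v, f^*(w) \rangle$, here is a sketch assuming the standard inner product on $\mathbb{C}^2$ (linear in the first slot) and an arbitrary $2 \times 2$ matrix `F`. It also shows that the plain transpose, without conjugation, does not satisfy the identity for this example.

```python
def inner(x, y):
    """Standard inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(M, x):
    """Multiply the matrix M by the coordinate vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]

# Hypothetical example data.
F = [[1j, 2],
     [0, 1 + 1j]]
v = [1, 1j]
w = [1, 1]

F_T = [list(col) for col in zip(*F)]                    # transpose only
F_star = [[a.conjugate() for a in row] for row in F_T]  # conjugate transpose

lhs = inner(apply(F, v), w)         # <f(v), w>
rhs = inner(v, apply(F_star, w))    # <v, f*(w)>
wrong = inner(v, apply(F_T, w))     # <v, f^T(w)>, conjugation forgotten

print(lhs, rhs, wrong)
```

Here `lhs` and `rhs` agree while `wrong` differs, which is the conjugation in the complex inner product showing up in the matrix of the adjoint.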