Differential Geometry – Is the Matrix $[g_{\mu\nu}]$ of the Metric Tensor a Linear Operator?

coordinate-systems, differential-geometry, linear-algebra, metric-tensor, operators

Starting in an arbitrary coordinate system with basis vectors $\textbf{e}_\mu$ and metric components $g_{\mu\nu}$, we can always diagonalize the matrix $[g_{\mu\nu}]$ so that the components of the metric are those of the canonical form $\eta_{\mu\nu}=\text{diag}(-1,\dots,-1,+1,\dots,+1,0,\dots,0)$.
This is similar to a coordinate transformation where the new basis vectors $\textbf{e}'_\mu$ satisfy $\textbf{g}(\textbf{e}'_\mu,\textbf{e}'_\nu)=\eta_{\mu\nu}$ ($\textbf{g}$ is the metric tensor).
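
For concreteness, here is a minimal numerical sketch of this diagonalization (assuming NumPy; the matrix `G` below is an arbitrary symmetric example with one negative and one positive eigenvalue, i.e. a non-degenerate metric):

```python
import numpy as np

# An arbitrary symmetric "metric" matrix with one negative
# and one positive eigenvalue, chosen purely as an example.
G = np.array([[-2.0, 0.5],
              [ 0.5, 3.0]])

# Orthogonal eigendecomposition: G = Q @ diag(lam) @ Q.T
lam, Q = np.linalg.eigh(G)

# Rescale each eigenvector by 1/sqrt(|lambda|); the rescaled columns
# of P are the new basis vectors e'_mu. (All eigenvalues are nonzero
# here; a degenerate metric would need its zero eigenvalues skipped.)
P = Q / np.sqrt(np.abs(lam))

# Metric components in the new basis: g(e'_mu, e'_nu) = P.T @ G @ P,
# which is diag(sign(lam)) -- the canonical form eta.
print(np.round(P.T @ G @ P, 10))   # diag(-1, +1)
```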

In linear algebra, I learned that diagonalization of matrices to find orthogonal basis vectors is only valid for square matrices that represent linear operators. However, the metric tensor is not a linear operator, as it acts on two arguments instead of one. Why can we still use the diagonalization process to find orthogonal basis vectors?

Best Answer

As you may know, the metric tensor is a symmetric bilinear form. It accepts two vectors from a vector space $V$ and gives back a real number in $\mathbb R$. It is linear in both arguments, hence 'bilinear'. The metric tensor can be interpreted as a linear operator in the sense that it maps one of its arguments (either one; it doesn't matter because $g$ is symmetric) to a dual vector in $V^*$. This dual vector is a linear functional on $V$ (the traditional definition of the dual space of $V$), which acts on the second vector to give a scalar. So $g$ is a linear map from $V$ to $V^*$. When you write $g$ as a matrix and operate on a column vector $v$, transpose the resulting vector to make it a row vector and you have the dual vector $v^*$.
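
A quick numerical check of this (a sketch assuming NumPy; the matrix $g$ and the vectors $v$, $w$ are arbitrary examples):

```python
import numpy as np

# An arbitrary symmetric metric matrix and two arbitrary vectors.
g = np.array([[2.0, 1.0],
              [1.0, 3.0]])
v = np.array([1.0, 2.0])
w = np.array([4.0, -1.0])

# Feeding g one argument gives the dual vector v* (for 1-D NumPy
# arrays the transpose to a row vector is implicit).
v_star = g @ v

# Acting with v* on the second argument reproduces the bilinear form.
print(v_star @ w, v @ g @ w)   # same number twice: g(v, w)
```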

From a general point of view, the metric tensor is a rank 2 tensor, specifically a rank $(0,2)$ tensor. In general, a rank $(n,m)$ tensor is a multilinear functional which acts on an ordered collection of vectors in $V$ and dual vectors in the dual space $V^*$. For a vector space $V$ over a field $\mathbb F$ (usually $\mathbb R$ or $\mathbb C$), a tensor $T$ is a multilinear map of the form

$$ T : V^m \times V^{*n} \rightarrow \mathbb F .$$

Rank $(0,2)$ tensors over the real numbers, like $g_{\mu \nu}$,

$$ g : V \times V \rightarrow \mathbb R$$

are particularly interesting as they often appear in mathematics and physics. This is because they define inner products. The inner product between two vectors $\begin{pmatrix}a_1\\a_2\end{pmatrix}$ and $\begin{pmatrix}b_1\\b_2\end{pmatrix}$ in an inner product space $V$ is

$$\begin{pmatrix}a_1\\a_2\end{pmatrix} \cdot \begin{pmatrix}b_1\\b_2\end{pmatrix} = \begin{pmatrix}a_1&a_2\end{pmatrix} \begin{pmatrix}A_{11}&A_{12}\\A_{21}&A_{22}\end{pmatrix} \begin{pmatrix}b_1\\b_2\end{pmatrix}$$

where $\mathbf A$ is a symmetric positive-definite matrix (symmetric with positive real eigenvalues). By convention we normally write vectors in $V$ in an orthonormal basis, i.e. a basis that diagonalises $\mathbf A$ to the identity matrix. Because of this choice of basis we usually omit $\mathbf A$ entirely when taking inner products: $$\begin{pmatrix}a_1\\a_2\end{pmatrix} \cdot \begin{pmatrix}b_1\\b_2\end{pmatrix} = \begin{pmatrix}a_1&a_2\end{pmatrix} \begin{pmatrix}b_1\\b_2\end{pmatrix}$$ when the vectors are written in an orthonormal basis.
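
To make that concrete, here is a minimal sketch (assuming NumPy; $\mathbf A$, $\mathbf a$, $\mathbf b$ are arbitrary examples, and a Cholesky factorisation is just one convenient way to construct such an orthonormal basis):

```python
import numpy as np

# Arbitrary symmetric positive-definite A and two arbitrary vectors.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
a = np.array([1.0, 2.0])
b = np.array([4.0, -1.0])

# Factor A = L @ L.T; the coordinate change x -> L.T @ x expresses
# the vectors in a basis that is orthonormal for this inner product.
L = np.linalg.cholesky(A)
a_new, b_new = L.T @ a, L.T @ b

# In the new coordinates A is the identity, so the inner product
# reduces to the plain dot product.
print(a_new @ b_new, a @ A @ b)   # same number twice
```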

These inner products $\mathbf A$ are basically the same thing as metric tensors $g$: two terms for one concept. Of course, in pseudo-Riemannian geometry $\mathbf A$/$g$ need not be positive-definite. It is clear how $\mathbf A$ should be interpreted as a linear operator though, right? It maps the vector $\mathbf b$ to its dual vector $\mathbf b^*$ like so: $$\mathbf b^* (\mathbf a) = \mathbf a \cdot \mathbf b \tag{definition of dual vector space $V^*$}$$ $$\begin{align}\mathbf a \cdot \mathbf b &= \begin{pmatrix}a_1&a_2\end{pmatrix} \begin{pmatrix}A_{11}&A_{12}\\A_{21}&A_{22}\end{pmatrix} \begin{pmatrix}b_1\\b_2\end{pmatrix} \\ &= \left[ \begin{pmatrix}A_{11}&A_{12}\\A_{21}&A_{22}\end{pmatrix} \begin{pmatrix}b_1\\b_2\end{pmatrix} \right]^{\mathrm T} \begin{pmatrix}a_1\\a_2\end{pmatrix} \\ &\Rightarrow \quad \mathbf b^* = \left[ \begin{pmatrix}A_{11}&A_{12}\\A_{21}&A_{22}\end{pmatrix} \begin{pmatrix}b_1\\b_2\end{pmatrix} \right]^{\mathrm T}. \end{align}$$ Having to take the transpose makes this a little confusing, but it should be clear that $\mathbf A$ defines a dual vector $\mathbf b^*$ for each vector $\mathbf b$.

The concept of a metric tensor is basically the same thing, but with a different notation. Earlier I could not say that $\mathbf A$ itself maps a vector to its dual, only that it defines such a map, because I had to use the transpose operation. Matrix notation is designed to express vectors in $V$ and linear operators $M : V \rightarrow V$, and it is not flexible enough to express a linear map $V \rightarrow V^*$.

The notation used for metric tensors (upper/lower index notation, usually just called index notation or tensor notation) is more flexible. The metric is denoted $g$, and by writing it with two lower indices as $g_{\mu \nu}$ we are designating it as a rank $(0,2)$ tensor that maps $V \times V \rightarrow \mathbb R$. By giving $g_{\mu \nu}$ just one argument and leaving the other empty, we are left with a map $V \rightarrow \mathbb R$, which is the same thing as a dual vector in $V^*$. We write vectors by their components, $x^\mu$, and then $g$ defines a linear map $g : V \rightarrow V^*$ like so: $$g : \mathbf x \mapsto \mathbf x^*, \quad x_\mu = \sum_{\nu} g_{\mu \nu} x^{\nu}.$$ The notation $x^\mu$ expresses the components of the vector $\mathbf x$ in the chosen basis of $V$, and $x_\mu$ expresses the components of the dual vector $\mathbf x^*$ in the dual basis, i.e. the corresponding basis in $V^*$. The notation is frequently abused for brevity, so you may see expressions like $$g : x^\mu \mapsto g_{\mu \nu} x^\nu \tag{implied summation over $\nu$}$$ to mean the same thing as I said above.
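
For example, lowering an index with the Minkowski metric (a sketch assuming NumPy; the vector components are arbitrary):

```python
import numpy as np

# Minkowski metric eta_{mu nu} = diag(-1, 1, 1, 1) and an arbitrary
# vector with contravariant components x^mu.
g = np.diag([-1.0, 1.0, 1.0, 1.0])
x_up = np.array([5.0, 1.0, 2.0, 3.0])

# Lowering the index: x_mu = sum_nu g_{mu nu} x^nu, written as an
# explicit contraction over nu.
x_down = np.einsum('mn,n->m', g, x_up)
print(x_down)   # [-5.  1.  2.  3.] -- the time component flips sign
```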

You may notice the similarity with matrix multiplication: $$(b^*)_\mu = \sum_{\nu} A_{\mu \nu} b_\nu.$$ When $g$ is expressed as a matrix, as in your question, it simply maps the components $x^\mu$ of a vector to the components $x_\mu$ of its dual vector. It very much is a linear map, $g : V \rightarrow V^*$, and all the associated tools of linear algebra, including diagonalization, may be applied.