[Physics] Vectors as functions

covariance, differential-geometry, general-relativity, tensor-calculus, vectors

In my study of general relativity, I came across tensors.
At first, I learned that vectors (and covectors and tensors) are objects whose components transform in a certain way, so that the underlying object is one and the same even though observers in different reference frames describe it with different components.
This is all very intuitive.
However, I have read that the modern way to learn these concepts is to think of vectors as multilinear functions of covectors (and similarly of tensors as functions of vectors and covectors, and of covectors as functions of vectors). The problem is that this is not intuitive to me, and I don't understand why this viewpoint has been adopted.
What is the advantage of thinking of vectors as functions over thinking of them as objects that have independent existence but whose components transform in a particular way?
What is the right mindset for understanding the mathematical concepts of general relativity?

Best Answer

However, I have read that the modern way to learn these concepts is to think of vectors as multilinear functions of covectors

This is actually not quite true, though the distinction is subtle.

In the perspective you describe, one starts with the vector space $V$ as the fundamental structure. It doesn't matter how you construct it - it could be the tangent space to a manifold (as in GR), or a complex Hilbert space (as in QM), or the space of polynomials in some variable.

Covectors are then elements of the algebraic dual space $V^*$, consisting of linear maps from $V$ to the underlying field $\mathbb K$ (usually $\mathbb R$ or $\mathbb C$). $(p,q)$-tensors are multilinear maps which eat $p$ covectors and $q$ vectors and spit out a $\mathbb K$-number.
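
As a concrete sketch in components (picking some basis $\{\hat e_\mu\}$ of $V$ and writing $X = X^\mu \hat e_\mu$ and $\omega_\mu := \omega(\hat e_\mu)$ - notation I'm introducing here for illustration), a covector eating a vector is just the familiar index contraction:

$$\omega(X) = \omega\big(X^\mu \hat e_\mu\big) = X^\mu\,\omega(\hat e_\mu) = \omega_\mu X^\mu$$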

Linear maps $V^*\rightarrow \mathbb K$ are not elements of $V$, but rather elements of $V^{**}$, the algebraic dual of $V^*$. However, given any vector $X\in V$, we can define a map $f_X\in V^{**}$ to be the one which eats a covector $\omega$ and spits out $$f_X(\omega) := \omega(X)$$

This association $X\mapsto f_X$ between $V$ and $V^{**}$ is one-to-one, and so natural and obvious that we tend to think of elements of $V$ as simply being elements of $V^{**}$. That is what is meant by the statement that vectors can be thought of as functions of covectors - in reality, a vector $X$ is not a function of a covector, but rather can be uniquely associated to a function of a covector $f_X$ in a very natural way.
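
(A quick sketch of why it is one-to-one, using the standard fact that covectors separate the points of $V$: if $f_X = f_Y$, then every covector annihilates $X-Y$, and hence $X = Y$:

$$f_X = f_Y \;\Longrightarrow\; \omega(X-Y) = 0 \ \text{ for all } \omega\in V^* \;\Longrightarrow\; X = Y.)$$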

For finite-dimensional $V$, this association is also surjective, so there is a one-to-one pairing between elements of $V$ and elements of $V^{**}$, which makes it even more reasonable to say that elements of $V$ simply are elements of $V^{**}$. In infinite-dimensional spaces, such as those encountered in QM, this isn't true.


In saying this, my intention is not merely to engage in pointless, self-indulgent mathematical technicality.

What is the advantage of thinking of vectors as functions over thinking of them as objects that have independent existence but whose components transform in a particular way?

You can certainly think of vectors that way. The key point is that they have a basis-independent existence, which can be captured by (i) endowing their components with the right transformation properties, or (ii) talking about them with no reference to a particular basis at all. In most circumstances, I prefer the latter approach when possible, but that's more a personal preference than an indictment of the former.

It's also worth noting that thinking of tensors as objects which eat covectors and vectors and spit out numbers is nothing you aren't already intuitively familiar with, in the sense that you know that contracting all the indices in an expression yields a scalar.
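
For example (a small sketch, using a dual basis $\{\hat e^\mu\}$ defined by $\hat e^\mu(\hat e_\nu) = \delta^\mu_{\ \nu}$, with the component notation $T^\mu_{\ \ \nu} := T(\hat e^\mu,\hat e_\nu)$, $\omega_\mu := \omega(\hat e_\mu)$ and $X^\nu$ the components of $X$ - none of which appears in the question itself), a fully contracted expression is just a tensor evaluated on its arguments:

$$T^\mu_{\ \ \nu}\,\omega_\mu X^\nu = \omega_\mu X^\nu\, T\big(\hat e^\mu,\hat e_\nu\big) = T\big(\omega_\mu \hat e^\mu,\ X^\nu \hat e_\nu\big) = T(\omega,X)$$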

To me, it also makes understanding the transformation properties of tensor components far cleaner. For example, the metric tensor $g$ is a map which eats two vectors and spits out a number. If I pick some basis $\{\hat e_\mu\}$ and plug $\hat e_\mu$ and $\hat e_\nu$ into the slots of $g$, I get $g(\hat e_\mu,\hat e_\nu) \equiv g_{\mu\nu}$.

That's what the components $g_{\mu\nu}$ are - they are the result of feeding elements of a chosen basis to the tensor $g$. Not only does this make it obvious that the components of a tensor are basis-dependent, but it makes it clear what to do when changing basis from $\hat e_\mu \mapsto \hat \epsilon_\mu = R^\nu_{\ \ \mu} \hat e_\nu$:

$$g'_{\mu\nu} \equiv g\big(\hat \epsilon_\mu,\hat \epsilon_\nu\big) = g\big(R^\alpha_{\ \ \mu}\hat e_\alpha,R^\beta_{\ \ \nu} \hat e_\beta\big) = R^\alpha_{\ \ \mu} R^\beta_{\ \ \nu} g(\hat e_\alpha,\hat e_\beta) \equiv R^\alpha_{\ \ \mu}R^\beta_{\ \ \nu}g_{\alpha\beta}$$

where we freely used the multilinearity of $g$.
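
The same logic gives, for instance, the transformation of covector components under the same change of basis (a short sketch, writing $\omega_\mu := \omega(\hat e_\mu)$ as above):

$$\omega'_\mu \equiv \omega\big(\hat \epsilon_\mu\big) = \omega\big(R^\nu_{\ \ \mu}\hat e_\nu\big) = R^\nu_{\ \ \mu}\,\omega(\hat e_\nu) \equiv R^\nu_{\ \ \mu}\,\omega_\nu$$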

This may all be perfectly obvious and natural to you if you have a lot of experience manipulating components, but for me it provides a very clean and natural understanding of the objects I'm manipulating.