Checking the Understanding of and Motivation behind Tensors

dual-spaces, tensor-products, tensors, vector-spaces

I'm trying to relate different definitions of tensors and check my understanding. Feel free to skim the background, but if something's even slightly wrong it may be the source of my confusion (as it has been many times in the past). I added some prompts too, in case I want to use this as a refresher for myself later or to teach. Any input is welcome anywhere, since I'm teaching this to myself. Thanks in advance!

Note: If there's nothing wrong with this I'll have wasted my time, but I highly doubt that, since much of this is me guessing. I know this information can be found outside of MSE, but I'm looking for subtle things: whether I'm using words or phrases incorrectly, adding meaning and motivation where there is none, ignoring important details, or failing to tie things together correctly. I'm struggling to find that elsewhere.

Background:

  • A vector is defined as an element of a vector space, a set closed under linear combinations. That is, adding two vectors in the space gives you another vector in the space, and multiplying a vector by a scalar (an element of the field the space is defined over) gives you another element of the vector space. This is a more abstract definition, so matrices can be vectors if they're part of a vector space, and if you want them to be, colors, other objects, and even functions can be vectors (think about why; a small check appears after this list).
  • Linearity is highly important: differentiable functions are locally well approximated by linear maps, and linear functions are fundamental since they're among the simplest, representing a "copy (multiply by a scalar, $a$) and paste (add to something else)". As a result, we should try to make every function we define from here on linear (or multilinear, which will be explained in a moment). A function is linear ($f_1$ is the type of notation I'll use) if $f_1(av)=af_1(v)$ and $f_1(v+w)=f_1(v)+f_1(w)$, and bilinear if it's a multivariable function linear in each variable separately (when the others are held fixed), so $f_2(ax,y)=f_2(x,ay)=af_2(x,y)$ and, in each slot, $f_2(u+v,w)=f_2(u,w)+f_2(v,w)$ (and likewise in the second slot). A function can also be trilinear ($f_3$), and so on, up to multilinear ($f_n$) in general. Effectively, when all but one of the variables are held fixed, a multilinear map acts linearly on the remaining argument (a worked example follows this list).
  • When you choose a basis for your vector space, you can sensibly represent a vector using an array of elements called a matrix. $n\times1$ (column vectors), $1\times m$ (row vectors), and $n\times m$ matrices are a few types. Keep in mind there are other ways to represent vectors, though. Any number of different matrices could represent the same vector; for instance, the same $3\times1$ column vector could be represented as a $2\times2$ matrix with one slot set to $0$ and the elements rearranged. However, you usually choose the representation that makes computations appear as they would in basic linear algebra with geometric vectors.
  • Multiplication of $b$ by $a$ can loosely be interpreted as $a$ acting as a function on $b$. As a result, depending on your rules and on the nature of the vectors themselves as abstract objects, there will be different multiplication rules for your representation.
  • A mapping can also show this: we name a function $f$, and through some kind of function composition (or multiplication), $f$ takes elements of the domain $X$ to the codomain $Y$, denoted $f:X\to Y$. For instance, $\operatorname{sq}:\mathbb{R}\to\mathbb{R}$ with $\operatorname{sq}(x)=x^2$. Since $f$ is a name, it just has to encode information about the mapping. To clarify: through the multiplication or a function rule, that information assigns an output to each input.
  • Vector spaces imply the existence of elements of a field along with them, and without creating anything else, that's all we have defined and can work with. We can define linear maps out of the vector space, either to the vector space itself (these maps must exist in order for it to have been called a vector space) or to the field (which can be done with a second, or dual, vector space). The dual space is usually denoted $V^*$, and is itself a vector space. Elements of the dual space are called covectors (a concrete covector is written out after this list).
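
As a quick check of the first bullet (my own toy example): real-valued functions on $\mathbb{R}$ form a vector space if we define addition and scaling pointwise, $(f+g)(x)=f(x)+g(x)$ and $(af)(x)=a\,f(x)$, since the result of either operation is again a real-valued function on $\mathbb{R}$.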
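
For the second bullet, here is a worked bilinearity check (the choice of $f_2$ is just mine for illustration): on $\mathbb{R}^2$, the dot product $f_2(x,y)=x_1y_1+x_2y_2$ satisfies
$$f_2(ax,y)=ax_1y_1+ax_2y_2=af_2(x,y),\qquad f_2(u+v,w)=f_2(u,w)+f_2(v,w),$$
and similarly in the second slot, so it is bilinear; but it is not linear as a function of the pair, since scaling both arguments gives $f_2(ax,ay)=a^2f_2(x,y)$ rather than $af_2(x,y)$.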
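
For the last bullet, a concrete covector (again my own example): on $V=\mathbb{R}^2$ over $\mathbb{R}$, the map $\varphi(v)=3v_1-v_2$ is linear from $V$ to the field, so $\varphi\in V^*$. In fact every element of $V^*$ has the form $v\mapsto av_1+bv_2$ for scalars $a,b$, which is one way to see that $V^*$ is itself a vector space (of the same dimension as $V$ when $V$ is finite dimensional).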

Tensors:

  • As the name suggests, tensors "tense" objects (vector spaces, but more generally modules) into other vector spaces (in a way that treats each space linearly, of course). Thus, tensors are multilinear maps. Additionally, since the linear transformations from the tensor on each vector can be encoded as a vector, tensors should also be able to be vectors, which means they have to be able to be part of a vector space.
  • A multilinear transformation contains multiple sets of linear transformation information, each of which can be considered a vector. These vectors are used to construct a tensor, using a combining method (a product, which creates tensors, hence the tensor product). Note: to fully reveal the nature of a tensor, it's often better to keep these vectors as just vectors, and not other tensors (of higher rank), although you could just as well do that too. But tensors are objects in and of themselves, and don't have to just act as maps; they can be mapped. Now suppose we want to use another function to transform the linear information – it'll have to act multilinearly. However, the point of the tensor product was to create one new space (and, while I'm at it, a tensor product space or TPS), so it wouldn't make sense to talk about a multilinear map, which takes multiple inputs, but rather only linear ones. Thus, the defining (universal) property the tensor product must satisfy is that for each multilinear map out of a set of vector spaces there's a corresponding (unique) linear map out of the tensor product (this is written out just after the list). This suggests thinking of the tensor product of vector spaces, in general terms, as the output of some arbitrary multilinear function.
  • A tensor is defined as the tensor product of any number of vector spaces. The rank is the number of vector spaces involved. A tensor of type/valence $(r,s)$ consists of $r$ vector spaces and $s$ dual vector spaces.
  • Although there are other ways to represent tensors of each rank as arrays, to make them easier to work with in common linear algebra, a rank-zero tensor can be realized as a scalar, a rank-one tensor as a 1D array, rank two as a 2D array, rank three as 3D, and so on.
  • Any tensor could map anything to anything else, but there's always a multiplication rule that lets a type $(r,s)$ tensor act as a map from $r$ covectors (elements of the dual space) and $s$ vectors to an element of their field, by pairing each covector with a vector and taking the product of each pair (spelled out in the example after this list).
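
Since I keep leaning on it, here is the universal property written out for two factors (notation mine): the tensor product of vector spaces $V$ and $W$ over a field $F$ is a space $V\otimes W$ together with a bilinear map $(v,w)\mapsto v\otimes w$ such that for every bilinear map $f_2:V\times W\to U$ there is a unique linear map $\tilde f:V\otimes W\to U$ with
$$\tilde f(v\otimes w)=f_2(v,w)\quad\text{for all }v\in V,\ w\in W.$$
The same statement with more factors handles multilinear maps out of several spaces at once.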
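
And here is the pairing from the last bullet spelled out in the smallest interesting case (my own example): a type $(1,1)$ tensor of the form $T=v\otimes\varphi$, with $v\in V$ and $\varphi\in V^*$, takes one covector $\psi\in V^*$ and one vector $w\in V$ to the field element
$$T(\psi,w)=\psi(v)\,\varphi(w).$$
A general type $(1,1)$ tensor is a sum of such terms, and the rule extends by linearity.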

I'm especially interested in the last bullet point under tensors, as the people I've asked who define tensors as this mapping far outnumber those who define them only in terms of the universal property of the tensor product (which makes much more sense to me). People always talk about the mapping definition like it's the most pure and general one, which I sort of dislike, even though admittedly it's an easy way to think about tensors.

Best Answer

Regarding the Background:

The first two bullets are fine. The "copy paste" metaphor is interesting.

Bullet 3: I'm not quite sure what you're getting at with this excerpt

Any number of different matrices could represent the same vector; for instance, the same $3\times1$ column vector could be represented as a $2\times2$ matrix with one slot set to $0$ and the elements rearranged. However, you usually choose the representation that makes computations appear as they would in basic linear algebra with geometric vectors.

but of course the gist, i.e. that row/column vectors and matrices can be used to flexibly represent vector spaces, is correct.

Bullet 4: Really not sure what you're trying to get at here. I'm not sure how to interpret the sentence "multiplication of $b$ by $a$ can loosely be interpreted as $a$ acting as a function on $b$".

Bullet 5: Not sure what this is supposed to say. I think you're just explaining the notation $f:X \to Y$, but your wording and choices of notation are awkward.

Bullet 6: Mostly correct, but subtly wrong. The phrasing of "vector spaces imply the existence of elements of a field along with them" bothers me. It's not clear what you mean by "these maps must exist in order for it to have been called a vector space". In the end, it seems like you're trying to say something like "The dual space $V^*$ of $V$ is the set of linear maps from $V$ to its underlying field. The elements of a dual space are called covectors". I think you have the right idea, it's just not very readable right now.

Regarding Tensors:

Bullet 1: I haven't heard "tense" used as a verb in this sense. The sentence "Additionally, since the linear transformations from the tensor on each vector can be encoded as a vector, tensors should also be able to be vectors, which means they have to be able to be part of a vector space" is unclear.

Bullet 2: "A multilinear transformation contains multiple sets of linear transformation information, each of which can be considered a vector": Not clear what "contain" means here. I'm really not sure what exactly you're trying to convey in the rest of this paragraph.

Bullet 3: A tensor is defined as an element of the tensor product of any number of vector spaces (not as the tensor product itself). Otherwise fine.
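
To expand on that correction with a small example (notation mine): in $V\otimes W$, the element $v_1\otimes w_1+v_2\otimes w_2$ is a tensor even though it generally cannot be rewritten as a single product $v\otimes w$; "tensor" refers to any element of the tensor product space, not to the space itself and not only to the pure products.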

Your last two bullets are fine.


What makes all of this really confusing is that in some contexts it is convenient to think of tensors as multilinear maps, while in other contexts it is convenient to think of tensors as elements of the fancy vector space that we call the "tensor product" of the input spaces. It is common in the exposition of the relevant fields to completely ignore the alternate point of view.
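
To make the two viewpoints concrete (notation mine, and restricting to finite-dimensional spaces): a bilinear map $f_2:V\times W\to F$ is the same data as a linear functional on $V\otimes W$, i.e. an element of $(V\otimes W)^*\cong V^*\otimes W^*$, so "multilinear map on $V\times W$" and "element of a tensor product of (dual) spaces" are two descriptions of the same object.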

I have found that in differential geometry, the multilinear map point of view is more common. I think that the "multidimensional array" point of view is most directly connected to this multilinear map definition of a tensor product.
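
For instance (a standard identification, sketched with my own indices): if $\dim V=n$ and $\dim W=m$ with bases $e_i$ and $e'_j$, a bilinear map $f_2:V\times W\to F$ is completely determined by the $n\times m$ array of numbers $T_{ij}=f_2(e_i,e'_j)$, since $f_2(x,y)=\sum_{i,j}T_{ij}\,x_i y_j$ in coordinates; a trilinear map would give a three-dimensional array, and so on.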

The advantage of the more abstract definition via tensor products of spaces is that all of the maps that we care about are simply linear maps (or in the greater algebraic context, module homomorphisms).
