The other answers have already given an "abstract" answer, so I will just make sure you understand what all this means for the basic case $V =\mathbb{R}^n$ (and in finite-dimensional linear algebra, that is really all there is anyway!).
Let $L: \mathbb{R}^n \to \mathbb{R}^n$ be a linear map. Let $M$ be the matrix of $L$ with respect to the standard basis. This matrix can act on a column vector by multiplication on the left, $v \mapsto Mv = L(v)$, or it can act on row vectors by multiplication on the right, $w \mapsto wM$. (We can convert a row vector into a column vector, or vice versa, by transposing.) The action on row vectors, $w \mapsto wM$, is called the adjoint of $L$.
Row vectors represent linear maps $\mathbb{R}^n \to \mathbb{R}$, and so really represent elements of the dual space $V^*$. So the adjoint is really a map $L^*: V^* \to V^*$.
We get a bilinear map $V^* \times V \to \mathbb{R}$ by the rule $(w,v) \mapsto w(L(v)) = wMv$. In other words, the bilinear map associated to $L$ is given by just taking a row vector and a column vector, and sandwiching the matrix of $L$ in between them.
To be perfectly explicit about this, if $L:\mathbb{R}^2 \to \mathbb{R}^2$ has the matrix
$
\begin{bmatrix}
a_{11} &a_{21}\\a_{12}&a_{22}
\end{bmatrix}
$ then $$L\left(\begin{bmatrix} x_1\\ x_2\end{bmatrix}\right) = \begin{bmatrix}
a_{11} &a_{21}\\a_{12}&a_{22}
\end{bmatrix}\begin{bmatrix} x_1\\ x_2\end{bmatrix}$$
and
$$
L^*\left(\begin{bmatrix} y_1& y_2\end{bmatrix}\right) = \begin{bmatrix} y_1& y_2\end{bmatrix} \begin{bmatrix}
a_{11} &a_{21}\\a_{12}&a_{22}
\end{bmatrix}
$$
The bilinear map $B$ is given by
$$
B\left(\begin{bmatrix} y_1& y_2\end{bmatrix},\begin{bmatrix} x_1\\ x_2\end{bmatrix}\right) = \begin{bmatrix} y_1& y_2\end{bmatrix} \begin{bmatrix}
a_{11} &a_{21}\\a_{12}&a_{22}
\end{bmatrix}\begin{bmatrix} x_1\\ x_2\end{bmatrix}
$$
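If you want to check this concretely on a machine, here is a quick numpy sketch with made-up entries: applying $L^*$ to a row vector $y$ and then pairing with $x$ agrees with pairing $y$ with $L(x)$, and both equal the sandwich $yMx$; all of this is just associativity of matrix multiplication.

```python
import numpy as np

# Made-up matrix for L and sample vectors (numpy 1-D arrays act as row
# or column vectors depending on which side of @ they appear).
M = np.array([[1., 2.],
              [3., 4.]])
x = np.array([5., 6.])   # column vector
y = np.array([7., 8.])   # row vector

Lx  = M @ x              # L(x)
Ly_ = y @ M              # L*(y)

# L*(y) applied to x equals y applied to L(x): both are the sandwich y M x.
assert np.isclose(Ly_ @ x, y @ Lx)
```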
Observe that I can figure out the linear map (i.e. reconstruct the matrix) just by knowing the action of the bilinear map, since $a_{ij} = B(e_j^\top,e_i)$.
This observation motivates the following inverse construction:
Given a bilinear map $B : V^* \times V \to \mathbb{R}$, define a matrix $M$ by $a_{ij} = B(e_j^\top,e_i)$. Since the $e_i$ and the $e_j^\top$ span their respective spaces, these values completely determine the action of $B$, and $M$ is the matrix of a linear map $L: V \to V$ which represents the bilinear form in the sense above.
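Here is a minimal numpy sketch of this reconstruction, with a made-up $2\times 2$ matrix and 0-indexed basis vectors: we give ourselves only the bilinear map $B$, and recover the matrix entry by entry from the values $B(e_j^\top, e_i)$.

```python
import numpy as np

# A bilinear map B(y, x) = y M x, built from a made-up matrix M.
M = np.array([[1., 2.],
              [3., 4.]])
B = lambda y, x: y @ M @ x

# Recover the matrix from B alone: B(e_j^T, e_i) is the (j, i) entry of M
# (0-indexed here, matching the a_{ij} recipe above).
e = np.eye(2)
N = np.array([[B(e[j], e[i]) for i in range(2)] for j in range(2)])
assert np.allclose(N, M)
```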
Note that my answer implicitly makes use of the standard inner product on $\mathbb{R}^n$: the inner product allows me to construct the natural isomorphism $V \to V^*$ given by $v \mapsto \langle v, \cdot \rangle$, which is the "row vector" associated with $v$.
Hopefully this makes things seem a bit less abstract!
You should also note that a similar story does NOT play out for higher-order multilinear maps.
There is no canonical way of having $4\times4$ matrices act on $2\times2$ matrices. But notice that the space of $2\times2$ matrices is four-dimensional, and linear transformations of this space can be encoded as - you guessed it - $4\times4$ matrices. Write down the relations $(e_{ab}\otimes e_{cd})(e_u\otimes e_v)=\delta_{bu}\delta_{dv}(e_a\otimes e_c)$, where the $e_{ij}$ are the elementary matrices and the $e_k$ are the obvious coordinate vectors, and then convert everything on both sides to the relevant matrices via Kronecker products. This tells you how to apply elementary $4\times4$ matrices to elementary $2\times2$ matrices; everything else follows from the distributive property (write matrices as sums of elementary matrices).
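Here is a quick numpy check of that relation (0-indexed, with `np.kron` playing the role of $\otimes$); the indices are chosen arbitrarily, with $b=u$ and $d=v$ so that both Kronecker deltas are $1$.

```python
import numpy as np

def E(i, j, n=2):
    """Elementary n x n matrix e_{ij}: a single 1 in row i, column j."""
    M = np.zeros((n, n))
    M[i, j] = 1.0
    return M

e = list(np.eye(2))                      # coordinate vectors e_0, e_1

# Verify (e_{ab} (x) e_{cd})(e_u (x) e_v) = delta_{bu} delta_{dv} (e_a (x) e_c)
a, b, c, d = 0, 1, 1, 0
u, v = b, d                              # deltas are 1, so the result is e_a (x) e_c
lhs = np.kron(E(a, b), E(c, d)) @ np.kron(e[u], e[v])
rhs = np.kron(e[a], e[c])
assert np.allclose(lhs, rhs)
```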
Tensors of vectors can be thought of in different ways. One way is as multidimensional arrays of numbers (usual vectors are one-dimensional arrays, matrices are two-dimensional, etc.), possibly with the "axes" of this array partitioned into upper or lower indices, i.e. covariant versus contravariant dimensions. This picture is highly relevant in physics.
To a pure mathematician, $V\otimes W$ is first and foremost defined implicitly via universal properties, which comes with a canonical explicit construction in terms of the "symbols" $v\otimes w$ satisfying bilinearity relations. This is my preferred method, as it captures the fact that we're trying to left-multiply elements of $V$ against elements of $W$, make scalars in the base field commute with everything, and impose a distributive property.
Finally, as vector spaces there is an isomorphism $K^n\otimes_KK^m\cong M_{n\times m}(K)$ which takes pure tensors of coordinate vectors to their Kronecker product. I suppose this might be useful computationally in some settings, but I am not used to this method.
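For what it's worth, here is that isomorphism in numpy with made-up vectors: `np.kron(v, w)` is the coordinate vector of the pure tensor $v \otimes w$, and reshaping it into an $n\times m$ array recovers the rank-one matrix $vw^\top$.

```python
import numpy as np

v = np.array([1., 2., 3.])   # element of K^3
w = np.array([4., 5.])       # element of K^2

# kron(v, w) is the length-6 coordinate vector of v (x) w; reshaped to
# 3 x 2 it becomes the rank-one matrix v w^T in M_{3x2}(K).
assert np.allclose(np.kron(v, w).reshape(3, 2), np.outer(v, w))
```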
Best Answer
At some level, the tensor product of maps is, like most things in math, a convenient choice of definition. However, it arises naturally, in a precise sense.
The standard definition of a tensor product of two spaces, $V \otimes W$, actually provides more than a vector space constructed from $V$ and $W$. It is a universal construction, meaning that it satisfies a particular property, and is the best choice of a vector space that does so. There is a bilinear map $i: V \times W \rightarrow V \otimes W$ taking $(v,w) \mapsto v \otimes w$. Now for any bilinear map $f: V \times W \rightarrow U$, there exists a unique linear map $\tilde{f}: V\otimes W \rightarrow U$ such that $\tilde{f}\circ i = f$.
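To make the universal property concrete, here is a small numpy sketch under the standard identification $\mathbb{R}^2 \otimes \mathbb{R}^3 \cong \mathbb{R}^6$ with $i(v,w) = \mathrm{kron}(v,w)$, using a made-up bilinear map $f(v,w) = v^\top A w$: the induced linear map $\tilde{f}$ is given by the flattened $A$, and $\tilde{f} \circ i = f$ as promised.

```python
import numpy as np

# A made-up bilinear map f : R^2 x R^3 -> R, f(v, w) = v^T A w.
A = np.array([[1., 2., 3.],
              [4., 5., 6.]])
f = lambda v, w: v @ A @ w

# Identify R^2 (x) R^3 with R^6 via i(v, w) = kron(v, w). The induced
# *linear* map f~ : R^6 -> R is the flattened A, and f~ o i = f.
i = lambda v, w: np.kron(v, w)
f_tilde = lambda t: A.reshape(-1) @ t

v = np.array([1., -2.])
w = np.array([3., 0., 1.])
assert np.isclose(f_tilde(i(v, w)), f(v, w))
```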
Now if $S: V \rightarrow V'$, $T: W \rightarrow W'$, we want to define a new map $S \otimes T: V \otimes W \rightarrow V' \otimes W'$. We choose this domain and range because we want the tensor product of maps to be compatible with the tensor product of spaces (in a precise sense: we want the association of a tensor product of spaces and maps to be a "bifunctor" - see here).
Now to actually define the map, we appeal to the universal property. Let $(S \times T)(v,w) = S(v) \otimes T(w)$. This map is bilinear from $V \times W$ to $V' \otimes W'$, so it induces a unique linear map $S \otimes T: V \otimes W \rightarrow V' \otimes W'$ satisfying $(S \otimes T)(v \otimes w) = S(v) \otimes T(w)$.
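In coordinates (with respect to standard bases), $S \otimes T$ is just the Kronecker product of the two matrices, which the following numpy check with random made-up data illustrates: $\mathrm{kron}(S,T)$ sends the pure tensor $\mathrm{kron}(v,w)$ to $\mathrm{kron}(Sv, Tw)$, exactly as defined.

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.normal(size=(2, 2))    # S : V -> V'
T = rng.normal(size=(3, 3))    # T : W -> W'
v = rng.normal(size=2)
w = rng.normal(size=3)

# The matrix of S (x) T is kron(S, T), and it sends the pure tensor
# v (x) w = kron(v, w) to S(v) (x) T(w) = kron(S v, T w).
assert np.allclose(np.kron(S, T) @ np.kron(v, w), np.kron(S @ v, T @ w))
```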
These concepts become a bit more clear if you're familiar with the language of category theory - natural constructions, universal properties, functors, etc.