At some level, the tensor product of maps is, like most things in math, a convenient choice of definition. However, it arises naturally, in a precise sense.
The standard definition of a tensor product of two spaces, $V \otimes W$, actually provides more than a vector space constructed from $V$ and $W$. It is a universal construction, meaning that it satisfies a particular property, and is the best choice of a vector space that does so. There is a bilinear map $i: V \times W \rightarrow V \otimes W$ taking $(v,w) \mapsto v \otimes w$. Now for any bilinear map $f: V \times W \rightarrow U$, there exists a unique linear map $\tilde{f}: V\otimes W \rightarrow U$ such that $\tilde{f}\circ i = f$.
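To make the universal property concrete, here is a minimal numpy sketch for $V = \mathbb{R}^2$, $W = \mathbb{R}^3$, $U = \mathbb{R}$, identifying $V \otimes W$ with flattened $2\times 3$ arrays and $i(v,w)$ with the outer product. The names `f` and `f_tilde` are illustrative, not from any library.

```python
import numpy as np

# Sketch of the universal property: every bilinear map f factors uniquely
# through the outer product i(v, w) = v ⊗ w, realized here as the
# flattened matrix np.outer(v, w).

A = np.array([[1., 2., 0.],
              [0., -1., 3.]])          # any 2x3 array defines a bilinear map

def f(v, w):
    """A bilinear map f: R^2 x R^3 -> R, namely f(v, w) = v^T A w."""
    return v @ A @ w

def f_tilde(t):
    """The induced *linear* map on V ⊗ W ≅ R^6 (flattened 2x3 arrays)."""
    return A.flatten() @ t

v = np.array([1., -2.])
w = np.array([0.5, 1., 2.])

# f_tilde ∘ i = f, where i(v, w) = np.outer(v, w).flatten()
assert np.isclose(f_tilde(np.outer(v, w).flatten()), f(v, w))
```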
Now if $S: V \rightarrow V'$, $T: W \rightarrow W'$, we want to define a new map $S \otimes T: V \otimes W \rightarrow V' \otimes W'$. We choose this domain and range because we want the tensor product of maps to be compatible with the tensor product of spaces (in a precise sense: we want the association of a tensor product of spaces and maps to be a "bifunctor").
Now, to actually define the map, we appeal to the universal property. Let $(S \times T)(v,w) = S(v) \otimes T(w)$. This map is bilinear from $V \times W$ to $V' \otimes W'$, so it induces a unique linear map $S \otimes T: V \otimes W \rightarrow V' \otimes W'$ satisfying $(S \otimes T)(v \otimes w) = S(v) \otimes T(w)$.
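Once bases are fixed, the matrix of $S \otimes T$ is the Kronecker product of the matrices of $S$ and $T$, and the defining identity $(S \otimes T)(v \otimes w) = S(v) \otimes T(w)$ becomes a checkable statement about `np.kron` (which flattens $v \otimes w$ in row-major order):

```python
import numpy as np

# The tensor product of linear maps, realized as a Kronecker product:
# np.kron(S, T) is the matrix of S ⊗ T in the standard tensor basis.

rng = np.random.default_rng(0)
S = rng.standard_normal((3, 2))   # S: R^2 -> R^3
T = rng.standard_normal((4, 3))   # T: R^3 -> R^4
v = rng.standard_normal(2)
w = rng.standard_normal(3)

lhs = np.kron(S, T) @ np.kron(v, w)   # (S ⊗ T) applied to v ⊗ w
rhs = np.kron(S @ v, T @ w)           # S(v) ⊗ T(w)
assert np.allclose(lhs, rhs)
```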
These concepts become a bit clearer if you're familiar with the language of category theory: natural constructions, universal properties, functors, and so on.
Fix a basis $\{e_1, \ldots, e_n\}$ of $V$, and consider the dual basis $\{f_1, \ldots, f_n \}$ of $V^\ast$. Then we have a basis
$$\{e_1\otimes f_1,\ldots, e_i \otimes f_j, \ldots, e_n \otimes f_n\}$$
for $V \otimes V^\ast$, and the matrix
$$A = (a_{ij})$$
is just a way of representing the element
$$\sum_{i=1}^n \sum_{j=1}^n a_{ij} \; e_i \otimes f_j \in V \otimes V^\ast.$$
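In the standard basis of $\mathbb{R}^n$, $e_i \otimes f_j$ is the matrix unit with a $1$ in position $(i,j)$ and zeros elsewhere, so the sum above literally reassembles the matrix. A quick numpy sketch with $n = 3$ (zero-based indices):

```python
import numpy as np

# A = sum_ij a_ij e_i ⊗ f_j: each e_i ⊗ f_j is the outer product
# of a standard basis vector with a dual basis (co)vector, i.e. a
# matrix unit, and the weighted sum recovers the matrix itself.

n = 3
A = np.arange(1., 10.).reshape(n, n)
e = np.eye(n)                  # e[i] is the basis vector e_i; f_j = e_j^T

recon = sum(A[i, j] * np.outer(e[i], e[j])   # e_i ⊗ f_j as an outer product
            for i in range(n) for j in range(n))
assert np.allclose(recon, A)
```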
Of course an element of $V \otimes V^\ast$ gives a linear map $V \to V$ by
$$(w \otimes f)(v) := f(v) w$$
and extending by linearity. Given two such elements, we can compose the corresponding functions:
$$(w' \otimes f')(w \otimes f)(v) = (w' \otimes f')(f(v) w) = f(v) f'(w) w' = f'(w) \; (w' \otimes f)(v)$$
so composition of linear maps is given by
$$(w' \otimes f') \circ (w \otimes f) = f'(w) \; (w' \otimes f)$$
extended by linearity. If you write your elements in the $e_i \otimes f_j$ basis and apply this operation to them, you'll see that the usual definition of matrix multiplication pops right out.
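Identifying $w \otimes f$ with the rank-one matrix $w f^\top$ (so that $(w \otimes f)(v) = f(v)\,w$), the composition rule reads $(w' f'^\top)(w f^\top) = f'(w)\,(w' f^\top)$, which numpy confirms directly:

```python
import numpy as np

# Composition of rank-one maps: (w' ⊗ f') ∘ (w ⊗ f) = f'(w) (w' ⊗ f),
# with w ⊗ f realized as the matrix np.outer(w, f).

rng = np.random.default_rng(1)
w, f = rng.standard_normal(3), rng.standard_normal(3)
wp, fp = rng.standard_normal(3), rng.standard_normal(3)

composed = np.outer(wp, fp) @ np.outer(w, f)   # (w' ⊗ f') ∘ (w ⊗ f)
scaled = np.dot(fp, w) * np.outer(wp, f)       # f'(w) (w' ⊗ f)
assert np.allclose(composed, scaled)
```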
Of course all the calculations with explicit tensors above can be rephrased in terms of the universal property of the tensor product if you like.
This is all assuming you want the matrix to represent an element of $V \otimes V^\ast$ rather than an element of $V \otimes V$ or $V^\ast \otimes V^\ast$. But you can work out what should happen in cases like that the same way.
The other answers have already given an "abstract" answer, so I will just make sure you understand what all this means for the basic case $V = \mathbb{R}^n$ (and in finite-dimensional linear algebra, that is really all there is anyway!).
Let $L: \mathbb{R}^n \to \mathbb{R}^n$ be a linear map. Let $M$ be the matrix of $L$ with respect to the standard basis. This matrix can act on a column vector by multiplication on the left, $v \mapsto Mv = L(v)$, or it can act on row vectors by multiplication on the right, $w \mapsto wM$; this second map is called the adjoint of $L$. We can convert a row vector into a column vector, or vice versa, by transposing.
Row vectors represent linear maps $\mathbb{R}^n \to \mathbb{R}$, and so really represent elements of the dual space $V^*$. So the adjoint map really is $L^*: V^* \to V^*$, given by $L^*(w) = wM$.
We get a bilinear map $V^* \times V \to \mathbb{R}$ by the rule $(w,v) \mapsto w(L(v)) = wMv$. In other words, the bilinear map associated to $L$ is given by just taking a row vector and a column vector, and sandwiching the matrix of $L$ in between them.
To be perfectly explicit about this, if $L:\mathbb{R}^2 \to \mathbb{R}^2$ has the matrix $ \begin{bmatrix} a_{11} &a_{21}\\a_{12}&a_{22} \end{bmatrix} $ then $$L\left(\begin{bmatrix} x_1\\ x_2\end{bmatrix}\right) = \begin{bmatrix} a_{11} &a_{21}\\a_{12}&a_{22} \end{bmatrix}\begin{bmatrix} x_1\\ x_2\end{bmatrix}$$
and
$$ L^*\left(\begin{bmatrix} y_1& y_2\end{bmatrix}\right) = \begin{bmatrix} y_1& y_2\end{bmatrix} \begin{bmatrix} a_{11} &a_{21}\\a_{12}&a_{22} \end{bmatrix} $$
The bilinear map $B$ is given by
$$ B\left(\begin{bmatrix} y_1& y_2\end{bmatrix},\begin{bmatrix} x_1\\ x_2\end{bmatrix}\right) = \begin{bmatrix} y_1& y_2\end{bmatrix} \begin{bmatrix} a_{11} &a_{21}\\a_{12}&a_{22} \end{bmatrix}\begin{bmatrix} x_1\\ x_2\end{bmatrix} $$
Observe that I can figure out the linear map (i.e. reconstruct the matrix) just by knowing the action of the bilinear map, since $a_{ij} = B(e_j^\top,e_i)$.
This observation motivates the following inverse construction:
Given a bilinear map $B : V^* \times V \to \mathbb{R}$, define a matrix $M$ by $a_{ij} = B(e_j^\top,e_i)$. Since the $e_i$ and $e_j^\top$ span their respective spaces, we see that these values determine the action of $B$, and moreover produce a linear map $L: V \to V$ whose matrix represents the bilinear form.
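Here is a small numpy sketch of this reconstruction, using zero-based indices and the direct convention `recovered[i, j] = B(e_i^T, e_j)` (the same rule as $a_{ij} = B(e_j^\top, e_i)$ above, up to the indexing convention):

```python
import numpy as np

# Sandwich picture: B(y, x) = y M x for a row vector y and column
# vector x, and M can be recovered from B entry by entry.

M = np.array([[2., -1.],
              [3., 5.]])

def B(y, x):
    return y @ M @ x

e = np.eye(2)
recovered = np.array([[B(e[i], e[j]) for j in range(2)]
                      for i in range(2)])
assert np.allclose(recovered, M)
```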
Note that my answer implicitly makes use of the standard inner product on $\mathbb{R}^n$: the inner product allows me to construct the natural isomorphism $V \to V^*$ given by $v \mapsto \langle v, \cdot \rangle$, which is the "row vector" associated with $v$.
Hopefully this makes things seem a bit less abstract!
You should also note that a similar story does NOT play out for higher-order multilinear maps: a $k$-linear map corresponds to an order-$k$ tensor, which is no longer the same data as a single matrix.