For each pair $(V,W)$ of vector spaces (over a fixed ground field), let $T(V,W)$ be their tensor product, and $F(V,W)$ the vector space of bilinear forms on $V^*\times W^*$. One checks that
(a) there is a unique linear map $e(V,W)$ from $T(V,W)$ to $F(V,W)$ satisfying
$$
\big(e(V,W)(v\otimes w)\big)(f,g)=f(v)g(w)
$$
for all $v\in V,w\in W,f\in V^*,g\in W^*$,
(b) $e(V,W)$ is injective,
(c) $T$ and $F$ are functors,
(d) $e$ is a natural transformation from $T$ to $F$,
(e) $T,F$ and $e$ are compatible (in an obvious sense) with finite direct sums.
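As a sanity check of (a), here is a toy finite-dimensional sketch (assuming numpy; the names `e`, `v`, `w`, `f`, `g` are mine, not from the text). A tensor in $T(V,W)$ for $V=\Bbb R^2$, $W=\Bbb R^3$ is stored as its $2\times 3$ coefficient matrix in the basis $e_i\otimes f_j$, and a bilinear form on $V^*\times W^*$ eats two coordinate vectors in the dual bases:

```python
import numpy as np

# Toy model of e(V,W) for V = R^2, W = R^3. A tensor is a 2x3
# coefficient matrix M; the bilinear form e(V,W)(M) on V* x W* is
# (f, g) |-> f @ M @ g, with f, g coordinates in the dual bases.

def e(M):
    """The map e(V,W): tensor (as a matrix) -> bilinear form on V* x W*."""
    return lambda f, g: f @ M @ g

v = np.array([1.0, 2.0])
w = np.array([3.0, -1.0, 4.0])
f = np.array([2.0, 5.0])        # a functional on V, in the dual basis
g = np.array([1.0, 0.0, -2.0])  # a functional on W

# Defining property of e: e(v (x) w)(f, g) = f(v) * g(w).
assert np.isclose(e(np.outer(v, w))(f, g), (f @ v) * (g @ w))
```

The pure tensor $v\otimes w$ corresponds to the rank-one matrix `np.outer(v, w)`, which makes the defining identity a one-line check.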
Claim 1: $e(V,W)$ is surjective $\iff$ the cardinal number $\dim(V)\dim(W)$ is finite.
In view of (b), the implication "$\Leftarrow$" follows by dimension counting. It thus suffices to prove non-surjectivity when $V$ is infinite dimensional and $W$ is nonzero. Writing $W$ as $W_1\oplus W_2$ with $\dim W_1=1$ and using (e), we are reduced (since $T(V,W_1)\cong V$, $F(V,W_1)\cong V^{**}$, and $e(V,W_1)$ becomes the canonical map) to
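The dimension count behind "$\Leftarrow$" can be made concrete in a small case (a sketch assuming numpy; the setup is mine): for $\dim V=m$, $\dim W=n$ finite, the $mn$ pure tensors of basis vectors are linearly independent, so $\dim T(V,W)=mn=\dim F(V,W)$, and an injective linear map between spaces of the same finite dimension is automatically surjective.

```python
import numpy as np

# Toy dimension count for m = 2, n = 3: the pure tensors e_i (x) f_j,
# realized via the Kronecker product, are m*n linearly independent
# vectors spanning a space of dimension m*n.
m, n = 2, 3
v_basis = np.eye(m)
w_basis = np.eye(n)

pure = np.array([np.kron(a, b) for a in v_basis for b in w_basis])
assert pure.shape == (m * n, m * n)
assert np.linalg.matrix_rank(pure) == m * n
```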
Claim 2: if $V$ is infinite dimensional, then the canonical embedding $V\to V^{**}$ is not surjective.
To prove this, we'll use an embedding of $V$ in $V^*$, and an embedding of $V^*$ in $V^{**}$. Neither of these two embeddings will be canonical, but their composite will be.
Choose a basis $B$ of $V$, and identify $V$ with the space $K^{(B)}$ of finitely supported $K$-valued functions on $B$. Then $V^*$ can be identified with the space $K^B$ of all $K$-valued functions on $B$. Similarly, choosing a basis of $K^B$ containing the (linearly independent) functions with singleton support, we can identify $V^*$ with $K^{(B\sqcup C)}$, where $C$ is a set and $\sqcup$ means "disjoint union". As $B$ is infinite, $\dim K^B>|B|$ (by the Erdős–Kaplansky theorem, $\dim K^B=|K|^{|B|}$), so $C$ is nonempty. Using the same trick once more, we can identify $V^{**}$ with $K^{(B\sqcup C\sqcup D)}$, where $D$ is a nonempty set. Then the natural embedding of $K^{(B)}$ in $K^{(B\sqcup C\sqcup D)}$, which is clearly not surjective, corresponds to the natural embedding of $V$ in $V^{**}$. This completes the proof.
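A toy illustration of the first step, not the proof itself (a sketch of mine, with $B=\Bbb N$ and $K=\Bbb Q$): elements of $V=K^{(B)}$ are finitely supported, so a dict models them, while a general element of $K^B$ is an arbitrary function $B\to K$, and the constant function $1$ is visibly not in the image of $K^{(B)}$.

```python
from fractions import Fraction

# V = K^(B): finitely supported functions on B = naturals, as dicts.
# K^B: arbitrary functions B -> K, as Python callables.

def as_function(vec):
    """View a finitely supported vector as a function on B."""
    return lambda b: vec.get(b, Fraction(0))

v = {0: Fraction(1), 5: Fraction(-2)}   # an element of K^(B)
phi = as_function(v)
assert phi(5) == Fraction(-2) and phi(1) == Fraction(0)

# The constant function 1 lies in K^B but is nonzero everywhere,
# hence not finitely supported: the inclusion K^(B) -> K^B is proper.
const_one = lambda b: Fraction(1)
assert all(const_one(b) != 0 for b in range(1000))
```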
It follows immediately from the definition of quotient. If $V/W$ is a quotient space, then for $x\in V$ we have $\bar{x}=0$ in $V/W$ iff $x\in W$.
So $(v,w_1+w_2)-(v,w_1)-(v,w_2)=0$ in $V\otimes W=C(V\times W)/Z$, since $(v,w_1+w_2)-(v,w_1)-(v,w_2)\in Z$.
If you are still confused you might want to review the definition of quotient space.
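Concretely, after the quotient this relation becomes an equality of tensors. Under the standard identification of coordinate tensors with Kronecker products (a numerical sketch assuming numpy, not part of the answer above), it reads:

```python
import numpy as np

# The class of (v, w1+w2) - (v, w1) - (v, w2) is zero in the quotient;
# in coordinates, kron is additive in each slot, so the relation holds
# on the nose:
v = np.array([1.0, 2.0])
w1 = np.array([3.0, 0.0, 1.0])
w2 = np.array([-1.0, 4.0, 2.0])

lhs = np.kron(v, w1 + w2)
rhs = np.kron(v, w1) + np.kron(v, w2)
assert np.allclose(lhs, rhs)
```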
The free vector space construction is not just a single sum. Here is an intuitive picture of how I often think of it. We have a set $X$; if this set carries any additional structure, such as a binary operation satisfying some axioms, we deliberately forget it for the purpose of the construction. So, either naturally or artificially, we assume that the elements of $X$ are not things that can truly be added together, like $X=\{\rm \square,~dog,~2\}$. Furthermore, we have a field $K$. We then pretend we can add elements of $X$ together and multiply them by scalars from $K$, and we end up with formal $K$-linear combinations of the form
$$\sum_{x\in X}\alpha_xx= \alpha_{\square} \square+\alpha_{\rm dog}\mathrm{dog}+\alpha_{2} 2.$$
There is a problem with this, though. That numeral "$2$" is supposed to have no structure to it, as part of our construction process, yet it may also designate an element of $K$! To avoid confusing ourselves, let's put the elements of $X$ as subscripts of a generic "$e$". We then have linear combinations of the form
$$\sum_{x\in X}\alpha_x e_x=\alpha_\square e_\square +\alpha_{\rm dog} e_{\rm dog}+\alpha_2e_2.$$
Keep in mind the $\alpha$ coefficients are scalars from $K$. On top of this, we're well within our rights to now "pretend" that the set of all of these formal sums satisfies every single vector space axiom. We have, by virtue of imagination, created a new vector space out of $X$. (You will notice it is isomorphic to any other vector space of dimension three, or to any other free vector space generated by a set of three elements. As I suggested earlier, the actual contents of $X$ are moot; they ultimately play the part of indexing a basis for our space.)
Remark. A vector space does not, by itself, allow infinite sums: an infinite sum requires a notion of convergence, which comes from extra structure such as a norm or a topology on a space over a complete scalar field (look up Hilbert or Banach spaces). Thus in our formal sums, when we write $x\in X$ in the subscript, we run the risk of writing an infinite sum if $X$ is itself infinite! To prevent this occurrence, we may instead write
$$\sum_{i=1}^n \alpha_i e_{x_i},\qquad x_i\in X.$$
These are the actual forms our desired formal combinations will now have.
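The formal combinations above can be modeled directly (a minimal sketch of mine, assuming Python's `fractions` for $K=\Bbb Q$): a combination is a finitely supported coefficient map $x\mapsto\alpha_x$, stored as a dict keyed by the elements of $X$, with coefficient-wise addition and scaling.

```python
from fractions import Fraction

# Free vector space on X = {square, dog, 2} over Q: a formal
# combination is a dict {x: alpha_x} with finitely many keys.

def add(u, v):
    """Add two formal combinations coefficient-wise."""
    keys = set(u) | set(v)
    return {x: u.get(x, Fraction(0)) + v.get(x, Fraction(0)) for x in keys}

def scale(c, u):
    """Multiply a formal combination by a scalar from K."""
    return {x: c * a for x, a in u.items()}

# 3*e_square + e_dog, and e_dog - e_2: the elements of X only index
# the basis; nothing about them is ever "used".
u = {"square": Fraction(3), "dog": Fraction(1)}
v = {"dog": Fraction(1), 2: Fraction(-1)}
s = add(u, scale(Fraction(2), v))
assert s == {"square": Fraction(3), "dog": Fraction(3), 2: Fraction(-2)}
```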
Now it gets tricky, because we are going to build the free vector space out of another vector space (in fact, out of a Cartesian product of vector spaces). For the moment we must completely forget that $V$ and $W$ are vector spaces with algebraic structure; temporarily, to us, they are just sets with no further facts about them available for use. As before, we have our set of formal linear combinations:
$$\left\{\sum_{i=1}^n \alpha_i e_{(v_i,w_i)}: n\in\Bbb N,\ \alpha_i\in K,\ (v_i,w_i)\in V\times W \right\} $$
Note: these $(v_i,w_i)$ are not 2-tuples of basis vectors from $V$ and $W$; indeed, we have designated no basis for $V$ or $W$. They are just $n$ arbitrary elements of $V\times W$: for each $i$, $v_i$ and $w_i$ can be any elements of $V$ and $W$ respectively.
Now, finally, after all of this, we can remember the vector space structure of $V\times W$ and create the relations given on Wikipedia, and quotient out by them. After the quotient, we can rename the $e_{(v,w)}$ objects (these were the basis vectors when we were still at the stage of a free vector space) as $v\otimes w$. (It gets tedious writing everything in subscripts!)
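One way to see the quotient in coordinates (a sketch of mine, assuming numpy, with finite-dimensional $V$ and $W$): send each basis vector $e_{(v,w)}$ of the free space to the Kronecker product of coordinate vectors. The relations then lie in the kernel of this map, so any combination built from them goes to the zero tensor.

```python
import numpy as np

def q(combo):
    """Quotient map sketch: [(alpha, v, w), ...] -> sum alpha * kron(v, w)."""
    return sum(a * np.kron(v, w) for a, v, w in combo)

v = np.array([1.0, -1.0])
w1 = np.array([2.0, 3.0])
w2 = np.array([0.0, 5.0])

# e_{(v, w1+w2)} - e_{(v, w1)} - e_{(v, w2)} is one of the relations;
# it maps to the zero tensor, so its class in the quotient is zero.
rel = [(1.0, v, w1 + w2), (-1.0, v, w1), (-1.0, v, w2)]
assert np.allclose(q(rel), np.zeros(4))
```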
There are a few very important differences between $V\times W$ and $V\otimes W$ that need to be identified. Even though a pure tensor $v\otimes w$ has two components, taken from $V$ and $W$ respectively, our tensor product will also contain sums of pure tensors (so, e.g., $u\otimes v+x\otimes y$) that cannot always be rewritten as a single pure tensor, because the tensor product is linear in each component separately. For example, in $V\times W$ we can split $(v+x,w+y)$ into $(v,w)+(x,y)$ (you cannot split the first component without also splitting the second), whereas in the tensor product each component splits on its own:
$$\begin{aligned} (v+x)\otimes(w+y) &= v\otimes(w+y) + x\otimes(w+y) \\ &= v\otimes w + v\otimes y + x\otimes w + x\otimes y. \end{aligned}$$
Finally, scalar multiplication of $u\otimes v$ does not affect both components at once, only one or the other: $c(u\otimes v)=(cu)\otimes v=u\otimes(cv)$. In the Cartesian product, scalar multiplication affects both components simultaneously: $c(u,v)=(cu,cv)$. Hope this helps!
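Both contrasts can be checked numerically (a sketch of mine using numpy's Kronecker product as the coordinate model of $\otimes$): the pure tensor $(v+x)\otimes(w+y)$ splits into four terms, and a scalar slides onto either factor, but not onto both at once.

```python
import numpy as np

v, x = np.array([1.0, 2.0]), np.array([0.0, -1.0])
w, y = np.array([3.0, 1.0]), np.array([2.0, 2.0])
c = 5.0

# (v+x) (x) (w+y) = v(x)w + v(x)y + x(x)w + x(x)y:
four_terms = (np.kron(v, w) + np.kron(v, y)
              + np.kron(x, w) + np.kron(x, y))
assert np.allclose(np.kron(v + x, w + y), four_terms)

# c(u (x) v) = (cu) (x) v = u (x) (cv):
assert np.allclose(c * np.kron(v, w), np.kron(c * v, w))
assert np.allclose(c * np.kron(v, w), np.kron(v, c * w))
# ...while scaling BOTH factors picks up c twice, unlike c(u,v)=(cu,cv):
assert np.allclose(np.kron(c * v, c * w), c * c * np.kron(v, w))
```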