You have a couple of misconceptions here.
First, the tensor product is not defined as a quotient of the vector space $V\times W$. Rather, you consider a vector space $Z$ whose basis elements are the elements of $V\times W$. So, for example, if $V=W=\mathbb{R}$, you would have one basis element for $(1,0)$, another basis element for $(2,0)$, another basis element for $(3,0)$, etc. That is why in that post they are written with double brackets, so as not to confuse $Z$ with $V\times W$.
Note that this is way bigger than $V\times W$. The vector space $V\times W$ has dimension $\dim(V)+\dim(W)$. The vector space $Z$ has dimension $|V\times W|$, which is, usually, much larger! For $V=W=\mathbb{R}$, $V\times W$ has dimension $2$, but $Z$ has dimension $\mathfrak{c}=2^{\aleph_0}$.
So you have one basis element for each element of $V\times W$; you should think of $V\times W$ as the index set for the basis. The basis element $[[v,w]]$ is the basis element that corresponds to the element $(v,w)$ of $V\times W$.
Then $E$ is the subspace of $Z$ generated by all the relations you write down; so, in my example above, you would have the vector $2[[1,0]]-[[2,0]]$ in $E$, etc.
Now, the image of the basis vector $[[v,w]]$ in the quotient is denoted by $v\otimes w$. So in general it's not every vector of $Z/E$ that can be written as $v\otimes w$: these are only the images of the basis of $Z$. So you know that these elements generate $Z/E$, but they need not be all of $Z/E$ (in general, they won't be). The elements of $Z/E$ are linear combinations of these "pure tensors" $v\otimes w$.
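For a concrete element that is not a pure tensor, here is a small worked example. It assumes $V=W=\mathbb{R}^2$ with a basis $b_1,b_2$ (names chosen here purely for illustration) and uses the standard fact that the four tensors $b_i\otimes b_j$ form a basis of $V\otimes W$. A general pure tensor expands as

$$(\alpha b_1+\beta b_2)\otimes(\gamma b_1+\delta b_2)=\alpha\gamma\, b_1\otimes b_1+\alpha\delta\, b_1\otimes b_2+\beta\gamma\, b_2\otimes b_1+\beta\delta\, b_2\otimes b_2.$$

For this to equal $b_1\otimes b_1+b_2\otimes b_2$ we would need $\alpha\gamma=1$, $\alpha\delta=0$, $\beta\gamma=0$ and $\beta\delta=1$. But $\alpha\delta=0$ forces $\alpha=0$ or $\delta=0$, contradicting $\alpha\gamma=1$ or $\beta\delta=1$ respectively. So $b_1\otimes b_1+b_2\otimes b_2$ is a sum of pure tensors that is not itself a pure tensor.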
So, why does it follow from the construction that $(v_1+v_2)\otimes w = (v_1\otimes w) + (v_2\otimes w)$?
This equality is saying that the equivalence class of $[[v_1+v_2,w]]$ is the same as the equivalence class of $[[v_1,w]]+[[v_2,w]]$ in the quotient. By definition of quotient, this is the same as saying that the vector
$$[[v_1+v_2,w]] - [[v_1,w]] - [[v_2,w]]$$
of $Z$ lies in the subspace $E$. But it lies in the subspace $E$ because it is one of the generating elements of $E$. So the equality holds in $Z/E$.
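The same argument handles scalars. As a sketch, assuming the generators of $E$ include the usual scalar relations $[[cv,w]]-c[[v,w]]$ (the vector $2[[1,0]]-[[2,0]]$ mentioned earlier is, up to sign, one of these), we get

$$(cv)\otimes w - c\,(v\otimes w) = \text{image of } \big([[cv,w]]-c[[v,w]]\big) = 0 \quad\text{in } Z/E.$$

Taking $c=0$ shows that $[[0,w]]$ itself lies in $E$, so $0\otimes w=0$ for every $w$, even though $[[0,w]]$ is a nonzero basis vector of $Z$.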
The free vector space construction is not just a single sum. Here's an intuitive picture of how I often think of the free vector space construction. We have a set $X$ - note that if this set has any additional structure defined on it, like a binary operation and some axioms, we decide to forget about it for the purpose of our construction. So, either naturally or artificially, we will assume that the elements of $X$ are not things that can truly be added together, like $X=\{\rm \square,~dog,~2\}$. Furthermore, we have a field $K$. We then pretend we can add things in $X$ together, and multiply them by scalars from $K$, and we end up with formal $K$-linear combinations of the form
$$\sum_{x\in X}\alpha_xx= \alpha_{\square} \square+\alpha_{\rm dog}\mathrm{dog}+\alpha_{2} 2.$$
There is a problem with this though. That numeral "$2$" there is supposed to have no structure to it, as part of our construction process, yet it also designates an element of $K$! We don't want to confuse ourselves, so let's put the elements of $X$ as subscripts of a generic "$e$." We then have linear combinations of the form
$$\sum_{x\in X}\alpha_x e_x=\alpha_\square e_\square +\alpha_{\rm dog} e_{\rm dog}+\alpha_2e_2.$$
Keep in mind the $\alpha$ coefficients are scalars from $K$. On top of this, we're well within our rights to now "pretend" that the set of all of these formal sums satisfies every single vector space axiom. We have by virtue of imagination created a new vector space out of $X$. (You will notice it is isomorphic to any other vector space of dimension three, or any other free vector space generated by a set of three elements. As I suggested earlier, the actual contents of $X$ are moot, they ultimately play the part of indexing a basis for our space.)
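To see the operations concretely, here is how addition and scalar multiplication would go in this three-dimensional example. The coefficients are made up for illustration, taking $K=\mathbb{R}$:

$$(2e_\square+e_{\rm dog})+(3e_{\rm dog}-e_2)=2e_\square+4e_{\rm dog}-e_2,\qquad 5\,(2e_\square-e_2)=10e_\square-5e_2.$$

Coefficients of the same basis symbol are combined, and nothing else simplifies. In particular $e_2$ is just a basis vector labelled by the symbol $2$; it is not the scalar $2$.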
Remark. A vector space does not necessarily allow infinite sums. (Infinite sums need a notion of convergence, which the vector space axioms alone do not provide; they make sense only in spaces carrying extra structure, such as a norm or an inner product. Look up Hilbert or Banach spaces.) Thus in our formal sums, when we write $x\in X$ in the subscript, we run the risk of having an infinite sum if $X$ is itself infinite! To prevent this, we may instead write
$$\sum_{i=1}^n \alpha_i e_{x_i} :~~ x_i\in X.$$
These are the actual forms our desired formal combinations will now have.
At this point it gets tricky, because we are going to create the free vector space out of another vector space (in fact, out of a Cartesian product of vector spaces). At this point in our discussion, we must completely forget about the fact that $V$ and $W$ are vector spaces and have algebraic structure; temporarily to us they are just sets with no further facts about them available for use. As before, we have our set of formal linear combinations:
$$\left\{\sum_{i=1}^n \alpha_i e_{(v_i,w_i)}: n\in\Bbb N, (v_i,w_i)\in V\times W \right\} $$
Note: These $(v_i,w_i)$ are not the 2-tuples of basis vectors from $V$ and $W$. They are just a set of $n$ arbitrary vectors from $V\times W$. In fact we have designated no basis for $V$ or $W$. The $v_i$ and $w_i$, for each $i$, can be any two elements from $V$ and $W$ respectively.
Now, finally, after all of this, we can remember the vector space structure of $V\times W$ and create the relations given on Wikipedia, and quotient out by them. After the quotient, we can rename the $e_{(v,w)}$ objects (these were the basis vectors when we were still at the stage of a free vector space) as $v\otimes w$. (It gets tedious writing everything in subscripts!)
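For reference, the relations in question are presumably the standard ones. In the $e_{(v,w)}$ notation used here, the subspace we quotient by would be spanned by all vectors of the following forms, with $v,v'\in V$, $w,w'\in W$ and $c\in K$:

$$\begin{aligned} &e_{(v+v',w)}-e_{(v,w)}-e_{(v',w)}, \\ &e_{(v,w+w')}-e_{(v,w)}-e_{(v,w')}, \\ &e_{(cv,w)}-c\,e_{(v,w)}, \\ &e_{(v,cw)}-c\,e_{(v,w)}. \end{aligned}$$

Each of these becomes zero in the quotient, which is exactly what makes $\otimes$ additive in each slot and compatible with scalars.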
There are a few very important differences between $V\times W$ and $V\otimes W$ that need to be identified. Even though a pure tensor $v\otimes w$ has two components, one taken from $V$ and one from $W$, our tensor product also contains sums of pure tensors (so, e.g., $u\otimes v+x\otimes y$) that cannot always be written as a single pure tensor, because the two components of a pure tensor are independently linear. For example, in $V\times W$, we can split $(v+x,w+y)$ into $(v,w)+(x,y)$ (you cannot split the first component without also having to split the second), whereas in the tensor product we have to split each individually:
$$\begin{aligned} (v+x)\otimes(w+y) &= v\otimes(w+y) + x\otimes(w+y) \\ &= v\otimes w + v\otimes y + x\otimes w + x\otimes y. \end{aligned}$$
Finally, scalar multiplication of $u\otimes v$ does not affect both components at once; only one or the other, resulting in $c(u\otimes v)=(cu)\otimes v=u\otimes(cv)$. In the Cartesian product, scalar multiplication affects both components simultaneously, where $c(u,v)=(cu,cv)$. Hope this helps!
Best Answer
The major idea behind the tensor product $V\otimes W$ is that it allows us to study bilinear maps $\omega:V\times W\to Z$ as linear maps $\tilde\omega :V\otimes W\to Z$ on a space like $V\times W$ but with the bilinearity 'built-in', thus reducing the theory of bilinear maps to 'simple' linear algebra. In fact, this construction works more generally for multilinear maps $\omega : V_1\times\dots\times V_n\to Z$ as well ($\tilde\omega : V_1\otimes\dots\otimes V_n\to Z$).
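As a small illustration (the specific map is an example chosen here, not part of the construction): take $V=W=\mathbb{R}^n$, $Z=\mathbb{R}$, and let $\omega$ be the dot product, $\omega(v,w)=v\cdot w$, which is bilinear. The corresponding linear map is determined on pure tensors by $\tilde\omega(v\otimes w)=\omega(v,w)$ and extended linearly:

$$\tilde\omega\Big(\sum_i \alpha_i\, v_i\otimes w_i\Big)=\sum_i \alpha_i\,\omega(v_i,w_i).$$

That this is well defined, i.e. independent of how an element is written as a sum of pure tensors, is precisely what the tensor product construction guarantees.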