By definition, $0!=1$. There are a number of different justifications for this. One of them is that this choice makes the formulas you quote give the right answer. One can also argue that there really is exactly one way to choose $0$ objects from $n$: just say no in turn to each of them.
When $k$ and $n$ are non-negative integers, and $k\gt n$, we have two choices for $\binom{n}{k}$. We could say it is undefined. But it is more convenient to adopt the convention that in that situation, $\binom{n}{k}=0$. Again one could give an informal justification, there are $0$ ways to choose $5$ people from $3$. But the real reason for the convention is that it makes some formulas less messy.
I think this is about selection with constraints, which is similar to partition with constraints.
We want to select $r$ objects from a population with $t$ various types but we have limits on how many of some/all the types are available, with inclusion/exclusion to cope with multiple cases of limit breaking.
So we partition $r$ into $t$ types freely, then subtract off any cases which might break the constraints, then add back in double counted constraint breaking, etc.
For example choosing $r=5$ letters form "chincherinchee". There are $t=6$ letters $(t_i)=$ (c,h,i,n,e,r) with limits $(\ell_i)=(3,3,2,2,3,1)$.
The basic free selection from those $6$ types is sticks-and-stones (or stars&bars) division into categories, $$\binom{5+6-1}{6-1}=\binom{10}{5}$$
Then to remove cases where we have selected more of a given type than allowed, we repeat this division but with each type in turn allocated a limit-breaking $\ell_i{+}1$, which is removed from the total available. So for example when we consider breaking the limit on letter "n"($c_4$), this means we remove $\ell_4+1=3$ items from general allocation because they are preallocated to break the limit for "n", meaning the options for this case are $\binom{2+6-1}{6-1}=\binom{7}{5}$. So over all possible letters this adjusts the total to
$$\binom{10}{5}-\left(\binom{6}{5}+\binom{6}{5}+\binom{7}{5}+\binom{7}{5}+\binom{6}{5}+\binom{8}{5}\right)$$
However there are now some cases which have been removed twice, where two constraints were broken, which is handled similarly by preallocating enough slots to break two constraints simultaneously, then selecting from the reduced pool. In my example though these are very limited - we could have chosen $3\times$"i"${}+2\times$"r", or $3\times$"n"${}+2\times$"r" - only two options for double constraint breaking (and none for breaking more than two constraints). So we add these $2$ options back in for the result:
$$\binom{10}{5}-\left(3\binom{6}{5}+2\binom{7}{5}+\binom{8}{5}\right)+2$$
Best Answer
Let's distinguish the two $E$s, lets say as $E_1$ and $E_2$. Then we take all possible permutations of the 3 (now distinct letters). If we consider any such permutation, say $E_1 \ S \ E_2$, then there is always another distinct permutation obtained by exchanging the positions of the indexed $E's$ that will give me the same word if I ignore the subscripts $1$ and $2$. So counting every permutation of the three letters has counted every word twice, and so we have to divide by two to get the right answer.
EDIT (in response to the comment):
Suppose you have $10$ numbers as follows: $1,1,3,3,6,6,9,9,15,15$ (think of every number as an arrangement without subscripts).
Now you decide to remove the repetition corresponding to $1,3,6,9,15$. So, as you mention, we resort to subtraction. So out of the 2 copies of every number, we remove one and keep the other. That is the frequency of every number is halved. And so the overall division by $2$.
Have a look at the following for better intuition of why division works: Division using repeated subtraction