The numbers are counting different things. When you count the number of ways in which $n$ things may be distributed amongst $k$ ‘containers’, the answer depends on whether or not the things are distinguishable from one another, and it depends on whether or not the ‘containers’ are distinguishable from one another. It also depends on whether you require each ‘container’ to receive at least one of the objects.
The number of ways to distribute $10$ distinguishable teachers amongst $5$ distinguishable schools is $5^{10}$, not $10^5$. Teachers and schools are not usually considered to be indistinguishable, so this is the most reasonable answer. In the unlikely event that the teachers (but not the schools) are indistinguishable, there are $\binom{14}{4} = 1001$ ways to distribute them, not $\binom94 = 126$; $\binom94$ is the number of ways to distribute $10$ indistinguishable teachers amongst $5$ schools if each school is required to get at least one teacher. Thus, even assuming that $10^5$ is a careless error for $5^{10}$, the two answers offered in the question correspond to markedly different questions.
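As a quick check of the three counts above, here is a minimal Python sketch. The function names `labelled_teachers` and `identical_teachers` are made up for illustration; the formulas are the ordinary power rule and stars and bars.

```python
from math import comb

def labelled_teachers(n: int, k: int) -> int:
    """Each of n distinguishable teachers independently picks one of k schools."""
    return k ** n

def identical_teachers(n: int, k: int, nonempty: bool = False) -> int:
    """Stars and bars: n identical teachers into k distinguishable schools."""
    if nonempty:
        # Give every school one teacher first, then distribute the rest freely.
        return comb(n - 1, k - 1)
    return comb(n + k - 1, k - 1)

print(labelled_teachers(10, 5))         # 9765625
print(identical_teachers(10, 5))        # 1001
print(identical_teachers(10, 5, True))  # 126
```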
If we require every school to get at least one teacher in the version in which the teachers are distinguishable, the calculation is a bit more complicated. A standard inclusion-exclusion argument yields $$\begin{align*}
\sum\limits_{k=0}^5 (-1)^k \binom{5}{k}(5-k)^{10} &=
5^{10} - \binom51 4^{10} + \binom52 3^{10} - \binom53 2^{10} + \binom54 1^{10}\\
&= 5^{10} - 5\cdot 4^{10} + 10\cdot 3^{10} - 10\cdot 2^{10} + 5\\
&=5,103,000,
\end{align*}$$ significantly fewer than $5^{10} = 9,765,625$.
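The inclusion-exclusion sum is easy to evaluate mechanically. The sketch below uses a hypothetical helper name, `surjections`, and counts assignments of $n$ labelled teachers to $k$ labelled schools with no school left empty.

```python
from math import comb

def surjections(n: int, k: int) -> int:
    """Inclusion-exclusion over the set of schools forced to stay empty."""
    return sum((-1) ** j * comb(k, j) * (k - j) ** n for j in range(k + 1))

print(surjections(10, 5))  # 5103000
```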
All of these calculations assume that the schools are distinguishable. If the teachers are distinguishable and the schools are not, and each school is required to get at least one teacher, the answer is $\left\{\begin{matrix}10\\5\end{matrix} \right\} = 42,525$, a Stirling number of the second kind. If one or more of the schools may receive no teacher, the answer is $$\sum\limits_{k=1}^5 \left\{\begin{matrix}10\\k\end{matrix}\right\} = 1+511+9330+34,105+42,525 = 86,472.$$
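The Stirling numbers used here can be reproduced with the standard recurrence $\left\{\begin{matrix}n\\k\end{matrix}\right\} = k\left\{\begin{matrix}n-1\\k\end{matrix}\right\} + \left\{\begin{matrix}n-1\\k-1\end{matrix}\right\}$. A minimal sketch, with `stirling2` as an illustrative name:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    """Number of ways to split n labelled items into k non-empty unlabelled blocks."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    # The last item either joins one of k existing blocks or forms a new block.
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

values = [stirling2(10, k) for k in range(1, 6)]
print(values)       # [1, 511, 9330, 34105, 42525]
print(sum(values))  # 86472
```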
Finally, if neither teachers nor schools are distinguishable, we’re simply counting the number of partitions of $10$ into exactly $5$ or at most $5$ parts, depending on whether or not each school is to receive at least one teacher. These numbers are small enough to be calculated by direct enumeration. There is one partition of $10$ into one part. There are $5$ partitions of $10$ into $2$ parts: $$\begin{matrix}1+9; & 2+8; & 3+7; & 4+6; &5+5\end{matrix}$$ There are $8$ partitions of $10$ into $3$ parts: $$\begin{matrix}1+1+8; & 1+2+7; & 1+3+6; & 1+4+5\\
2+2+6; & 2+3+5; & 2+4+4; & 3+3+4\end{matrix}$$ There are $9$ partitions of $10$ into $4$ parts: $$\begin{matrix}1+1+1+7; & 1+1+2+6; & 1+1+3+5\\
1+1+4+4; & 1+2+2+5; & 1+2+3+4\\
1+3+3+3; & 2+2+2+4; & 2+2+3+3\end{matrix}$$ And there are $7$ partitions of $10$ into $5$ parts: $$\begin{matrix}1+1+1+1+6; & 1+1+1+2+5; & 1+1+1+3+4\\
1+1+2+2+4; & 1+1+2+3+3; & 1+2+2+2+3\\
2+2+2+2+2\end{matrix}$$
Thus, if neither schools nor teachers are distinguishable, there are $30$ or $7$ ways to distribute the teachers, depending on whether empty schools are allowed or every school must get at least one.
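Hand enumeration of partitions is easy to get wrong, so a short check is worthwhile. This sketch uses the standard recurrence $p(n,k) = p(n-1,k-1) + p(n-k,k)$ for partitions of $n$ into exactly $k$ parts; the name `partitions_exact` is just for illustration.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def partitions_exact(n: int, k: int) -> int:
    """Number of partitions of n into exactly k positive parts."""
    if n == 0 and k == 0:
        return 1
    if n <= 0 or k <= 0:
        return 0
    # Either some part equals 1 (remove it), or all parts are >= 2 (subtract 1 from each).
    return partitions_exact(n - 1, k - 1) + partitions_exact(n - k, k)

counts = [partitions_exact(10, k) for k in range(1, 6)]
print(counts)       # [1, 5, 8, 9, 7]
print(sum(counts))  # 30
```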
To start from your example, note that the unconditional part is easy: if $N_2$ and $N_4$ are independent, $N_2 - N_4$ is normal with mean $\mu_2 - \mu_4$ and variance $\sigma_2^2 + \sigma_4^2$. For the first conditional probability, you have $P(N_1 - N_2 > 0 \mid N_2 - N_4 > 0)$. Again, both differences are normal, and you can compute their covariance; with that, you should be able to compute the probability. As for the first term in your sum, note that the event is independent of the second conditioning event, so you have a case like the first. I believe this carries over to larger $m$, and you can throw out most of the conditioning terms. Either way, you still just have normals whose covariance you can compute, and since conditioning on dependent normals just works out to linear projections, you should be good. I think that answers it.
EDIT:
For a slightly clearer explanation, $N_1 - N_2$ and $N_2 - N_4$ can be thought of as
$$
\begin{bmatrix}N_1 - N_2\\ N_2 - N_4
\end{bmatrix}
= \begin{bmatrix}1 & -1 & 0 & 0 \\ 0 & 1 & 0 & -1
\end{bmatrix}\begin{bmatrix}N_1 \\ N_2 \\ N_3 \\ N_4
\end{bmatrix}
$$
Note that we can do this for any number of differences. So computing the covariance matrix is no problem. Then, the conditional distribution of the first term given the rest is given here:
https://en.wikipedia.org/wiki/Multivariate_normal_distribution#Conditional_distributions
Ah, my good old friend the Schur Complement. I forget the proof though off the top of my head...
EDIT2:
Ah, I think I may have been a little sloppy. That formula conditions on the value of a random variable, not on an event such as $Y > 0$. But I think you can still use the same principle, since $P(X>0 \mid Y>0) = \frac{P(X>0, Y>0)}{P(Y>0)}$, which you should be able to get from the joint distribution.
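To make the covariance step concrete, here is a sketch that assumes the $N_i$ are independent with variances $\sigma_i^2$ (the variance $\sigma_2^2 + \sigma_4^2$ quoted at the start only holds in that case). Write $X = N_1 - N_2$, $Y = N_2 - N_4$, and let $A$ be the $2 \times 4$ matrix above. Then
$$
\operatorname{Cov}\begin{bmatrix}X\\ Y\end{bmatrix}
= A\,\operatorname{diag}(\sigma_1^2,\sigma_2^2,\sigma_3^2,\sigma_4^2)\,A^{\mathsf T}
= \begin{bmatrix}\sigma_1^2+\sigma_2^2 & -\sigma_2^2\\ -\sigma_2^2 & \sigma_2^2+\sigma_4^2
\end{bmatrix}
$$
so $X$ and $Y$ are negatively correlated through the shared $N_2$. The denominator is the one-dimensional probability $P(Y>0) = \Phi\left(\frac{\mu_2-\mu_4}{\sqrt{\sigma_2^2+\sigma_4^2}}\right)$, with $\Phi$ the standard normal CDF. The numerator is an orthant probability of the bivariate normal with this covariance, which in general has to be evaluated numerically.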
EDIT3:
Since I don't yet have enough reputation to comment on leonbloy's concern, I will post it here. In that example, you have gone from a two dimensional space to a three dimensional space, so the transformation is rank deficient and you get a degenerate covariance matrix in XYZ space.
Best Answer
This is just a classic stars and bars problem. With $n = 8$ teachers and $m = 4$ schools, the solution is
$$\binom{n+m-1}{n} = \binom{8+4-1}{8}$$
if a school is allowed to have no teacher. If every school must have at least one teacher, the solution is
$$\binom{n-1}{m-1} = \binom{7}{3}$$
For a better explanation of this method, you can read about it on the Wikipedia page. It has a really nice explanation.
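Both stars and bars counts can be sanity-checked by brute force for numbers this small. The sketch below assumes $8$ identical teachers and $4$ distinguishable schools, as the binomials suggest; `count_by_enumeration` is an illustrative name.

```python
from itertools import product
from math import comb

def count_by_enumeration(n: int, m: int, nonempty: bool = False) -> int:
    """Count tuples (c_1, ..., c_m) of per-school teacher counts that sum to n."""
    low = 1 if nonempty else 0
    return sum(
        1
        for counts in product(range(low, n + 1), repeat=m)
        if sum(counts) == n
    )

n, m = 8, 4
print(count_by_enumeration(n, m), comb(n + m - 1, n))        # 165 165
print(count_by_enumeration(n, m, True), comb(n - 1, m - 1))  # 35 35
```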