It would be easier to use characteristic functions.
\begin{equation}
CF_{\text{Multinomial}(n,(p_1,...,p_k))}(t_1,...,t_k) = \bigg(\sum_{j=1}^k p_je^{it_j}\bigg)^n
\end{equation}
As the CF of a sum of independent random variables is the product of their CFs, it is easy to spot that for independent $Y_i \sim \text{Multinomial}(n_i,(p_1,...,p_k))$, the sum $X = Y_1 + Y_2$ satisfies
\begin{equation}
X \sim \text{Multinomial}(n_1+n_2,(p_1,...,p_k))
\end{equation}
since equality of CFs implies equality of distributions and
\begin{equation}
CF_X = CF_{Y_1+Y_2} = CF_{Y_1}CF_{Y_2} = \bigg(\sum_{j=1}^k p_je^{it_j}\bigg)^{n_1}\bigg(\sum_{j=1}^k p_je^{it_j}\bigg)^{n_2} = \bigg(\sum_{j=1}^k p_je^{it_j}\bigg)^{n_1 + n_2}= CF_{\text{Multinomial}(n_1 + n_2,(p_1,...,p_k))}(t_1,...,t_k).
\end{equation}
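As a quick numerical sanity check (a sketch only; the parameter values and array names are my own choices), one can compare the empirical CF of a sum of two independent multinomial samples with that of a single $\text{Multinomial}(n_1+n_2,(p_1,...,p_k))$ sample and with the closed form above:

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2 = 5, 7
p = np.array([0.2, 0.3, 0.5])
N = 100_000

# Sum of two independent multinomials vs. one Multinomial(n1 + n2, p).
s = rng.multinomial(n1, p, size=N) + rng.multinomial(n2, p, size=N)
z = rng.multinomial(n1 + n2, p, size=N)

# Empirical CF E[exp(i t . X)] at a few points t, against the closed form
# (sum_j p_j e^{i t_j})^(n1 + n2).
for t in (np.array([0.3, -0.7, 1.1]), np.array([1.0, 0.5, -0.2])):
    print(np.exp(1j * s @ t).mean(),
          np.exp(1j * z @ t).mean(),
          (p @ np.exp(1j * t)) ** (n1 + n2))
```

All three values agree up to Monte Carlo error.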
Your variable $D_{ij}$ is the distance from $x_i$ to $y_j$ in, say, the clockwise direction if you bend the unit interval into a circle. So you want the distribution of the minimum $Y$ of the clockwise distances from $m$ points $x_i$ to $n$ points $y_j$.
This can be determined by distinguishing the cases in which the $x_i$ and the $y_j$ each form $k$ contiguous blocks, with $1\le k\le\min(m,n)$. In such an arrangement, for $Y$ to be $\ge d$ there must be a clockwise gap of length at least $d$ from each $x$-block to the next $y$-block, whereas there’s no constraint on the clockwise gap from a $y$-block to an $x$-block.
Pick one of the points at which to cut the circle into an interval. Then successively remove each of the $k$ gaps of length $d$ by subtracting $d$ from all points that come after it. This moves all the points into an interval of length $1-kd$, and this transformation is bijective: Each configuration with all points in this interval is transformed into a configuration with $k$ gaps of length $d$ by adding such gaps clockwise of the $x$-blocks. The volume of the set of configurations in which the remaining $m+n-1$ points lie within $1-kd$ clockwise of the chosen point is $(1-kd)^{m+n-1}$, so this is also the probability for there to be a gap of length $d$ clockwise of each $x$-block.
To count the cyclic permutations with $k$ contiguous blocks of each type, first arrange the $m$ points $x_i$ cyclically in one of $(m-1)!$ ways, then choose $k$ of the $m$ bins between them in one of $\binom mk$ ways, then distribute $n$ balls into these $k$ non-empty bins in one of $\binom{n-1}{k-1}$ ways, and then assign the $n$ points $y_j$ to these $n$ balls in one of $n!$ ways. In total there are $(m+n-1)!$ cyclic permutations, so the probability to form $k$ groups is
$$
\frac{(m-1)!\binom mk\binom{n-1}{k-1}n!}{(m+n-1)!}=\frac{\binom mk\binom{n-1}{k-1}}{\binom{m+n-1}n}\;.
$$
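Here is a small simulation sketch (the helper name `block_probs` is mine) that checks this probability by shuffling $m$ x's and $n$ y's and counting cyclic $x$-blocks; cutting a uniform cyclic arrangement at a uniform point gives a uniform linear arrangement, so a plain shuffle suffices:

```python
from math import comb
import random

def block_probs(m, n, trials=100_000, seed=1):
    """Empirical distribution of the number k of cyclic x-blocks."""
    random.seed(seed)
    counts = {}
    seq = ['x'] * m + ['y'] * n
    for _ in range(trials):
        random.shuffle(seq)
        # An x-block starts wherever an x follows a y (cyclically).
        k = sum(1 for i in range(m + n)
                if seq[i] == 'x' and seq[i - 1] == 'y')
        counts[k] = counts.get(k, 0) + 1
    return {k: c / trials for k, c in sorted(counts.items())}

m, n = 3, 4
print(block_probs(m, n))
print({k: comb(m, k) * comb(n - 1, k - 1) / comb(m + n - 1, n)
       for k in range(1, min(m, n) + 1)})
```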
A more symmetric way to derive the same result is to distribute $m$ balls into $k$ non-empty groups in $\binom{m-1}{k-1}$ ways and $n$ balls into $k$ non-empty groups in $\binom{n-1}{k-1}$ ways, dividing by $k$ for cyclicity; the total number of cyclic arrangements of $m$ and $n$ balls of two types is $\binom{m+n}m$ divided by $m+n$ for cyclicity. This also yields
$$
\frac{\frac1k\binom{m-1}{k-1}\binom{n-1}{k-1}}{\frac1{m+n}\binom{m+n}m}=\frac{\binom mk\binom{n-1}{k-1}}{\binom{m+n-1}n}\;.
$$
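For what it’s worth, the algebraic equality of the two expressions can also be spot-checked numerically (a throwaway sketch):

```python
from math import comb

# Check that both closed forms agree over a grid of (m, n, k).
for m in range(1, 8):
    for n in range(1, 8):
        for k in range(1, min(m, n) + 1):
            lhs = (comb(m - 1, k - 1) * comb(n - 1, k - 1) / k) / \
                  (comb(m + n, m) / (m + n))
            rhs = comb(m, k) * comb(n - 1, k - 1) / comb(m + n - 1, n)
            assert abs(lhs - rhs) < 1e-12, (m, n, k)
print("both expressions agree")
```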
Putting everything together, we have
$$
\mathsf P(Y\ge d)={\binom{m+n-1}n}^{-1}\sum_{k=1}^{\min(m,n)}\binom mk\binom{n-1}{k-1}\max(0,1-kd)^{m+n-1}\;.
$$
For example, for $m=3$ and $n=4$ this is
$$
\mathsf P(Y\ge d)=\frac1{15}\left(3\max(0,1-d)^6+9\max(0,1-2d)^6+3\max(0,1-3d)^6\right)\;.
$$
A plot comparing this cumulative distribution function with the corresponding empirical distribution function from a simulation with $1000$ trials shows good agreement.
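A minimal sketch of such a comparison (the function names `cdf_complement` and `simulate_Y` are mine, and the simulation places all points i.i.d. uniformly on a circle of circumference $1$, as in the derivation above):

```python
import numpy as np
from math import comb

def cdf_complement(d, m, n):
    """P(Y >= d) from the closed-form expression above."""
    return sum(comb(m, k) * comb(n - 1, k - 1)
               * max(0.0, 1 - k * d) ** (m + n - 1)
               for k in range(1, min(m, n) + 1)) / comb(m + n - 1, n)

def simulate_Y(m, n, trials=1000, seed=2):
    """Y = min over (i, j) of the clockwise distance from x_i to y_j."""
    rng = np.random.default_rng(seed)
    x = rng.random((trials, m, 1))
    y = rng.random((trials, 1, n))
    dist = (y - x) % 1.0  # clockwise distance from x_i to y_j
    return dist.reshape(trials, -1).min(axis=1)

m, n = 3, 4
ys = simulate_Y(m, n)
for d in (0.02, 0.05, 0.1, 0.2):
    print(d, cdf_complement(d, m, n), (ys >= d).mean())
```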
It turns out that this yields a rather simple expression for the expected value of $Y$:
\begin{eqnarray*}
\mathsf E[Y]
&=&
\int_0^1\mathsf P(Y\ge d)\,\mathrm dd
\\
&=&
\int_0^1{\binom{m+n-1}n}^{-1}\sum_{k=1}^{\min(m,n)}\binom mk\binom{n-1}{k-1}\max(0,1-kd)^{m+n-1}\,\mathrm dd
\\
&=&
{\binom{m+n-1}n}^{-1}\sum_{k=1}^{\min(m,n)}\binom mk\binom{n-1}{k-1}\int_0^\frac1k(1-kd)^{m+n-1}\,\mathrm dd
\\
&=&
{\binom{m+n-1}n}^{-1}\sum_{k=1}^{\min(m,n)}\binom mk\binom{n-1}{k-1}\frac1{k(m+n)}
\\
&=&
\frac1{mn}{\binom{m+n}n}^{-1}\sum_{k=1}^{\min(m,n)}\binom mk\binom nk\;.
\end{eqnarray*}
By Vandermonde’s identity we have
$$\sum_{k=0}^{\min(m,n)}\binom mk\binom nk=\binom{m+n}n\;,$$
so adding and subtracting the $k=0$ term yields
$$\mathsf E[Y]=\frac1{mn}\left(1-{\binom{m+n}n}^{-1}\right)\;.$$
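A quick Monte Carlo check of this closed form (a sketch; `expected_Y` is my name, and the sample sizes are arbitrary):

```python
from math import comb
import numpy as np

def expected_Y(m, n):
    # E[Y] = (1 - 1/C(m+n, n)) / (m n), as derived above.
    return (1 - 1 / comb(m + n, n)) / (m * n)

rng = np.random.default_rng(3)
for m, n in [(3, 4), (5, 2), (1, 6)]:
    x = rng.random((200_000, m, 1))
    y = rng.random((200_000, 1, n))
    ymin = ((y - x) % 1.0).reshape(200_000, -1).min(axis=1)
    print(m, n, expected_Y(m, n), ymin.mean())
```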
For $n=1$ (and analogously for $m=1$) this reduces to the standard result
\begin{eqnarray*}
\mathsf E[Y]
&=&
\frac1m\left(1-{\binom{m+1}1}^{-1}\right)
\\
&=&
\frac1{m+1}\;.
\end{eqnarray*}
Covariance and correlation are very tricky. They often suggest things that sound true, and are indeed often true, but are not universally true. For example, in your context, here is a counter-example where $E[X] > E[Y]$.
(By the way, you conjectured that $E[Y] > E[X]$, but that has no hope to begin with: if all possible values of $Y_1$ are less than all possible values of $Y_2$, then, since $X_i$ has the same marginal as $Y_i$, we get $X = X_1$ and $Y = Y_1$, and clearly $E[Y] = E[X]$. So the most you can hope for is $E[Y] \ge E[X]$. But as the following counter-example shows, even that can be violated.)
$Y_1, Y_2$ are i.i.d. and take values $\{0, 1, 2\}$ with equal prob $1/3$ each.
$E[Y_i] = 1$.
$E[Y] = E[\min(Y_1,Y_2)] = \frac19 ( 1 + 1 + 1 + 2) = \frac59$, summing $\min(y_1,y_2)$ over the nine equally likely pairs (only $(1,1),(1,2),(2,1),(2,2)$ contribute).
$(X_1, X_2)$ are jointly distributed as follows, for some $0 < p < \frac12 < q < 1$ with $p+q=1$:
$(0,0)$ with prob $p/3$
$(0,2)$ with prob $q/3$
$(1,1)$ with prob $1/3$
$(2,0)$ with prob $q/3$
$(2,2)$ with prob $p/3$
It is easy to verify that these probabilities sum to $1$ and that $X_i$ has the same (marginal) distribution as $Y_i$.
$Cov(X_1,X_2) = E[X_1 X_2] - E[X_1]E[X_2] = \frac13 (1 + 4p) - 1 < 0$ since $p < \frac12$.
So the precondition (negative correlation) is satisfied. It remains to compare the expectations:
For any $p \in (\frac13, \frac12)$, $E[X] = E[\min(X_1,X_2)] = \frac13 (1+2p) > \frac13 \left(1 + \frac23\right) = \frac59 = E[Y]$. QED
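The counter-example is easy to verify with exact rational arithmetic; here is a sketch with the concrete choice $p = \frac25$ (any $p \in (\frac13, \frac12)$ works):

```python
from fractions import Fraction as F

p = F(2, 5)   # any p in (1/3, 1/2); then q = 1 - p > 1/2
q = 1 - p

# Joint distribution of (X1, X2) as listed above.
joint = {(0, 0): p / 3, (0, 2): q / 3, (1, 1): F(1, 3),
         (2, 0): q / 3, (2, 2): p / 3}
assert sum(joint.values()) == 1

# Marginals are uniform on {0, 1, 2}, matching Y_i.
marg = {v: sum(pr for (a, b), pr in joint.items() if a == v) for v in (0, 1, 2)}
print(marg)

E = lambda f: sum(pr * f(a, b) for (a, b), pr in joint.items())
print(E(lambda a, b: a * b) - 1)        # Cov(X1, X2) = (4p - 2)/3 < 0
print(E(lambda a, b: min(a, b)))        # E[X] = (1 + 2p)/3 = 3/5 > 5/9 = E[Y]
```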
Further thoughts: Since $E[X_i] = E[Y_i]$ and $Y_1,Y_2$ are independent, the requirement that $Cov(X_1, X_2) < 0$ is equivalent to:
$$E[X_1 X_2] < E[X_1]E[X_2] = E[Y_1]E[Y_2] = E[Y_1 Y_2]$$
So you are basically conjecturing that
$$E[X_1 X_2] < E[Y_1 Y_2]\implies E[\min(X_1,X_2)] \le E[\min(Y_1, Y_2)]$$
But viewed this way, it hardly seems like a reasonable conjecture at all. The product of two variables does not have much to do with their minimum. In a sense, there should be a lot of "freedom" to choose pairs of variables (even constrained as you described) so that one pair has the higher $E[\text{product}]$ while the other pair has the higher $E[\text{minimum}]$. This ultimately points to the fact that covariance, while suggestive, leaves a lot of freedom in the joint distribution of the two variables.