The following solves my problem, though not exactly the question stated above.
I managed to sort of sidestep the intractable sum above by changing the rules of the generalization a little bit.
Fix the assignment of the balls to the urns. However, instead of pulling out balls from overmatched urns with probability proportional to their weight within the urn, we ONLY select from the LARGEST type of ball. So if there are balls of size 1, 3 and 5 in an urn, the size-1 and size-3 balls are never chosen and a size-5 ball is always chosen. WITHIN the largest type, we select at random--if there are 10 size-5 balls, each is chosen with probability 1/10, regardless of the frequencies of the size-1 and size-3 balls.
Note that this generalization also simplifies to the simple case when all balls are of equal size.
However, we are able to get a simple expression under this new rule, basically because we set a bunch of the terms above to 0.
Specifically, we find that:
$\begin{eqnarray*}
\hat{P}_{i}(\textbf{n},\textbf{U};M,N) & = & \sum\limits _{k_{1}=0}^{n_{1}}\cdots\sum\limits _{k_{i-1}=0}^{n_{i-1}}\sum_{k_{i}=1}^{n_{i}}\prod\limits _{j=1}^{i-1}\dbinom{n_{j}}{k_{j}}\dbinom{n_{i}-1}{k_{i}-1}\frac{M(M-1)^{\sum\limits _{j=1}^{i}(n_{j}-k_{j})+\sum\limits_{j>i}n_{j}}}{M^{\sum\limits _{j=1}^{N}n_{j}}}\frac{1}{k_{i}}\\
& = & \frac{M\left(M-1\right)^{\sum\limits_{j>i}n_{j}}}{M^{\sum\limits_{j=1}^{N}n_{j}}}\sum_{k_{1}=0}^{n_{1}}\dbinom{n_{1}}{k_{1}}\left(M-1\right)^{n_{1}-k_{1}}\cdots\sum_{k_{i}=1}^{n_{i}}\dbinom{n_{i}-1}{k_{i}-1}\frac{1}{k_{i}}\left(M-1\right)^{n_{i}-k_{i}}\\
& = & \frac{M\left(M-1\right)^{\sum\limits _{j>i}n_{j}}}{M^{\sum\limits _{j=1}^{N}n_{j}}}\sum_{k_{1}=0}^{n_{1}}\dbinom{n_{1}}{k_{1}}\left(M-1\right)^{n_{1}-k_{1}}\cdots\sum_{k_{i-1}=0}^{n_{i-1}}\dbinom{n_{i-1}}{k_{i-1}}\frac{1}{n_{i}}\left(M-1\right)^{n_{i-1}-k_{i-1}}\left(M^{n_{i}}-\left(M-1\right)^{n_{i}}\right)\\
& = & \frac{M\left(M-1\right)^{\sum\limits_{j>i}n_{j}}M^{\sum_{j=1}^{i-1}n_{j}}\left(M^{n_{i}}-\left(M-1\right)^{n_{i}}\right)}{n_{i}M^{\sum\limits_{j=1}^{N}n_{j}}}\\
& = & \boxed{\left(1-\frac{1}{M}\right)^{\sum\limits _{j>i}n_{j}}\left(1-\left(1-\frac{1}{M}\right)^{n_{i}}\right)\frac{M}{n_{i}}}
\end{eqnarray*}$
(The second line is a simple rearrangement of terms. The third relies
on the observations that $\dbinom{n_{i}-1}{k_{i}-1}\frac{1}{k_{i}}=\dbinom{n_{i}}{k_{i}}\frac{1}{n_{i}}$
and that the Binomial Theorem implies $\sum\limits_{k_{i}=1}^{n_{i}}\dbinom{n_{i}}{k_{i}}\left(M-1\right)^{n_{i}-k_{i}}=M^{n_{i}}-\left(M-1\right)^{n_{i}}$--which,
by the way, is the number of ways to end up with at least one size-$U_{i}$
ball in a given urn.
The fourth line again uses the Binomial Theorem repeatedly (and more
directly), and the fifth line is rearrangement/simplification.)
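Both observations are easy to spot-check numerically. A quick sketch (the ranges of $n$ and $M$ are arbitrary choices of mine):

```python
from math import comb

# Two facts used in the derivation, checked over small ranges:
#   1. binom(n-1, k-1) / k == binom(n, k) / n  (cross-multiplied below)
#   2. sum_{k=1}^{n} binom(n, k) (M-1)^{n-k} == M^n - (M-1)^n
for n in range(1, 8):
    for k in range(1, n + 1):
        assert comb(n - 1, k - 1) * n == comb(n, k) * k
    for M in range(2, 6):
        lhs = sum(comb(n, k) * (M - 1) ** (n - k) for k in range(1, n + 1))
        assert lhs == M ** n - (M - 1) ** n
print("identities verified")
```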
Note that this can be seen as
$\hat{P}_{i}(\textbf{n},\textbf{U};M,N)=M\cdot ABC$
where $A=\left(1-\frac{1}{M}\right)^{\sum\limits _{j>i}n_{j}}$ is
the probability that all larger balls end up in other urns, $B=\left(1-\left(1-\frac{1}{M}\right)^{n_{i}}\right)$
is the probability of getting at least one size-$U_{i}$ ball, and
$C=\frac{1}{n_{i}}$ is the probability of a given ball being selected
conditional on at least one size-$U_{i}$ ball being in the urn (which
can be seen from the binomial expansion used above).
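The boxed formula can be sanity-checked against brute-force enumeration for tiny parameters. Below is a sketch of such a check (the function names and test values are my own); it averages over all $M^{\sum_j n_j}$ ball-to-urn assignments exactly, using rational arithmetic:

```python
from fractions import Fraction
from itertools import product

def selection_prob_exact(M, n, i):
    # Probability that a fixed marked ball of type i (types 1..N, ordered by
    # size) is selected, enumerating every assignment of balls to M urns.
    # Rule: only balls of the largest type present in an urn can be picked,
    # uniformly at random among them.
    types = [t for t in range(1, len(n) + 1) for _ in range(n[t - 1])]
    marked = types.index(i)  # mark one particular ball of type i
    total = Fraction(0)
    for urns in product(range(M), repeat=len(types)):
        u = urns[marked]
        in_u = [types[b] for b, urn in enumerate(urns) if urn == u]
        if max(in_u) > i:
            continue  # a larger ball in the urn blocks all type-i balls
        total += Fraction(1, in_u.count(i))  # uniform among the type-i balls
    return total / M ** len(types)

def selection_prob_formula(M, n, i):
    # The boxed closed form: (1-1/M)^{sum_{j>i} n_j} (1-(1-1/M)^{n_i}) M/n_i
    q = 1 - Fraction(1, M)
    return q ** sum(n[i:]) * (1 - q ** n[i - 1]) * Fraction(M, n[i - 1])

M, n = 3, (2, 2, 2)
for i in (1, 2, 3):
    assert selection_prob_exact(M, n, i) == selection_prob_formula(M, n, i)
print("exhaustive enumeration matches the closed form")
```

For these tiny parameters the exhaustive average and the closed form agree exactly.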
If we fix $\frac{n_{j}}{M}=\frac{1}{\theta_{j}}$ for each $j$ and send $M\rightarrow\infty$,
we get a somewhat familiar expression (note $\frac{M}{n_{i}}=\theta_{i}$):
$p=\theta_{i}e^{-\sum\limits_{j>i}\frac{1}{\theta_{j}}}\left(1-e^{-\frac{1}{\theta_{i}}}\right)$
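Since $\frac{M}{n_{i}}=\theta_{i}$ and $\left(1-\frac{1}{M}\right)^{n_{j}}\rightarrow e^{-1/\theta_{j}}$, the closed form converges to $\theta_{i}\,e^{-\sum_{j>i}1/\theta_{j}}\left(1-e^{-1/\theta_{i}}\right)$. A quick numerical check of that convergence (the $\theta$ values are arbitrary choices of mine):

```python
import math

def closed_form(M, n, i):
    # Boxed expression: (1-1/M)^{sum_{j>i} n_j} * (1-(1-1/M)^{n_i}) * M/n_i
    q = 1 - 1 / M
    return q ** sum(n[i:]) * (1 - q ** n[i - 1]) * M / n[i - 1]

theta = [2.0, 3.0, 5.0]  # example scales, my choice
i = 2
limit = theta[i - 1] * math.exp(-sum(1 / t for t in theta[i:])) \
        * (1 - math.exp(-1 / theta[i - 1]))

M = 10 ** 6
n = [round(M / t) for t in theta]  # hold n_j / M = 1/theta_j fixed
gap = abs(closed_form(M, n, i) - limit)
print(gap)  # small, and shrinking as M grows
```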
I'm still not quite clear how to extend this to a continuum of types ($N \rightarrow \infty$).
Also, if this exposition helps anyone gain some traction on the original expression, let me know.
First, we let $P(j,k):=1+f(\lfloor\tfrac{j}{2^k}\rfloor+1)$ and sum over $n$ of fixed weight $\ell:=\mathrm{wt}(n)$ (like in this answer):
$$\begin{split}
s(n) &= \sum_{\ell=0}^n \sum_{t_0 + \dots + t_{\ell-1} \leq n-\ell}
\sum_{j=0}^{2^\ell-1} (-1)^{\ell-\mathrm{wt}(j)} \prod_{k=0}^{\ell-1} P(j,k)^{t_k+1} \\
&= [x^{n+1}]\ \sum_{\ell=0}^n \sum_{j=0}^{2^\ell-1} (-1)^{\mathrm{wt}(j)+1} \prod_{k=0}^{\ell} \frac{-P(j,k)x}{1-xP(j,k)}.
\end{split}$$
Now, we notice that $P(j,k)$ depends on runs of unit bits in $j$. Namely, each run of length $u-1$ contributes the term
$$\prod_{v=1}^{u} \frac{-vx}{1-vx} = x^{u} (-1)^{u} u! \prod_{v=1}^{u} \frac1{1-vx}.$$
Hence, introducing the number $z$ of zero bits in $j$ padded with an extra zero at the beginning (and so $\mathrm{wt}(j)=\ell+1-z$), we have
$$\begin{split}
s(n) &= [x^{n+1}]\ \sum_{\ell=0}^n [y^{\ell+1}]\ \sum_{z\geq 0} (-1)^{\ell+2-z} \left(\sum_{u\geq 1} y^u (-1)^u x^u u! \prod_{v=1}^u \frac1{1-vx}\right)^z\\
&= -[x^{n+1}]\ \sum_{\ell=0}^n (-1)^{\ell+1} [y^{\ell+1}]\ \left(1+\sum_{u\geq 1} y^{u} (-1)^u x^u u! \prod_{v=1}^u \frac1{1-vx}\right)^{-1}\\
&= -[x^{n+1}]\ \left(1+\sum_{u\geq 1} x^u u! \prod_{v=1}^u \frac1{1-vx}\right)^{-1}\\
&= -[x^{n+1}]\ \left(1+\sum_{u\geq 1} x^u u! \sum_{m\geq u} S(m,u) x^{m-u}\right)^{-1}\\
&= -[x^{n+1}]\ \left(1+\sum_{m\geq 1} x^m B_m^o\right)^{-1},
\end{split}$$
which can be recognized as the INVERTi transform of the ordered Bell numbers $B_m^o$.
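This final expression is easy to evaluate in code: compute the ordered Bell numbers by the standard recurrence $B^o_m=\sum_{k=1}^m\binom{m}{k}B^o_{m-k}$, invert the power series $1+\sum_{m\geq 1}B^o_m x^m$ term by term, and read off $s(n)=-[x^{n+1}]$ of the reciprocal. A sketch (function names are mine; note this only evaluates the final series expression, since $f$ and hence the original definition of $s(n)$ are external to this excerpt):

```python
from math import comb

def ordered_bell(n_max):
    # Ordered Bell (Fubini) numbers via B_m = sum_{k=1}^{m} C(m,k) B_{m-k},
    # with B_0 = 1.
    B = [1]
    for m in range(1, n_max + 1):
        B.append(sum(comb(m, k) * B[m - k] for k in range(1, m + 1)))
    return B

def s_via_invert(n_max):
    # s(n) = -[x^{n+1}] (1 + sum_{m>=1} B_m x^m)^{-1}: compute the reciprocal
    # series g = 1/f coefficient by coefficient, then negate and shift.
    f = ordered_bell(n_max + 1)  # f[m] = B_m, and B_0 = 1
    g = [1]
    for m in range(1, n_max + 2):
        g.append(-sum(f[k] * g[m - k] for k in range(1, m + 1)))
    return [-g[m + 1] for m in range(n_max + 1)]

print(ordered_bell(5))   # [1, 1, 3, 13, 75, 541]
print(s_via_invert(4))   # [1, 2, 8, 48, 368]
```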
Best Answer
This is the answer to the first question; I wrote a long answer to Question 2 as a separate answer.
Note that $A:=\sum_{k_i>0,\,k_1+\dots+k_n=K}\frac{K!}{n!k_1!\dots k_n!} \prod k_i^{k_i-1}$ is the number of forests on the ground set $\{1,2,\dots,K\}$ having exactly $n$ connected components and with a marked vertex in each component (the $k_i$ correspond to the sizes of the components). Add a vertex $0$ and join it to the marked vertices. Then we have to count the number of trees on $\{0,1,\dots,K\}$ in which $0$ has degree $n$. Recall that the sum of $z_0^{d_0-1}\dots z_K^{d_K-1}$ over all trees on $\{0,\dots,K\}$, where $d_i$ is the degree of $i$, equals $(z_0+\dots+z_K)^{K-1}$. Substitute $z_{1}=\dots=z_K=1$ and take the coefficient of $z_0^{n-1}$: it is $\binom{K-1}{n-1}K^{K-n}$.
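For small $K$ and $n$ the identity can be confirmed by direct summation over compositions (a brute-force sketch; the function name and parameter ranges are mine):

```python
from math import comb, factorial
from itertools import product

def forest_sum(K, n):
    # A = sum over k_1 + ... + k_n = K (each k_i >= 1) of
    #     K!/(n! k_1! ... k_n!) * prod_i k_i^(k_i - 1)
    total = 0
    for ks in product(range(1, K - n + 2), repeat=n):
        if sum(ks) != K:
            continue
        multinom = factorial(K)
        for k in ks:
            multinom //= factorial(k)
        weight = 1
        for k in ks:
            weight *= k ** (k - 1)
        total += multinom * weight
    return total // factorial(n)

# Compare with the tree count binom(K-1, n-1) * K^(K-n).
for K in range(2, 8):
    for n in range(1, K + 1):
        assert forest_sum(K, n) == comb(K - 1, n - 1) * K ** (K - n)
print("forest counts match the tree formula")
```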