A simple general formula for your matrix $C_n=(C_{\lambda,\mu})_{\lambda,\mu\in\mathcal P_n}$, where $\mathcal P_n$ denotes the partitions of $n$ (ordered in decreasing lexicographic order), probably does not exist. However, a number of things can be said; notably, your guesses above can be confirmed. One thing that your examples suggest but which is false is that the matrix is "upper-left triangular"; this fails from $n=6$ on. The reason is that the true triangularity is given by $C_{\lambda,\mu}\neq0\implies \lambda^{tr}\leq\mu$ in the dominance order, where $\lambda^{tr}$ is the transpose (conjugate) partition of $\lambda$, and for $n\geq6$ lexicographic order does not assign complementary positions to $\lambda$ and $\lambda^{tr}$ (nor does any ordering for $n$ sufficiently large).
As you guessed, it is easier to study the inverse matrix $C_n^{-1}=M_n=(M_{\lambda,\mu})_{\lambda,\mu\in\mathcal P_n}$, whose entry $M_{\lambda,\mu}$ gives the coefficient of $m_\mu$ in $e_\lambda$. This nonnegative number equals the number of $0$-$1$ matrices with row sums $\lambda_1,\lambda_2,\ldots$ and column sums $\mu_1,\mu_2,\ldots$; in particular $M_n$ is a symmetric matrix (Stanley, EC2, Proposition 7.4.1 and Corollary 7.4.2). The argument for this combinatorial interpretation is basically the balls-into-boxes one you suggested: each term in $e_\lambda$ comes from choosing a monomial in each factor $e_{\lambda_i}$, and this choice can be recorded by putting in row $i$ a $1$ (ball) in the selected columns and $0$s elsewhere; the product monomial is then found by summing the columns (and by symmetry only monomials with weakly decreasing exponents, i.e., column sums, need to be considered). Since $M_n$ is symmetric and $C_n$ is its inverse, an easy argument shows that $C_n$ is also symmetric.
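For small $n$ this interpretation is easy to check by machine. Here is a brute-force Python sketch (the function name is mine, not standard) that counts $0$-$1$ matrices with prescribed row and column sums:

```python
from itertools import product

def zero_one_matrices(rows, cols):
    """Count 0-1 matrices with the given row sums and column sums.

    Brute force over all 0-1 matrices; fine for partitions of small n.
    """
    r, c = len(rows), len(cols)
    count = 0
    for bits in product((0, 1), repeat=r * c):
        mat = [bits[i * c:(i + 1) * c] for i in range(r)]
        if all(sum(mat[i]) == rows[i] for i in range(r)) and \
           all(sum(mat[i][j] for i in range(r)) == cols[j] for j in range(c)):
            count += 1
    return count

# M_{(2,1),(1,1,1)}: row sums 2,1 and column sums 1,1,1 -> 3 matrices,
# and by symmetry the same count with the roles of the partitions swapped.
```

One can confirm the symmetry numerically, e.g. `zero_one_matrices((2, 1), (1, 1, 1)) == zero_one_matrices((1, 1, 1), (2, 1))`.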
You suggested that the sum of all entries of $C_n$ is $0$. This fails for
$n=0,1$, but is true for all larger values, which can be seen as follows. By
the definition of $C_n$, if one takes its column sums (equivalently, if one
left-multiplies by $(1~1\cdots1)$), then one gets the
coefficients of the $e_\mu$ in $h_n=\sum_{\lambda\in\mathcal P_n}m_\lambda$, the
$n$-th complete homogeneous symmetric function (sum of all distinct monomials
of degree $n$). One would like to know the sum of those coefficients. Now the
relation between elementary and complete homogeneous symmetric functions is
given by the generating series identity
$$
(1-e_1X+e_2X^2-e_3X^3+\cdots)(1+h_1X+h_2X^2+\cdots)=1
$$
or equivalently $\sum_{i=0}^n(-1)^ie_ih_{n-i}=0^n$ (with the convention $0^0=1$). You can prove that the
mentioned sum is $0$ for $n\geq2$ by induction from the latter equation. A more
conceptual way is to use the generating series identity and the algebraic
independence of the $e_i$ for $i\geq1$, which means there is a ring morphism
sending all of them to $1$, and the obvious fact that
$$
(1-X+X^2-X^3+\cdots)^{-1}=1+X
$$
This shows that the mentioned morphism sends $h_0$ and $h_1$ to $1$ and all
other $h_i$ to $0$; this value is precisely the sum of coefficients of all
$e_\lambda$ we were after.
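This can all be verified numerically for small $n$. The following self-contained Python sketch (all names are mine) builds $M_n$ by brute force, inverts it over the rationals with Gauss-Jordan elimination, and sums the entries of $C_n=M_n^{-1}$:

```python
from fractions import Fraction
from itertools import product

def partitions(n, largest=None):
    """All partitions of n as weakly decreasing tuples, in decreasing lex order."""
    if largest is None:
        largest = n
    if n == 0:
        return [()]
    return [(first,) + rest
            for first in range(min(n, largest), 0, -1)
            for rest in partitions(n - first, first)]

def m_entry(lam, mu):
    """M_{lam,mu}: number of 0-1 matrices with row sums lam and column sums mu."""
    r, c = len(lam), len(mu)
    return sum(
        1
        for bits in product((0, 1), repeat=r * c)
        if all(sum(bits[i * c:(i + 1) * c]) == lam[i] for i in range(r))
        and all(sum(bits[i * c + j] for i in range(r)) == mu[j] for j in range(c))
    )

def matrix_inverse(M):
    """Invert an integer matrix over the rationals by Gauss-Jordan elimination."""
    k = len(M)
    A = [[Fraction(M[i][j]) for j in range(k)]
         + [Fraction(int(i == j)) for j in range(k)] for i in range(k)]
    for col in range(k):
        piv = next(r for r in range(col, k) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        p = A[col][col]
        A[col] = [x / p for x in A[col]]
        for r in range(k):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [row[k:] for row in A]

def entry_sum_of_C(n):
    """Sum of all entries of C_n = M_n^{-1}; should be 0 for n >= 2."""
    P = partitions(n)
    M = [[m_entry(lam, mu) for mu in P] for lam in P]
    return sum(sum(row) for row in matrix_inverse(M))
```

For instance, for $n=3$ one gets $M_3=\begin{pmatrix}0&0&1\\0&1&3\\1&3&6\end{pmatrix}$ and $C_3=\begin{pmatrix}3&-3&1\\-3&1&0\\1&0&0\end{pmatrix}$, whose entries indeed sum to $0$.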
Finally for the computation of the matrix $C_n$, it also seems best to view it
as the inverse of the matrix $(M_{\lambda,\mu})_{\lambda,\mu\in\mathcal P_n}$,
which becomes unitriangular after permuting its columns according to the
transposition of partitions. However $M_{\lambda,\mu}$ is best computed not by
counting $0$-$1$ matrices (their numbers grow up to $n!$ in the bottom-right
corner), but rather via
$$
M_{\lambda,\mu}=
\sum_{\lambda\leq\nu\leq\mu^{tr}}K_{\nu,\lambda}K_{\nu^{tr},\mu}
$$
where $K_{\nu,\lambda}$ designates a Kostka number, the number of
semi-standard Young tableaux of shape $\nu$ and weight (content) $\lambda$.
This identity is proved bijectively by the dual (asymmetric) RSK correspondence,
a bijective correspondence between $0$-$1$ matrices and pairs of
semi-standard tableaux of transpose shapes, their weights being given
respectively by the column sums and the row sums of the matrix. The Kostka
numbers involved are generally quite a bit smaller than $M_{\lambda,\mu}$, and
moreover there are ways to compute them without counting; one method is to
interpret them as weight multiplicities for $GL_n$ and use a formula like
Freudenthal's recurrence for them. (The LiE program which I maintain does so,
and will do this computation easily; one can get results online up to $n=9$ from
this site, if one takes into account the fact that a partition is represented by the
sequence of differences of successive parts: $[4,2,2,1,0,0,0,0,0]$ is
represented as $[2,0,1,1,0,0,0,0]$.)
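The identity above can be tested by brute force for small $n$, counting semi-standard tableaux directly. A Python sketch (function names are mine; I sum over all $\nu\in\mathcal P_n$, since the terms with $\nu$ outside the dominance interval vanish anyway):

```python
from itertools import product

def partitions(n, largest=None):
    """All partitions of n as weakly decreasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        return [()]
    return [(first,) + rest
            for first in range(min(n, largest), 0, -1)
            for rest in partitions(n - first, first)]

def transpose(lam):
    """Conjugate (transpose) partition."""
    return tuple(sum(1 for part in lam if part > j)
                 for j in range(lam[0])) if lam else ()

def kostka(shape, weight):
    """K_{shape,weight}: count SSYT of the given shape and content, brute force."""
    cells = [(r, c) for r, length in enumerate(shape) for c in range(length)]
    count = 0
    for filling in product(range(1, len(weight) + 1), repeat=len(cells)):
        if any(filling.count(i + 1) != w for i, w in enumerate(weight)):
            continue  # content does not match the weight
        T = dict(zip(cells, filling))
        rows_ok = all(T[r, c] <= T[r, c + 1] for (r, c) in cells if (r, c + 1) in T)
        cols_ok = all(T[r, c] < T[r + 1, c] for (r, c) in cells if (r + 1, c) in T)
        if rows_ok and cols_ok:
            count += 1
    return count

def m_via_kostka(lam, mu):
    """M_{lam,mu} = sum over nu of K_{nu,lam} * K_{nu^tr,mu}."""
    n = sum(lam)
    return sum(kostka(nu, lam) * kostka(transpose(nu), mu)
               for nu in partitions(n))
```

For example $M_{(2,1),(1,1,1)}=3$ decomposes as $K_{(3),(2,1)}K_{(1^3),(1^3)}+K_{(2,1),(2,1)}K_{(2,1),(1^3)}=1\cdot1+1\cdot2$.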
One could either compute $M_{\lambda,\mu}$ this way and invert the result, or
invert the matrix of Kostka numbers and deduce from above formula, which can
be interpreted as a matrix product, the formula
$$
C_{\lambda,\mu}=
\sum_{\lambda^{tr}\leq\nu\leq\mu}K'_{\lambda,\nu^{tr}}K'_{\mu,\nu}
$$
where $(K'_{\lambda,\mu})_{\lambda,\mu\in\mathcal P_n}$ is the inverse of the
Kostka matrix $(K_{\lambda,\mu})_{\lambda,\mu\in\mathcal P_n}$. I don't know
very much about these "inverse Kostka numbers", but you will find some
information about them in the answers to this MO question; I'm not sure this allows them to be computed more efficiently than by inverting
the Kostka matrix.
Yes, you are right that the summations are not properly indexed, so let me make the notation clear.
For $(1-x^{n+1})^m$, the power is a positive integer, so we can use the usual binomial theorem to get
$$(1-x^{n+1})^m=\sum_{i=0}^m(-1)^i{m\choose i}x^{(n+1)i}.$$
For $(1-x)^{-m}$, the power is a negative integer, so we use the negative binomial theorem to get
$$(1-x)^{-m}=\sum_{j=0}^\infty{m+j-1\choose j}x^j.$$
Note that the second sum is infinite, and convergence requires $|x|<1$. Multiplying these two sums, we get
$$\left(\frac{1-x^{n+1}}{1-x}\right)^m=\sum_{i=0}^m\sum_{j=0}^\infty(-1)^i{m\choose i}{m+j-1\choose j}x^{(n+1)i+j}.$$
Therefore, the coefficient for $x^k$ is given by
$$\sum_{\substack{0\leq i\leq m,\ j\geq0\\(n+1)i+j=k}}(-1)^i{m\choose i}{m+j-1\choose j}.$$
As for the usual solution mentioned by the OP, I believe it is to directly expand $(1+x+\cdots+x^n)^m$ by the multinomial theorem and then collect the coefficient of $x^k$, but this can be tedious since the number of terms grows with $n$. In comparison, the OP's method uses the geometric-progression sum formula $1+\cdots+x^n=(1-x^{n+1})/(1-x)$, so that we only deal with the product of two simple sums, without needing to iterate over $n$.
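As a sanity check, here is a small Python sketch (function names are mine) comparing the double-sum formula with direct polynomial multiplication:

```python
from math import comb

def coeff(n, m, k):
    """Coefficient of x^k in (1 + x + ... + x^n)^m via the double-sum formula,
    keeping only the terms with (n+1)i + j = k, i.e. j = k - (n+1)i >= 0."""
    total = 0
    for i in range(m + 1):
        j = k - (n + 1) * i
        if j < 0:
            break  # j only decreases as i grows
        total += (-1) ** i * comb(m, i) * comb(m + j - 1, j)
    return total

def coeff_direct(n, m, k):
    """Same coefficient by explicitly multiplying out the polynomial."""
    poly = [1]
    factor = [1] * (n + 1)   # coefficients of 1 + x + ... + x^n
    for _ in range(m):
        new = [0] * (len(poly) + n)
        for a, ca in enumerate(poly):
            for b, cb in enumerate(factor):
                new[a + b] += ca * cb
        poly = new
    return poly[k] if k < len(poly) else 0

# e.g. (1 + x + x^2)^2 = 1 + 2x + 3x^2 + 2x^3 + x^4, so coeff(2, 2, 2) == 3
```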
A recursive formula is fairly easy, as you have
$$f(n,1) = \frac{n(n+1)}{2},\qquad f(n,n) = n!,\qquad f(n,p) = n\,f(n-1,p-1)+f(n-1,p).$$
If you start with $f(0,0)=1$ and $f(0,p)=0$ for $p \not = 0$, then you can reduce this to just $f(n,p) = nf(n-1,p-1)+f(n-1,p)$; note that you then get $f(n,0)=1$.
These are essentially unsigned Stirling numbers of the first kind, and you have $f(n,p)=\left[{n+1 \atop n+1-p}\right]$.
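A memoized Python sketch (function names are mine) of the recurrence, checked against the standard Stirling recurrence $\left[{n \atop k}\right]=(n-1)\left[{n-1 \atop k}\right]+\left[{n-1 \atop k-1}\right]$:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def f(n, p):
    """f(n,p) from the recurrence, with f(0,0)=1 and f(0,p)=0 otherwise."""
    if n == 0:
        return 1 if p == 0 else 0
    if p < 0:
        return 0
    return n * f(n - 1, p - 1) + f(n - 1, p)

@lru_cache(maxsize=None)
def stirling1(n, k):
    """Unsigned Stirling number of the first kind [n, k]
    (number of permutations of n elements with k cycles)."""
    if n == 0:
        return 1 if k == 0 else 0
    if k <= 0 or k > n:
        return 0
    return (n - 1) * stirling1(n - 1, k) + stirling1(n - 1, k - 1)

# f(n, 1) == n(n+1)/2, f(n, n) == n!, and f(n, p) == stirling1(n + 1, n + 1 - p)
```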