[Math] Probability distribution of the sum of 1 to n random integers

combinatoricsprobability

I apologize in advance if the title of this question was not clear, I do not know how to formalize this question myself, and therefore I need help. I will try to explain the problem as simple as possible, so as not to confuse myself, so please bear with me if it gets redundant.

Assume you have a random number $X$, which only takes on positive integer values, with a discrete probability distribution. The probability of $X$ taking on a value of $x$ is given by $P(X=x)$.

I would like to know the probability distribution $P(S=s)$, where $S$ is the sum of a varying number of consecutive, independent draws of $X$.

To give an example:
If $s=1$, I would like to know the probability of observing an arbitrary number of consecutive draws of $X$ which add up to 1. Here, we can only consider a single draw of $X$, since $X$ takes on only positive integer values (the minimum possible value of $X$ is 1).
In this case (I think)

$P(S=1) \propto P(X=1)$.

If $s=2$, we need to consider 2 cases. A single draw of $X$ with the outcome 2 OR two independent draws of $X$ both with the outcome 1, will both yield a sum of 2. So the probability of having a sum of 2 is given by (I think)

$P(S=2) \propto P(X=2)+P(X=1)^2$.

To make things more clear, I'd like to give examples for higher values of $s$. For simplicity, from here on I'll represent $P(X=x)$ as $p_x$.

$P(S=1) \propto p_1$

$P(S=2) \propto p_2+p_1^2$

$P(S=3) \propto p_3+2 p_1 p_2 + p_1^3$

$P(S=4) \propto p_4 + 2 p_1 p_3 + p_2^2 + 3 p_1^2 p_2 + p_1^4$

$P(S=5) \propto p_5 + 2p_1p_4 + 2p_2p_3 + 3 p_1^2 p_3+ 3p_1p_2^2+4p_1^3p_2+p_1^5$

In particular, I'd like to know:

1) whether I'm making a logical error here in the above formalization. I know the right hand sides above are not equal to the probabilities $P(S=s)$ on the left hand side, because they are not normalized (i.e. the sum of the right hand side terms as $s \rightarrow \infty$ will be bigger than 1), but I believe this can be solved with the proper choice of a normalization constant (or am I wrong?).

2) whether there is a general expression for the above given series on the right hand side. What immediately stands out of course is that the first terms are all $p_s$, which corresponds to the situation in which in a single draw of $X$ we immediately obtain $s$ (let's call it $d=1$). Also, the last terms are all $p_1^s$, which corresponds to the situation in which we obtain the sum $s$ by drawing the minimum possible value of 1 in $s$ independent draws of $X$ (here $d=s$).

As an attempt at obtaining a general expression for the right hand side, I started out by thinking that for each $s$ we need a sum over the number of draws $d$. Since the minimum number of draws we need is 1, and the maximum $d$ we can have is $s$:

$P(S=s) \propto \sum_{d=1}^s \ldots $

If we go back to the example, we also see that the number of terms we add for each value of $d$ follow the binomial coefficient, specifically,
\begin{equation}
P(S=s) \propto \sum_{d=1}^s \sum_{i=1}^{\binom{s-1}{d-1}} \ldots
\end{equation}
where the upper limit for the second sum $\binom{s-1}{d-1}$ is just the binomial coefficient.

What I mean by this representation is that if we consider, for example the case for $s=5$, then for $d=1$ we add one term, $p_5$, and for $d=2$ we add 4 terms, $p_1p_4+p_4p_1+p_2p_3+p_3p_2$, for $d=3$ 6 terms and so on.

This is, I guess, because we can alternatively think of this problem as how to distribute $s$ balls into $d$ bins, with the condition that each bin must get at least 1 ball. What complicates this problem for me is that, I do not just want to know how many possible ways there are in which I can place $s$ balls into $d$ bins, but I'd like to have a series representation in which all the possibilities are spelled out, e.g. for $s=5$ and $d=2$ I'd like to have $p_1p_4+p_4p_1+p_2p_3+p_3p_2$, because I can put 1 ball in bin 1 and 4 in bin 2, or 4 balls in bin 1 and 1 in bin 2, or 2 balls in bin 1 and 3 in bin 2 etc.

At this point, I have attempted several different solutions and repeatedly failed. Any solutions, help, nudges and keywords in the right direction will be very much appreciated. Thank you in advance.

EDIT: I have given some more thought to the question, and I still don't know whether it would be possible to have a closed expression to represent $P(S=s)$ as a function of $p_x$, i.e. to have a general solution for any random variable $X$ with an arbitrary distribution.
However, in general there seems to be solutions for random variables $X$ with discrete uniform or Poisson distributions. Nevertheless, I'd be interested in a general solution as a function of $p_x$ or in any other insights regarding this question. Thank you.

Best Answer

What I understand from the question is that one is given an i.i.d. sequence $(X_n)_{n\geqslant1}$, with positive integer values, and that one is interested in $P(A_s)$ for every $s$, where $$A_s=[\exists n\geqslant0,X_1+X_2+\cdots+X_n=s].$$ Then, the classical approach to compute every $P(A_s)$ in one strike is to consider the series $$u(x)=\sum_{s\geqslant0}P(A_s)x^s,$$ with the convention that $P(A_0)=1$, and to note that $$u(x)=\sum_{n\geqslant0}\sum_{s\geqslant0}P(X_1+X_2+\cdots+X_n=s)x^s=\sum_{n\geqslant0}E(x^{X_1+X_2+\cdots+X_n}).$$ The sequence $(X_n)_{n\geqslant1}$ is i.i.d. hence $$u(x)=\sum_{n\geqslant0}E(x^{X_1})^n=\frac1{1-E(x^{X_1})}.$$ In principle, this identity allows to extract every $P(A_s)$ as the coefficient of $x^s$ in the expansion of the RHS. For example, if $X_1$ is uniform on $\{1,2\}$, then $$E(x^{X_1})=\frac{x+x^2}2,\qquad1-E(x^{X_1})=(1-x)\left(1+\frac{x}2\right),$$ and $$u(x)=\frac{\frac23}{1-x}+\frac{\frac13}{1+\frac12x}=\frac23\sum_{s\geqslant0}x^s+\frac13\sum_{s\geqslant0}\frac{(-1)^s}{2^s}x^s,$$ hence, for every $s\geqslant0$, $$P(A_s)=\frac13\left(2+\frac{(-1)^s}{2^s}\right).$$ Finally, note that, for every aperiodic distribution of $X_1$, one can show that $$\lim_{s\to\infty}P(A_s)=\frac1{E(X_1)}.$$ This framework is called renewal theory, and we just scratched the tip of the iceberg...