Probability Theory – Cumulative Distribution Function of a Random Variable

combinatoricsprobabilityprobability distributionsprobability theorystatistics

Suppose that we have $n$ independent random variables, $x_1,\ldots,x_n$ such that each $x_i$ takes value $a_i$ with success probability $p_i$ and value $0$ with failure probability $1-p_i$ ,i.e.,

\begin{align}
P(x_1=a_1) & = p_1,\ P(x_1=0)= 1-p_1 \\
P(x_2=a_2) & = p_2,\ P(x_2=0) = 1-p_2 \\
& \vdots \\
P(x_n=a_n) & = p_n,\ P(x_n=0)=1-p_n
\end{align}

where $a_i$'s are positive Real numbers.

What would be the CDF of the sum of these random variables? That is, what would be $P(x_1+\cdots+x_n\le k)$ ? and how can we find it in an efficient way?

Best Answer

This will be CDF of discrete distribution with certain number of values $s_1,...,s_r$ where $r$ is number of possible distinct sums among all subsets $\{a_{i_1},...,a_{i_m}\}$. It is not possible to evaluate $r$ in general case because this number solely depends on particular values $a_1,...,a_n$ and can vary from $n+1$ to $2^n$. The minimal $r=n+1$ is represented by case of $a_1=a_2=...=a_n$ while maximal $r=2^n$ is always achieved when ascending reordered set $\{a_{(1)},...,a_{(n)}\}$ complies with the condition: $\forall i: \sum\limits_{1\le j<i}a_{(j)}\le a_{(i)}$. Example of such maximal case could be $a_i=2^{i-1}$. Note, that failing this condition does not necessarily mean $r<2^n$. For example, for $a_i=ln(prime(i))$ the condition above fails, however all possible sums $a_{i_1}+...+a_{i_m}$ are still distinct, causing $r=2^n$.

We can always consider set $\{s_1,...,s_r\}$ as ordered, then $s_1=0$ and $s_r=\sum\limits_i a_i$. Then $$P(s_1)=P(0)=\prod\limits_i (1-p_i)$$ as all $x_i=0$; $$P(s_r)=P\left(\sum\limits_i a_i\right)=\prod\limits_i p_i$$ as all $x_i=a_i$; and $$P(s_k)=\sum\limits_{i_1,...,i_m|a_{i_1}+...+a_{i_m}=s_k}\left(\prod\limits_{j\in\{i_1,...,i_m\} } p_j\times\prod\limits_{j\in[1;n]\setminus\{i_1,...,i_m\} } (1-p_j)\right)=\\P(0)\times\sum\limits_{i_1,...,i_m|a_{i_1}+...+a_{i_m}=s_k}\left(\prod\limits_{j\in\{i_1,...,i_m\} }\frac{p_j}{1-p_j}\right)$$ as we can eliminate the second product of complemented set $[1;n]\setminus\{i_j\}$ by using the fact that $P(0)$ contains all $(1-p_i)$ for the whole set $[1;n]$.

Related Question