Probability – Probability Distribution of a Sum of Uniform Random Variables

probability distributions, random variables

Given a random variable
$$X = \sum_{i=1}^n x_i,$$
where $x_i \in (a_i,b_i)$ are independent uniform random variables, how does one find the probability distribution of $X$?

Best Answer

The sum of $n$ iid random variables with the (continuous) uniform distribution on $[0,1]$ follows the Irwin-Hall distribution. Details about this distribution, including its cdf, are available in standard references under that name. The corresponding results for uniforms on $[a,b]$ then follow by a linear transformation.
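As a rough illustration, the Irwin-Hall cdf and the linear transformation can be sketched in Python. The sketch assumes the standard closed form $F(x) = \frac{1}{n!}\sum_{k=0}^{\lfloor x \rfloor} (-1)^k \binom{n}{k} (x-k)^n$ for $0 \le x \le n$. The function names `irwin_hall_cdf` and `uniform_sum_cdf` are made up for this example. It only covers the case where every variable shares the same interval $[a,b]$, not the distinct intervals $(a_i, b_i)$ of the question.

```python
from math import comb, factorial, floor


def irwin_hall_cdf(x: float, n: int) -> float:
    """CDF of the sum of n independent Uniform(0, 1) variables."""
    if x <= 0:
        return 0.0
    if x >= n:
        return 1.0
    # Alternating sum over k = 0, ..., floor(x).
    total = sum((-1) ** k * comb(n, k) * (x - k) ** n for k in range(floor(x) + 1))
    return total / factorial(n)


def uniform_sum_cdf(s: float, n: int, a: float, b: float) -> float:
    """CDF of the sum of n independent Uniform(a, b) variables.

    Uses S = n*a + (b - a)*T, where T is Irwin-Hall with parameter n.
    """
    return irwin_hall_cdf((s - n * a) / (b - a), n)


print(irwin_hall_cdf(1.0, 2))             # P(T <= 1) for two Uniform(0, 1) variables
print(uniform_sum_cdf(9.0, 3, 2.0, 4.0))  # P(S <= 9) for three Uniform(2, 4) variables
```

Both printed values sit at the centre of a symmetric distribution, so each should be 0.5.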

Related Solutions

Probability Theory – Distribution of max(X_i) | min(X_i) for i.i.d Uniform Random Variables

Your claimed result is not true, which probably explains why you're having trouble seeing it.

For simplicity I'll let $a = 0, b = 1$. Results for general $a$ and $b$ can be obtained by a linear transformation.

Let $X_1, \ldots, X_n$ be independent uniform $(0,1)$; let $Y$ be their minimum and let $X$ be their maximum. Then the probability that $X \in [x, x+\delta x]$ and $Y \in [y, y+\delta y]$, for some small $\delta x$ and $\delta y$, is

$$ n(n-1) (\delta x) (\delta y) (x-y)^{n-2} $$

since we have to choose which of $X_1, \ldots, X_n$ is the smallest and which is the largest; then we need the minimum and maximum to fall in the correct intervals; then finally we need everything else to fall in the interval of size $x-y$ in between. The joint density is therefore $f_{X,Y}(x,y) = n(n-1) (x-y)^{n-2}$ for $0 \le y \le x \le 1$.

Then the density of $Y$ can be obtained by integrating. Alternatively, $P(Y \ge y) = (1-y)^n$ and so $f_Y(y) = n(1-y)^{n-1}$.

The conditional density you seek is then $$ f_{X|Y}(x|y) = {n(n-1) (x-y)^{n-2} \over n(1-y)^{n-1}} = {(n-1) (x-y)^{n-2} \over (1-y)^{n-1}}, $$ where of course we restrict to $x > y$.

For a numerical example, let $n = 5, y = 2/3$. Then we get $f_{X|Y}(x|y) = 4 (x-2/3)^3 / (1/3)^4 = 324 (x-2/3)^3$ on $2/3 \le x \le 1$. This is larger near $1$ than near $2/3$, which makes sense -- it's hard to squeeze a lot of points in a small interval!
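The numerical example can be checked with a short Python sketch. The helper names `cond_density_max_given_min` and `integrate` are hypothetical, and the midpoint rule is just one convenient way to confirm that the conditional density integrates to 1 over $[y, 1]$.

```python
def cond_density_max_given_min(x: float, y: float, n: int) -> float:
    """f_{X|Y}(x|y) = (n-1)(x-y)^(n-2) / (1-y)^(n-1) for y <= x <= 1, else 0."""
    if not (y <= x <= 1.0):
        return 0.0
    return (n - 1) * (x - y) ** (n - 2) / (1.0 - y) ** (n - 1)


def integrate(f, lo: float, hi: float, steps: int = 10_000) -> float:
    """Midpoint-rule approximation to the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


n, y = 5, 2 / 3
# Density at the right endpoint x = 1, i.e. 324 * (1/3)^3.
print(cond_density_max_given_min(1.0, y, n))
# Total mass on [y, 1]; should be close to 1.
print(integrate(lambda x: cond_density_max_given_min(x, y, n), y, 1.0))
```

With $n = 2$ the same function returns the constant $1/(1-y)$ on $[y, 1]$, which is the uniform case.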

The result you quote holds only when $n = 2$ -- if I have two IID uniform(0,1) random variables, then conditional on a choice of the minimum, the maximum is uniform on the interval between the minimum and 1. This is because we don't have to worry about fitting points between the minimum and the maximum, because there are $n - 2 = 0$ of them.

[Math] Distribution of sum of multiplication of i.i.d. exponential random variables.

In answer to your first question ...

Let $X \sim \text{Exponential}(\lambda_1)$ with $E[X] = \lambda_1$ and $Y \sim \text{Exponential}(\lambda_2)$ with $E[Y] = \lambda_2$, where $X$ and $Y$ are independent. Let $(X_i, Y_i)$, $i = 1, \ldots, n$, be i.i.d. copies of $(X, Y)$, and define:

$$W_i =c (X_i-a) (Y_i-b) \quad \text{and} \quad Z_n = \sum_{i=1}^n W_i$$

Then, by the Lindeberg-Lévy version of the Central Limit Theorem:

$$Z_n\overset{a} {\sim }N\big( n E[W], n Var(W)\big)$$

We immediately have: $$E[W] = c \left(\lambda _1-a\right) \left(\lambda _2-b\right)$$

Variance of $W$

The OP attempts to approximate the variance - this is not necessary and causes errors.

By independence, the joint pdf of $(X,Y)$ is $f(x,y) = \frac{1}{\lambda_1 \lambda_2} e^{-x/\lambda_1 - y/\lambda_2}$ for $x, y > 0$.

Then, $Var[W]$ is:

where I am using the Var function from the mathStatica package for Mathematica to do the nitty-gritties. All done.
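For readers without mathStatica, the variance can also be worked out by hand. The following is a sketch of that derivation, not the package's output, and it uses only the independence of $X$ and $Y$ and the exponential moments $E[X] = \lambda_1$, $Var(X) = \lambda_1^2$ (and likewise for $Y$):

$$E\big[(X-a)^2\big] = \lambda_1^2 + (\lambda_1 - a)^2, \qquad E\big[(Y-b)^2\big] = \lambda_2^2 + (\lambda_2 - b)^2$$

$$Var(W) = c^2 \Big( E\big[(X-a)^2\big]\, E\big[(Y-b)^2\big] - (\lambda_1 - a)^2 (\lambda_2 - b)^2 \Big) = c^2 \Big( \lambda_1^2 \lambda_2^2 + \lambda_1^2 (\lambda_2 - b)^2 + \lambda_2^2 (\lambda_1 - a)^2 \Big)$$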

Central Limit Theorem approximation

Here are $100000$ pseudo-random drawings of $Z_n$ generated in Mathematica, given $n = 200, \lambda_1= 3, \lambda_2 =2,a=2.2,b=4$ (and $c = 1$, as in the code) ...

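zdata = Table[
     (* ExponentialDistribution takes the rate, so means 3 and 2 are rates 1/3 and 1/2 *)
     xdata = RandomVariate[ExponentialDistribution[1/3], {200}];
     ydata = RandomVariate[ExponentialDistribution[1/2], {200}];
     (* one draw of Z: sum of the n = 200 products, with c = 1 *)
     Total[(xdata - 2.2) (ydata - 4)], {100000}];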

The CLT Normal approximation $N\big(\mu, \sigma^2\big)$ has parameters $\mu = n E[W]$ and $\sigma = \sqrt{n Var(W)}$:

Here, the squiggly BLUE curve is the empirical pdf (from the Monte Carlo data), and the dashed red curve is the Central Limit Theorem Normal approximation. It works very nicely WHEN THE CORRECT variance derivation is used, even with a sample of size $n = 200$.
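The parameters of the Normal approximation can be cross-checked with a small Python sketch (standard library only, not the Mathematica and mathStatica workflow used above). It assumes $c = 1$, as the Mathematica code implies, and computes $Var(W)$ as $E[W^2] - E[W]^2$ using independence and the exponential moments. The replication count of 2000 is an arbitrary choice to keep the run short.

```python
import math
import random
import statistics

# Parameter values quoted in the text; c = 1 is an assumption read off the Mathematica code.
n, lam1, lam2, a, b, c = 200, 3.0, 2.0, 2.2, 4.0, 1.0

# Moments of W = c (X - a)(Y - b) for independent exponentials with means lam1 and lam2.
mean_w = c * (lam1 - a) * (lam2 - b)
second_x = lam1**2 + (lam1 - a) ** 2  # E[(X - a)^2] = Var(X) + (E[X] - a)^2
second_y = lam2**2 + (lam2 - b) ** 2  # E[(Y - b)^2] = Var(Y) + (E[Y] - b)^2
var_w = c**2 * second_x * second_y - mean_w**2

mu = n * mean_w
sigma = math.sqrt(n * var_w)


def draw_z(rng: random.Random) -> float:
    """One draw of Z_n; expovariate takes the rate, i.e. 1/mean."""
    return sum(
        c * (rng.expovariate(1 / lam1) - a) * (rng.expovariate(1 / lam2) - b)
        for _ in range(n)
    )


rng = random.Random(0)
zdata = [draw_z(rng) for _ in range(2000)]  # far fewer replications than the 100000 in the text

print(mu, sigma)                                        # CLT parameters
print(statistics.mean(zdata), statistics.stdev(zdata))  # Monte Carlo counterparts
```

For these values the closed form gives $\mu = -320$ and $\sigma \approx 122.1$; the simulated mean and standard deviation should land near them.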

Central Limit Theorem fit using OP's approximated variance

By contrast, if we use the OP's approximation of Var(Z) to calculate $\sigma$, then the CLT 'fit' is not good at all:

Notes

As disclosure, I should add that I am one of the authors of the software used above.
