Calculating the expected value and variance of $n$ independent observations of $X$

central limit theorem, probability, random variables, variance

I am attempting to find analytical expressions (as well as decimal values) for the expected value and variance of the random variable $X$. $X$ is the random variable given by the expression meander(100)[-1], where meander is defined by:

import random

def meander(n):
    # Running sums of n steps, each step drawn as 3 * U with U uniform on [0, 1)
    x = [0]
    for t in range(n):
        x.append(x[-1] + 3*random.random())
    return x

For those unfamiliar with Python, $X$ is the sum of $100$ values, each equal to 3*random.random(), where random.random() is uniformly distributed on $[0,1)$.

I am almost certain that I will need to apply the concepts of:
$$\text{mean}\left(\bar{X}\right)=E\left(\frac{1}{n}\left(X_1+X_2+\dots+X_n\right)\right)=E\left(X\right)$$
$$\text{and}$$
$$\text{var}\left(\bar{X}\right)=\text{var}\left(\frac{1}{n}\left(X_1+X_2+\dots+X_n\right)\right)=\frac{1}{n}\text{var}\left(X\right)$$
$$\text{where }\bar{X}=\frac{1}{n}\left(X_1+X_2+\dots+X_n\right)$$

I am having difficulty understanding how to apply these equations and represent the problem symbolically, let alone calculate it. I created a simulation to better understand the distribution of the data (and to get an estimate of the expected value), and the distribution appears to be Gaussian (based on a histogram of 100,000 trials). The simulation suggests an estimated expected value of $150.038527551$.

These results will then be used with the Central Limit Theorem to find an analytical expression that approximates the pdf of $X$.

Any guidance or help to point me in the right direction would be very much appreciated!

Best Answer

So, your random variable is $$ X = 3X_1+\dots+3X_{100} = \sum_{k=1}^n 3X_k $$ with $n=100$, where $X_1,\dots, X_n$ are independent, identically distributed random variables that are uniform in $[0,1)$. In particular, $\mathbb{E}\left[ X_k \right] = \frac{1}{2}$ and $\operatorname{var} X_k = \frac{1}{12}$ for every $1\leq k\leq n$.

By linearity of expectation, you get $$ \mathbb{E}[X] = \mathbb{E}\left[ \sum_{k=1}^n 3X_k \right] = \sum_{k=1}^n 3\mathbb{E}\left[ X_k \right] =\sum_{k=1}^n 3\cdot \frac{1}{2} = n\cdot \frac{3}{2} = 150. $$ (This does not rely on the $X_k$'s being independent, only on each of them having a well-defined expectation.)
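As a sanity check on the expectation, here is a minimal Python sketch that computes $n\cdot\frac{3}{2}$ exactly and compares it with a Monte Carlo estimate. The helper name `meander_end`, the seed, and the trial count are assumptions made for illustration; they are not part of the question's code.

```python
import random
from fractions import Fraction

def meander_end(n, rng):
    """Final value of the walk: sum of n draws of 3 * U, with U uniform on [0, 1)."""
    return sum(3 * rng.random() for _ in range(n))

n = 100
# Exact value via linearity of expectation: n * 3 * E[U], with E[U] = 1/2.
exact_mean = n * 3 * Fraction(1, 2)

# Monte Carlo estimate (seed and trial count are arbitrary choices).
rng = random.Random(0)
trials = 20_000
samples = [meander_end(n, rng) for _ in range(trials)]
sample_mean = sum(samples) / trials

print(exact_mean)   # 150
print(sample_mean)  # close to 150; the exact digits depend on the seed
```

The simulated mean should land near $150$, in line with the estimate of $150.038527551$ reported in the question.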

By properties of variance (detailed below), crucially relying on the fact that the $X_k$'s are independent, you obtain $$ \operatorname{var}(X) = \operatorname{var}\left( \sum_{k=1}^n 3X_k \right) = \sum_{k=1}^n \operatorname{var}(3 X_k) = \sum_{k=1}^n 9\operatorname{var} X_k =\sum_{k=1}^n 9\cdot \frac{1}{12} = n\cdot \frac{3}{4} = 75 $$ where we used first the fact that "the variance of the sum of (pairwise) independent random variables is the sum of their variances",* and then that $ \operatorname{var}(aY) = a^2 \operatorname{var}(Y)$ for any real number $a$.

(*) Provided the variances are well-defined, i.e. the random variables are in $L^2$.
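The variance can be checked the same way. The sketch below also compares the simulation with the normal distribution $N(150, 75)$ that the Central Limit Theorem suggests as an approximation for $X$; that approximation is the step the question is heading toward, not something derived above. The seed, the trial count, and the "within one standard deviation" comparison are illustrative assumptions.

```python
import math
import random
import statistics
from fractions import Fraction

n = 100
# Exact variance: n * var(3U) = n * 9 * var(U), with var(U) = 1/12.
exact_var = n * 9 * Fraction(1, 12)

# CLT-based normal approximation to the distribution of X (approximate, not exact).
approx = statistics.NormalDist(mu=150, sigma=math.sqrt(float(exact_var)))

# Monte Carlo check (seed and trial count are arbitrary choices).
rng = random.Random(0)
trials = 20_000
samples = [sum(3 * rng.random() for _ in range(n)) for _ in range(trials)]
sample_var = statistics.variance(samples)

# Share of simulated values within one standard deviation of the mean,
# compared with what the normal approximation predicts.
lo, hi = 150 - approx.stdev, 150 + approx.stdev
empirical = sum(lo <= s <= hi for s in samples) / trials
predicted = approx.cdf(hi) - approx.cdf(lo)

print(exact_var)             # 75
print(sample_var)            # close to 75; depends on the seed
print(empirical, predicted)  # the two shares should be close to each other
```

If the pdf itself is needed, `approx.pdf(x)` evaluates the density of $N(150, 75)$, i.e. $\frac{1}{\sqrt{2\pi\cdot 75}}\exp\left(-\frac{(x-150)^2}{2\cdot 75}\right)$, at a point `x`.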
