[Math] Expected value and standard deviation when rolling dice.

probability

Example with one die. Our random variable $X_1$ is defined as the result of a single die roll. It has the following distribution:

$$
\begin{array}{r|rrrrrr}
k & 1 & 2 & 3 & 4 & 5 & 6 \\
\hline
P(X_1=k) & \frac{1}{6} & \frac{1}{6} & \frac{1}{6} & \frac{1}{6} & \frac{1}{6} & \frac{1}{6} \\
\end{array}
$$
Now the expected value is simply the weighted arithmetic mean of the outcomes (weighted by their probabilities):

$$ E[X_1]=\sum_{k=1}^{6}k\,P(X_1=k) $$
$$ E[X_1]=3.5 $$
The standard deviation is also pretty straightforward:
$$ SD[X_1]=\sqrt{\left[\sum_{k=1}^6k^2\,P(X_1=k)\right]-\left[\sum_{k=1}^6k\,P(X_1=k)\right]^2 }$$
$$SD[X_1]\approx 1.707825128$$
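
As a quick check of the single-die numbers, here is a minimal sketch that evaluates the two sums exactly. It assumes a fair six-sided die; the variable names are only illustrative.

```python
from fractions import Fraction
from math import sqrt

faces = range(1, 7)
p = Fraction(1, 6)  # assumption: fair die, each face equally likely

mean = sum(k * p for k in faces)               # E[X_1]
second_moment = sum(k * k * p for k in faces)  # E[X_1^2]
variance = second_moment - mean ** 2           # Var[X_1] = E[X_1^2] - E[X_1]^2
sd = sqrt(variance)                            # SD[X_1], as a float

print(mean, variance, sd)
```

Using `Fraction` keeps the mean and variance exact (7/2 and 35/12); only the final square root is a floating-point value.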

Now what would be the standard deviation and expected value of the random variable $M_{100}$, defined as
$$ M_{100}=\frac{1}{100}(X_1+X_2+\dots+X_{100}) $$
To my understanding these would be the same as the values for a single die, since this is just the arithmetic mean of 100 die rolls. The correct answers would then be:

$$ E[M_{100}]=3.5 $$
$$ SD[M_{100}]\approx 1.707825128$$

But am I correct on this one? I don't have 100% confidence in my answer, so could someone give me some feedback on whether this reasoning is right? Either way, any comment will be much appreciated.

Thanks,

Tuki

Best Answer

Use linearity of expectation: $E[M_{100}] = \frac{1}{100}\sum_{i=1}^{100} E[X_i] = \frac{1}{100}\cdot 100 \cdot 3.5 = 3.5$.

The standard deviation is wrong, however. Assuming the $X_i$ are independent, $Var[M_{100}] = \frac{1}{100^2}\sum_{i=1}^{100} Var[X_i] = \frac{35/12}{100} \approx \frac{2.92}{100}$, so $SD[M_{100}] = \frac{SD[X_1]}{10} \approx 0.1708$.
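
The arithmetic can be checked with a short sketch. It assumes independent fair dice and takes $Var[X_i] = 35/12$ (the exact single-die variance) as given; `n`, `var_single`, `var_mean` and `sd_mean` are illustrative names.

```python
from fractions import Fraction
from math import sqrt

n = 100
var_single = Fraction(35, 12)  # Var[X_i] for one fair die (assumed)

# Independence lets the variances add; the 1/n factor comes out squared.
var_mean = (n * var_single) / n ** 2   # Var[M_n] = Var[X_i] / n
sd_mean = sqrt(var_mean)               # SD[M_n] = SD[X_i] / sqrt(n)

print(var_mean, float(var_mean), sd_mean)
```

So the standard deviation of the average is smaller than that of a single roll by a factor of $\sqrt{100} = 10$.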

The random variable you have defined is an average of the $X_i$. Your reasoning about the expectation is right. The variance, however, shrinks when you take an average. Heuristically, this is because the fluctuation of the average decreases as you take more and more samples. This is precisely the intuition behind concentration inequalities such as the Chernoff-Hoeffding bound, and in a way it is what leads you to the Central Limit Theorem as well.
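
To see the shrinking fluctuation empirically, one could run a small simulation like the sketch below. It is only an illustration: the function name `simulate_sample_means`, the trial count, and the seed are arbitrary choices, and the dice are assumed fair and independent.

```python
import random
import statistics

def simulate_sample_means(n_dice=100, n_trials=5000, seed=0):
    """Return n_trials simulated values of the mean of n_dice fair die rolls."""
    rng = random.Random(seed)  # fixed, arbitrary seed for reproducibility
    return [
        sum(rng.randint(1, 6) for _ in range(n_dice)) / n_dice
        for _ in range(n_trials)
    ]

# Compare the spread of the average for 1, 10 and 100 dice.
for n in (1, 10, 100):
    means = simulate_sample_means(n_dice=n)
    print(n, round(statistics.mean(means), 3), round(statistics.stdev(means), 3))
```

The empirical mean should stay near 3.5 for every `n`, while the empirical standard deviation should fall roughly like $1/\sqrt{n}$.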

ADDENDUM: $M_{100}$ corresponds to the sample mean. The variance of the sample mean does depend on the number of samples: for $n$ independent rolls it is $Var[X_1]/n$.
