[Math] What is meant by Expectation or Expected value of a Random Variable

expectation probability

Probability, in terms of relative frequency:

$S$: sample space of an experiment.

The experiment is performed repeatedly.

For each event $E$ in the sample space $S$, we define $n(E)$ as the number of times that $E$ occurs in the first $n$ repetitions of the experiment.

$P(E) = \lim_{n \to \infty} \frac{n(E)}{n}$

That is, $P(E)$ is the long-run proportion of repetitions in which event $E$ occurs.

Is it correct to say that we perform the experiment first and then calculate the chance of event $E$ occurring from the observed outcomes?

Expectation (as I read in my textbook): If $X$ is a random variable with probability mass function $p(x)$, then the expected value of $X$ is:

$E[X] = \sum_{x:\,p(x)>0} x\,p(x)$.

What does the expected value describe about $X$, in the way that probability describes the proportion of times event $E$ occurs?

For example, if $X$ is the outcome of rolling a fair die, then $E[X] = 7/2$.
What does the value $7/2$, or $3.5$, signify?

I am confused between the two. I understand the concept of probability but not expectation. An explanation using an example would help.

Best Answer

If you roll a fair die, there are six possible outcomes, $1,2,3,4,5,6$, which are equally likely. The average of these six numbers is $(1+2+3+4+5+6)/6 = 7/2$. We might say that "on average", if you roll a die, the outcome should be $7/2$. That is of course absurd for a single die roll, but it becomes increasingly true of the sample mean (i.e., the sum of the rolled values, divided by the number of rolls) if we perform more rolls.
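The same $7/2$ also comes out of the formula $E[X] = \sum x\,p(x)$ quoted in the question. Here is a minimal Python sketch of that sum; the names `expected_value` and `die_pmf` are made up for illustration, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction

def expected_value(pmf):
    """Return the sum of x * p(x) over all x with p(x) > 0."""
    return sum(x * p for x, p in pmf.items() if p > 0)

# Fair six-sided die: each face has probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

print(expected_value(die_pmf))  # 7/2
```

Because the faces are equally likely, the weighted sum reduces to the plain average of the six faces.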

To express this mathematically, suppose we roll the die $N$ times, call the $n$th outcome $x_n$ (so $x_n$ is one of $1,2,3,4,5,6$), and compute the sample mean of the $N$ resulting numbers:

$$\frac{1}{N}\sum_{n=1}^{N}x_n$$

We expect the result to be near $3.5$ if $N$ is large.

This can be quantified more precisely by the law of large numbers, which says (stated informally) that the sample mean is increasingly likely to be close to the expected value as $N$ grows large, and in fact the probability that the sample mean differs from the expected value by more than any fixed amount approaches zero as $N \to \infty$.
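One way to see this numerically is to simulate the rolls. The sketch below is only an illustration: the helper name `sample_mean_of_rolls`, the seed, and the roll counts are arbitrary choices, and the printed values depend on the random number generator.

```python
import random

def sample_mean_of_rolls(num_rolls, rng):
    """Roll a fair die num_rolls times and return the sample mean."""
    rolls = [rng.randint(1, 6) for _ in range(num_rolls)]
    return sum(rolls) / num_rolls

# Fixed seed (an arbitrary choice) so that repeated runs print the same numbers.
rng = random.Random(0)
for num_rolls in (10, 100, 10_000, 100_000):
    print(num_rolls, sample_mean_of_rolls(num_rolls, rng))
```

For small roll counts the printed mean can be noticeably off from $3.5$; for the larger counts it should typically sit very close to it.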

Edited to respond to the comment by the OP:

Let's consider rolling the die $2$ times. There are $36$ possible outcomes, as follows:

$$\begin{aligned} (1,1)\qquad(2,1)\qquad(3,1)\qquad(4,1)\qquad(5,1)\qquad(6,1) \\ (1,2)\qquad(2,2)\qquad(3,2)\qquad(4,2)\qquad(5,2)\qquad(6,2) \\ (1,3)\qquad(2,3)\qquad(3,3)\qquad(4,3)\qquad(5,3)\qquad(6,3) \\ (1,4)\qquad(2,4)\qquad(3,4)\qquad(4,4)\qquad(5,4)\qquad(6,4) \\ (1,5)\qquad(2,5)\qquad(3,5)\qquad(4,5)\qquad(5,5)\qquad(6,5) \\ (1,6)\qquad(2,6)\qquad(3,6)\qquad(4,6)\qquad(5,6)\qquad(6,6) \\ \end{aligned}$$

Each outcome is equally likely, with probability $1/36$. Now let's look at the sample mean of the two rolls in each case. For example, for the first outcome, both rolls were $1$, so the sample mean is $(1+1)/2 = 1$. Computing the sample mean for all $36$ outcomes:

$$\begin{aligned} 1\qquad 1.5\qquad 2\qquad 2.5\qquad 3\qquad 3.5 \\ 1.5\qquad 2\qquad 2.5\qquad 3\qquad 3.5\qquad 4 \\ 2\qquad 2.5\qquad 3\qquad 3.5\qquad 4\qquad 4.5 \\ 2.5\qquad 3\qquad 3.5\qquad 4\qquad 4.5\qquad 5 \\ 3\qquad 3.5\qquad 4\qquad 4.5\qquad 5\qquad 5.5 \\ 3.5\qquad 4\qquad 4.5\qquad 5\qquad 5.5\qquad 6 \\ \end{aligned}$$

Notice that some sample mean values are more likely than others.

For example, there is only one outcome, namely $(1,1)$, which results in a sample mean of $1$, and similarly, only $(6,6)$ gives a sample mean of $6$. So the probability of observing a sample mean of $1$ is only $1/36$, and similarly for a sample mean of $6$.

On the other hand, there are three outcomes that give a sample mean of $2$, namely $(3,1)$, $(2,2)$, and $(1,3)$. So the probability of observing a sample mean of $2$ is $3/36 = 1/12$.

Looking at the second table, the most likely observed sample mean is $3.5$. It occurs once in each row, for a total of $6$ out of $36$, so the probability that the sample mean is $3.5$ is $1/6$.
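The counts above can be checked by enumerating all $36$ ordered pairs. This is a small Python sketch with made-up variable names (`counts`, `distribution`); it tallies how often each sample mean occurs and converts the tallies to exact probabilities.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Count how many of the 36 ordered pairs (a, b) give each sample mean (a + b) / 2.
counts = Counter(Fraction(a + b, 2) for a, b in product(faces, repeat=2))
total = sum(counts.values())  # 36 outcomes in all

# Probability of each sample mean, listed from smallest to largest.
distribution = {mean: Fraction(count, total) for mean, count in sorted(counts.items())}
for mean, prob in distribution.items():
    print(float(mean), prob)
```

The entries for sample means $1$, $2$, and $3.5$ come out as $1/36$, $1/12$, and $1/6$, matching the counts read off the table.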

If we were to repeat the experiment with more rolls, we would see that a higher percentage of outcomes would result in sample means near $3.5$ (say, within some fixed distance of it), and this percentage would grow closer and closer to $100\%$ as the number of rolls grows larger and larger.
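To put numbers on this, here is a hedged sketch that computes the exact probability that the sample mean of $N$ rolls lands within a chosen distance of $3.5$. The tolerance of $1/2$ and the function names `sum_distribution` and `prob_mean_near` are assumptions made for illustration, not part of the argument above.

```python
from fractions import Fraction

def sum_distribution(num_rolls):
    """Exact distribution of the total of num_rolls fair dice, as {total: probability}."""
    dist = {0: Fraction(1)}
    for _ in range(num_rolls):
        new_dist = {}
        for total, prob in dist.items():
            for face in range(1, 7):
                # Each face extends the running total with probability 1/6.
                new_dist[total + face] = new_dist.get(total + face, Fraction(0)) + prob / 6
        dist = new_dist
    return dist

def prob_mean_near(num_rolls, tolerance=Fraction(1, 2)):
    """P(|sample mean - 7/2| <= tolerance) for num_rolls fair dice."""
    target = Fraction(7, 2)
    return sum(p for total, p in sum_distribution(num_rolls).items()
               if abs(Fraction(total, num_rolls) - target) <= tolerance)

for num_rolls in (1, 2, 10, 50):
    print(num_rolls, float(prob_mean_near(num_rolls)))
```

With this tolerance, the probability is $1/3$ for one roll and $16/36 = 4/9$ for two rolls (sample means $3$, $3.5$, and $4$ in the second table), and it is larger still for $10$ and $50$ rolls.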
