The Difference Between the Average and the Mathematical Expectation in the Following Problem

average, expectation, probability, random variables, statistics

Suppose that a school has 20 classes: 16 with 25 students in each, three with 100 students in each, and one with 300 students, for a total of 1000 students.

The average class size is simply

$$\bar{x} = \frac{16(25) + 3(100) + 1(300)}{20} = \frac{1000}{20} = 50$$

Now let the random variable $X$ be the class size, with support $S = \{25, 100, 300\}$. The p.m.f. of $X$ is

$$f(x) = \begin{cases}
\frac{400}{1000} & \text{if $x = 25$} \\
\frac{300}{1000} & \text{if $x = 100$ or $x = 300$}
\end{cases}
$$

Thus the mathematical expectation $\mu$ is

$$\mu = 25(400/1000) + 100(300/1000) + 300(300/1000) = 10 + 30 + 90 = 130$$

My questions are as follows: Which average is more correct? Which average would I choose and for what circumstance if they are both correct? Why is there a difference in the first place?

Best Answer

Both averages are calculated correctly. They differ because two different sampling models are being used: in one approach we sample a class at random and ask for its expected size; in the other we sample a student at random and ask for the expected size of the class that student is in.
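To make the two sampling models concrete, here is a minimal Python sketch that computes both figures exactly from the class sizes in the question. The function names `class_weighted_mean` and `student_weighted_mean` are just illustrative labels, not standard terminology.

```python
from fractions import Fraction

# Class sizes from the question: 16 classes of 25, three of 100, one of 300.
class_sizes = [25] * 16 + [100] * 3 + [300]


def class_weighted_mean(sizes):
    """Expected size when every class is equally likely to be picked."""
    return Fraction(sum(sizes), len(sizes))


def student_weighted_mean(sizes):
    """Expected size when every student is equally likely to be picked.

    A class of size n contains n students, so it is picked with
    probability n / (total students) and contributes n * n to the sum.
    """
    return Fraction(sum(n * n for n in sizes), sum(sizes))


print(class_weighted_mean(class_sizes))    # 50
print(student_weighted_mean(class_sizes))  # 130
```

The only difference between the two functions is the weight each class receives: 1 per class in the first, one per enrolled student in the second.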

As for which number you would choose to represent the 'average' class size at the school, notice that the class-sampling approach makes the class sizes look rather small (50), while the student-centric approach gives a considerably larger number (130). If I were a prospective student I would want to know the student-centric average, because it is a more honest representation of the class size that the 'average' student will experience. On the other hand, the school administrators would love to brag that, according to their calculations, the average class size at the school is 50.
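As a side note, the size of the gap between the two figures can be written in closed form. The notation here is introduced only for this remark: $n_1, \dots, n_k$ are the class sizes ($k = 20$), $\bar{x}$ is the per-class average, and $\sigma^2$ is the per-class (population) variance of the sizes.

$$\mu = \frac{\sum_{i=1}^{k} n_i^2}{\sum_{i=1}^{k} n_i} = \bar{x} + \frac{\sigma^2}{\bar{x}}, \qquad \sigma^2 = \frac{1}{k}\sum_{i=1}^{k} (n_i - \bar{x})^2$$

For this school $\sigma^2 = \frac{130000}{20} - 50^2 = 4000$, so $\mu = 50 + 4000/50 = 130$. The student-centric average is therefore never smaller than the per-class average, and the two agree only when every class has the same size.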
