Random Variable Realisation – Difference Between a Realisation of a Random Variable and the Random Variable Itself

definition, expected value, notation, probability

I am having a hard time distinguishing random variables from their realisations. (Please note that for the sake of simplicity of my question I use discrete values in my example below.) Usually, a random variable is denoted with an upper-case letter such as "$\boldsymbol{X}$" and its realisations with a lower-case letter such as "$\boldsymbol{x}$". For instance, if the experiment is to weigh a group of five objects, then the random variable $X$ will be a set of five realisations of the weight random variable. So we can write $X_1:\{x_1 = 80, x_2 = 83, x_3 = 67, x_4 = 72, x_5 = 90\}$ in kg. (Note that I wrote $X_1$ instead of just $X$ to emphasise that this is my first random variable; other random variables will be introduced shortly.)

If one would like to add another random variable such as "temperature" of the weighed objects, we could simply write that as "$X_2$". Let us assume that the realisations of the temperature variable (i.e. $X_2$) are $X_2:\{x_1 = 12, x_2 = 30, x_3 = 45, x_4 = 23, x_5 = 9\}$ in Kelvin. And we could go on and add as many random variables as we would like (e.g. $X_3$: velocity of the objects, $X_4$: colour of the objects, and so on, where each variable has five realisations). Things are fine until we come across statements such as "If $X_1, X_2, X_3, \dots, X_n$ are $n$ random variables, we find that"

$$E[\sum_{i=1}^{n} X_i] = \sum_{i=1}^{n} E[X_i]$$

I have come across this and similar equations in different textbooks (where there is some kind of a sum over "$n$" random variables)! The problem is that (given my example above) the units of random variables do NOT match so the whole sum does not make sense! Given my interpretation of random variable, I would write the right hand side of the equation above as follows,

$$ \sum_{i=1}^{n} E[X_i] = E[X_1] + E[X_2] + E[X_3] + \dots + E[X_n].$$

In my example above $n = 2$, so we only have to deal with $E[X_1]+E[X_2]$. And I run into the problem of different random variables having different units (i.e. $E[X_1]$ in kg and $E[X_2]$ in Kelvin). There is another question relevant to mine here, where my interpretation of the answer to that question is that $X_1, X_2, \dots, X_n$ are not random variables but realisations! But if that is the case, then $E[X] = \mu$ does not make sense! How can the expected value of one realisation ($X$, that is, a single observation/measurement) be equal to the mean of the entire population ($\mu$)? If I were to summarise my question, I would ask the following:

  • What is the difference/connection between $x_i$, $X_i$, and a vector of random variables $\boldsymbol{X}$?

  • What is the connection between the random variables $\{x_i, X_i, \boldsymbol{X}\}$ and the sample mean $\overline{X}$ and population mean $\mu$?

Best Answer

The realization of a random variable is the value that was observed (though, as noted in the comments, you can have random variables for non-observable things). For example, you treat the result of throwing a fair die as a random variable $X$. Say the result is five dots; then $x=5$ is the realization. The "five objects" that you call "realizations" are all random variables that together form a multivariate random variable. In this framework, it doesn't make sense to discuss a single random variable with multiple realizations.
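A quick sketch in Python may make the distinction concrete (the fair die here is just an illustrative choice): the random variable is a *recipe* for producing numbers, while a realization is one concrete number that recipe produced.

```python
# X is not a number but a recipe for generating numbers (a fair die);
# x is one concrete number produced by that recipe.
import random

random.seed(42)  # for reproducibility

def X():
    """The random variable: one throw of a fair six-sided die."""
    return random.randint(1, 6)

x = X()   # a realization: one specific observed value
print(x)  # some integer in {1, ..., 6}
```

Calling `X()` again would generally give a different realization; the variable $X$ itself never changes.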

You can throw a die $n$ times and treat the results as $n$ random variables $X_1,X_2,\dots,X_n$ with $n$ corresponding observed realizations. $E[X_1]$ would be the expected value of the random variable for the result of the first throw $X_1$, whereas the realization $x_1$ would be a number, for example, $3$. So

$$ \bar X = \frac{1}{n} \sum_{i=1}^n X_i $$

is a random variable, as a function of $n$ random variables, and in

$$ \bar x = \frac{1}{n} \sum_{i=1}^n x_i $$

$\bar x$ is a realization of $\bar X$, calculated from the realizations $x_i$ of the random variables $X_i$.
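Continuing the die example, here is a hedged sketch of that distinction: $\bar X$ is a random variable (a function of $X_1,\dots,X_n$), and the number the code prints is one realization $\bar x$ of it. By the law of large numbers, for large $n$ this realization should land close to the population mean $\mu = E[X] = 3.5$.

```python
import random

random.seed(0)
n = 10_000

def sample_mean_realization(n):
    """Compute one realization x-bar of the random variable X-bar
    for n throws of a fair die."""
    xs = [random.randint(1, 6) for _ in range(n)]  # realizations x_1, ..., x_n
    return sum(xs) / n                             # x-bar, a plain number

xbar = sample_mean_realization(n)
print(xbar)  # close to mu = 3.5, but not exactly 3.5
```

Rerunning with a different seed gives a different realization $\bar x$, which is exactly what it means for $\bar X$ to be a random variable.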

A random vector $\mathbf{X} = (X_1,X_2,\dots,X_n)$ is just a shorthand that saves writing them all out each time.
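As a small illustration (again using dice purely as a stand-in), one draw of a random vector yields a tuple of realizations, one per component random variable:

```python
import random

random.seed(1)
n = 5

def draw_random_vector(n):
    """One realization (x_1, ..., x_n) of the random vector
    X = (X_1, ..., X_n), each component a fair die."""
    return tuple(random.randint(1, 6) for _ in range(n))

x_vec = draw_random_vector(n)
print(x_vec)  # a 5-tuple of integers in {1, ..., 6}
```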

Finally, you will see different notations used by different authors and in different contexts, so each time you need to check what is being described rather than assuming it from notation alone.

You should probably refresh your knowledge of random variables to make things clearer. Given the number of issues you mention, I'd also recommend a probability and statistics handbook or lectures.
