What exactly does it mean if a random variable follows a distribution?

distributions, normal distribution, random variable, regression

Imagine there's a random variable such as $ε$. Then we say that $ε$ is i.i.d. and follows a normal distribution with mean $0$ and variance $σ^2$.

What does this mean? Is this not a variable anymore? Is this a function now? I see this in most books, but I'm still unclear about what exactly it means or what it does.

In terms of regression, I know this variable basically represents the random errors, but what does it mean if this vector of random errors follows a normal distribution?

Best Answer

A random variable $\varepsilon \sim \mathrm{N}(0,\sigma^2)$ is not the kind of variable you meet as a function argument or as the unknown in an equation; it represents the outcome of a random experiment. (Stated rigorously, though this matters less here: it is a function mapping from a sample space into the space in which the random variable takes its values.)
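To make the "function on a sample space" remark concrete, here is a minimal Python sketch. It deliberately uses a hypothetical discrete example (the sum of two dice) rather than the normal distribution, because a finite sample space can be listed explicitly; the names `sample_space`, `X`, and `prob` are illustrative assumptions, not anything from the question.

```python
import random

# Hypothetical sample space: the 36 equally likely outcomes of rolling two dice.
sample_space = [(i, j) for i in range(1, 7) for j in range(1, 7)]

def X(omega):
    """A random variable is just a function from outcomes to numbers."""
    return omega[0] + omega[1]

def prob(event):
    """P(X in event) under the uniform probability measure on sample_space."""
    favourable = sum(1 for omega in sample_space if X(omega) in event)
    return favourable / len(sample_space)

rng = random.Random(0)
omega = rng.choice(sample_space)  # run the random experiment once
print(omega, X(omega))            # the outcome and the realized value of X
print(prob({7}))                  # the distribution of X evaluated at the event {7}
```

The randomness lives entirely in which outcome `omega` the experiment produces; `X` itself is an ordinary deterministic function, and "the distribution of `X`" is the rule that `prob` computes.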

How can this be understood? A probability measure such as $\mathrm{N}(0,\sigma^2)$ assigns values to sets, the so-called events. In this case, the probability that $\varepsilon$ ends up in a set $A$ is $$ \mathrm{N}(0,\sigma^2)(A) = \int_A \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{x^2}{2\sigma^2} \right) \mathrm{d}x. $$ That means that if you repeatedly observed i.i.d. (independent and identically distributed) copies of $\varepsilon$, then in the large-data limit the fraction of them ending up in $A$ would be $\mathrm{N}(0,\sigma^2)(A)$, that is, they would land in $A$ $\mathrm{N}(0,\sigma^2)(A)\cdot 100 \%$ of the time.
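The frequency interpretation can be checked numerically. The following is a rough sketch using only the Python standard library; the choices $\sigma = 2$ and $A = [-1, 3]$ are arbitrary assumptions made for illustration, and the helper names are hypothetical.

```python
import math
import random

def normal_cdf(x, sigma):
    """CDF of N(0, sigma^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def prob_interval(a, b, sigma):
    """Exact value of N(0, sigma^2)(A) for the interval A = [a, b]."""
    return normal_cdf(b, sigma) - normal_cdf(a, sigma)

def empirical_frequency(a, b, sigma, n, seed=0):
    """Fraction of n i.i.d. draws from N(0, sigma^2) that land in [a, b]."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if a <= rng.gauss(0.0, sigma) <= b)
    return hits / n

sigma = 2.0        # assumed standard deviation (illustrative)
a, b = -1.0, 3.0   # assumed event A = [a, b] (illustrative)
exact = prob_interval(a, b, sigma)

# The empirical fraction should approach the exact probability as n grows.
for n in (100, 10_000, 200_000):
    print(n, empirical_frequency(a, b, sigma, n), exact)
```

Each draw is one realization of $\varepsilon$; no single draw "is" the distribution, but the long-run fraction of draws falling in $A$ settles near the integral above.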