[Math] Assumption of a Random error term in a regression

normal distribution, probability distributions, random variables, regression, statistics

In one of my recent statistics courses, our teacher introduced the linear regression model: the typical $y = \alpha + \beta X + \epsilon$, where $\epsilon$ is a "random" error term. The teacher then explained that this error term is normally distributed and has mean zero.

The error term is what is confusing me. What exactly does "random" mean? My background in statistics is fairly basic, but I understand that a random variable is defined as a mapping from a sample space to the real numbers. That definition makes sense; it's the assumption of a zero mean that trips me up. How can we just assume this?

I've been trying to think about it intuitively, and can only think that, with respect to the real numbers, zero is in a sense "the middle ground" and splits the reals into two "equal length" parts. However, I know that the reals are uncountable, so this may be a case where my intuition is incorrect.

I apologize in advance if my question is confusing.

Best Answer

Here's the general idea - someone who has a better background than I do in statistics could probably give a better explanation. So you have this linear regression model: $$Y = \alpha + \beta X + \epsilon $$ where $\epsilon$ follows a normal distribution with mean $0$.

What exactly does "random" mean? My background in statistics is fairly basic, but I understand that a random variable is defined as a mapping from a sample space to the real numbers. That definition makes sense; it's the assumption of a zero mean that trips me up. How can we just assume this?

Personally, I've always taken the idea that $\epsilon$ follows a normal distribution with mean $0$ as an axiom of sorts for the linear regression model. It's a convenient assumption that we would like the model to have, and it buys us nice properties (for instance, with normal errors the least-squares estimates coincide with the maximum-likelihood estimates, and exact confidence intervals and tests become available). Remember:

Essentially, all models are wrong, but some are useful.

which is attributed to George E.P. Box.

Why would we want such an axiom? Well... on average, we would like the error to be zero, so that the line $\alpha + \beta X$ describes the average value of $Y$ at each $X$.
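There is also a standard remark (my addition, not part of the original answer) showing that the zero-mean part costs nothing: if the error had some nonzero mean $c$, we could absorb $c$ into the intercept and be left with a new error term whose mean is zero, $$Y = \alpha + \beta X + \epsilon = (\alpha + c) + \beta X + (\epsilon - c) = \alpha' + \beta X + \epsilon', \qquad E(\epsilon') = E(\epsilon) - c = 0.$$ So the real content of the assumption is the normal shape of the errors, not the zero mean.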

In my honest opinion (this is based on the little measure-theoretic probability I have studied), it is best to approach this idea of "randomness" intuitively, as you would in an undergraduate probability course.

The idea behind anything that is random is that you will never know its value in advance. So, in an undergraduate probability class, what you do is assign probabilities to the values your quantity of interest can take by building a probabilistic model. Your model, 99% of the time, won't be perfect, but that doesn't stop anyone from trying.

The normal distribution with mean $0$ is just one example of a probabilistic model that statisticians find suitable for the error term. It isn't perfect, but it is good enough for most purposes. I worked with a professor whose focus was on assuming a skew-normal error term, which complicates things but is usually more realistic, since in reality not everything looks like a bell curve.
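To make the "probabilistic model" idea concrete, here is a minimal simulation sketch (my own illustration, not from the original answer; the parameter values and variable names are arbitrary): draw data from $Y = \alpha + \beta X + \epsilon$ with normal, mean-zero errors, fit a line by least squares, and check that the residuals average out to essentially zero.

```python
import numpy as np

# Illustrative sketch (hypothetical parameter values): simulate the model
# Y = alpha + beta*X + eps with eps ~ Normal(0, sigma^2), then fit by
# ordinary least squares and look at the average residual.
rng = np.random.default_rng(0)

alpha_true, beta_true, sigma = 2.0, 0.5, 1.0
n = 1_000

x = rng.uniform(0, 10, size=n)
eps = rng.normal(loc=0.0, scale=sigma, size=n)   # the "random" error term
y = alpha_true + beta_true * x + eps

# Least-squares fit of y on [1, x]
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
alpha_hat, beta_hat = coef

residuals = y - X @ coef
print(f"alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}")
print(f"mean residual = {residuals.mean():.2e}")  # essentially zero
```

Because the fit includes an intercept, the least-squares residuals average to zero by construction; the normality assumption is what lets you go further and attach exact probability statements to the estimates.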

My two cents. Hopefully I've helped somewhat.
