What does it mean for the error to have a Gaussian distribution?

linear model, mathematical-statistics, regression

Linear regression assumes that the errors ($\epsilon_i$) are random and independent, following a Gaussian distribution with a mean of zero.

The equation is given by $y_i = x_i + \epsilon_i$, for $i = 1, \dots, n$.

Does this mean that each error $\epsilon_i$ has a Gaussian distribution, and that the observed value of $\epsilon_i$ will be some single value drawn from that distribution (with values near zero being the most likely)?

Does it mean that each $\epsilon_i$ is a separate random variable?

Or does the collection of $\epsilon_i$ (for $i$ from 1 to $n$) come from a single random variable (a parent $\epsilon$ from which the $\epsilon_i$ are drawn), with that parent random variable having a Gaussian distribution?

I would appreciate any explanation. Thank you.

Best Answer

You can read the assumption as saying that each $\epsilon_i$ is its own random variable, drawn independently from a Gaussian distribution with mean zero and variance $\sigma_i^2 > 0$. That is, $\epsilon_i \sim N(0, \sigma_i^2)$ for $i = 1, \dots, n$.
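As a rough illustration of this first view, here is a minimal simulation sketch using only the Python standard library. The function name `simulate_errors`, the three standard deviations, the seed, and the number of repetitions are all assumptions chosen for the example; none of them come from the question. Each row is one hypothetical repetition of the experiment, and each column collects the realizations of a single $\epsilon_i$. In a real data set you only ever see one row.

```python
import random
import statistics


def simulate_errors(sigmas, n_reps, seed=0):
    """Draw n_reps realizations of the error vector (eps_1, ..., eps_n).

    Each eps_i is its own random variable, drawn independently from
    N(0, sigmas[i] ** 2). Note that random.gauss takes the standard
    deviation, not the variance.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, s) for s in sigmas] for _ in range(n_reps)]


# Hypothetical standard deviations sigma_1, sigma_2, sigma_3.
sigmas = [0.5, 1.0, 2.0]
draws = simulate_errors(sigmas, n_reps=20_000)

# Column i holds the values taken by eps_i across the repeated experiments.
columns = list(zip(*draws))
means = [statistics.fmean(col) for col in columns]
stds = [statistics.pstdev(col) for col in columns]

for i, (m, s) in enumerate(zip(means, stds), start=1):
    print(f"eps_{i}: sample mean = {m:+.3f}, sample std = {s:.3f}")
```

Across many repetitions, each column's sample mean should sit close to zero and its sample standard deviation close to the corresponding $\sigma_i$, which is what it means for each $\epsilon_i$ to be a separate Gaussian random variable.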

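Equivalently, you can view the random vector $\epsilon = [\epsilon_1, \dots, \epsilon_n]^T$ as following a multivariate Gaussian distribution whose mean is the zero vector and whose covariance matrix is a diagonal matrix $D$ with diagonal elements $\sigma_1^2, \sigma_2^2, \dots, \sigma_n^2$. That is, $\epsilon \sim MN(0, D)$. When all the variances are equal, $\sigma_i^2 = \sigma^2$, the $\epsilon_i$ are independent and identically distributed, which matches the picture of $n$ independent draws from a single parent $N(0, \sigma^2)$ distribution.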
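To see why the two descriptions agree, it may help to write out the joint density of $\epsilon$. This is a short sketch using the standard multivariate Gaussian density, with $f$ introduced here only as a name for that density. Because $D$ is diagonal, $|D| = \prod_{i=1}^n \sigma_i^2$ and $\epsilon^T D^{-1} \epsilon = \sum_{i=1}^n \epsilon_i^2 / \sigma_i^2$, so

$$
f(\epsilon)
= \frac{1}{(2\pi)^{n/2} |D|^{1/2}} \exp\left(-\tfrac{1}{2}\, \epsilon^T D^{-1} \epsilon\right)
= \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left(-\frac{\epsilon_i^2}{2\sigma_i^2}\right).
$$

The joint density factorizes into the product of the $n$ univariate $N(0, \sigma_i^2)$ densities, which is exactly the statement that the $\epsilon_i$ are independent Gaussian random variables.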