Solved – Meaning of standard error of the coefficients in a regression model

least squares, regression, standard error

Recall the model for simple linear regression
$$
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i.
$$

I am reading up on the standard errors of the coefficients $\beta_0$ and $\beta_1$. As an experiment I generated some linear data using $\beta_0 = 1$ and $\beta_1 = 2$ and added Gaussian noise with unit variance. When I then fit the data with the lm function and examined the model with the summary function, I got the following output:
\begin{align}
\hat \beta_0 & = 1.21054 \quad \text{with Std. Error} = 0.11508, \\
\hat \beta_1 & = 1.87723 \quad \text{with Std. Error} = 0.09844.
\end{align}

So how do I interpret the standard error values? For instance, take $\hat \beta_0$, precisely what is $0.11508$ telling me?

Obviously, if I reran the simulation with Gaussian noise of higher variance, the standard errors would increase, since the extra variance in the noise propagates into the sampling variance of the coefficient estimates. But if we consider the first simulation in isolation, what does this value of $0.11508$ mean?
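To make the setup concrete, here is a minimal Python sketch of the experiment described above (the question used R's lm; this version computes the textbook closed-form OLS estimates and standard errors for simple regression directly, and the numbers will differ from those quoted because the data are re-simulated):

```python
import math
import random

random.seed(0)

# Simulate data as in the question: beta0 = 1, beta1 = 2, unit-variance Gaussian noise.
n = 100
x = [random.uniform(0, 10) for _ in range(n)]
y = [1 + 2 * xi + random.gauss(0, 1) for xi in x]

# Ordinary least-squares estimates for y = b0 + b1*x.
xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar

# Residual variance estimate (n - 2 degrees of freedom: two coefficients fitted).
rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
s2 = rss / (n - 2)

# Standard errors: square roots of the estimated sampling variances.
se_b1 = math.sqrt(s2 / sxx)
se_b0 = math.sqrt(s2 * (1 / n + xbar ** 2 / sxx))

print(f"b0 = {b0:.5f}  Std. Error = {se_b0:.5f}")
print(f"b1 = {b1:.5f}  Std. Error = {se_b1:.5f}")
```

This reproduces the two columns of the R coefficient table: the estimate and, next to it, the standard error that the question asks about.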

Best Answer

The standard error is the square root of an estimate of the sampling variability of $\hat\beta_j$ as an estimator of $\beta_j$, or $\sqrt{\widehat{Var}(\hat\beta_j)}$.

Since that sentence packs several ideas into one, here it is step by step:

  1. "Square-root": should be self-explanatory; it turns a variance into a standard deviation (which turns out to be what we need in, for example, t-statistics and confidence intervals).
  2. "$\hat\beta_j$ as an estimator of $\beta_j$": we use the LS estimator to estimate the unknown parameter $\beta_j$.
  3. To do so, we make use of a sample from the underlying population. Had we drawn another sample (or were to draw a fresh one tomorrow, etc.) we would get another estimate $\hat\beta_j$. This is the source of sampling variability. We may summarize that variability through the variance, $Var(\hat\beta_j)$. An expression for this variance may be found, e.g., here.
  4. "An estimate of the sampling variability": $Var(\hat\beta_j)$ depends on unknown quantities (like the variance of the Gaussian noise that you generated), which must therefore be estimated, as captured by the formula $\widehat{Var}(\hat\beta_j)$. A formula for this estimator is, for example, given here, or, more introductory, here.
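The sampling variability in steps 3 and 4 can be made tangible by actually redrawing the sample many times. A sketch in Python (rather than the question's R), holding the design $x$ fixed across replications so that $Var(\hat\beta_1) = \sigma^2 / \sum_i (x_i - \bar x)^2$ exactly:

```python
import math
import random

random.seed(1)

# Fixed design: the same x values in every replication.
n = 50
x = [i / 5 for i in range(n)]
xbar = sum(x) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sigma = 1.0  # noise standard deviation used to generate the data

def fit_slope(y):
    """Closed-form OLS slope for simple linear regression."""
    ybar = sum(y) / n
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx

# Redraw the sample many times; each fresh sample yields a different estimate.
slopes = []
for _ in range(5000):
    y = [1 + 2 * xi + random.gauss(0, sigma) for xi in x]
    slopes.append(fit_slope(y))

mean_b1 = sum(slopes) / len(slopes)
sd_b1 = math.sqrt(sum((b - mean_b1) ** 2 for b in slopes) / (len(slopes) - 1))

# The empirical spread of the estimates matches the theoretical sampling SD,
# sqrt(sigma^2 / Sxx) -- the quantity the standard error estimates from one sample.
print(f"empirical SD of slope estimates: {sd_b1:.4f}")
print(f"theoretical sqrt(Var(b1)):       {math.sqrt(sigma**2 / sxx):.4f}")
```

In practice we only see one sample, so $\sigma^2$ is replaced by the residual-based estimate $s^2 = RSS/(n-2)$, which is exactly what the Std. Error column in the question's summary output reports.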