Regression – Why Use T Distribution for Hypothesis Testing a Linear Regression Coefficient?

hypothesis-testing, linear-model, regression, t-distribution

Using a t-test to check the significance of a linear regression coefficient is common practice, and the mechanics of the calculation make sense to me.

Why can the t-distribution be used to model the standard test statistic in linear regression hypothesis testing? The test statistic I am referring to is:

$$
T_{0} = \frac{\widehat{\beta} - \beta_{0}}{SE(\widehat{\beta})}
$$

Best Answer

To understand why we use the t-distribution, you need to know the underlying distributions of $\widehat{\beta}$ and of the residual sum of squares ($RSS$), as these two put together give you the t-distribution.

The easier part is the distribution of $\widehat{\beta}$, which is normal. To see this, note that $\widehat{\beta} = (X^{T}X)^{-1}X^{T}Y$, so it is a linear function of $Y$, where $Y\sim N(X\beta, \sigma^{2}I_{n})$. As a result, $\widehat{\beta}$ is also normally distributed: $\widehat{\beta} \sim N(\beta, \sigma^{2}(X^{T}X)^{-1})$ - let me know if you need help deriving this distribution.
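You can check this result empirically. The following is a minimal numpy sketch (the design matrix, true coefficients, and noise level are made up for illustration): it simulates many datasets from the model above and compares the sampling distribution of $\widehat{\beta}$ against $N(\beta, \sigma^{2}(X^{T}X)^{-1})$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one predictor
beta = np.array([1.0, 2.0])   # true coefficients (illustrative values)
sigma = 1.5                   # true error standard deviation

# Simulate many datasets with the same X and collect the OLS estimates
betas = []
for _ in range(5000):
    y = X @ beta + rng.normal(scale=sigma, size=n)
    betas.append(np.linalg.solve(X.T @ X, X.T @ y))  # (X'X)^{-1} X'y
betas = np.array(betas)

# Theoretical covariance of beta-hat: sigma^2 (X'X)^{-1}
cov_theory = sigma**2 * np.linalg.inv(X.T @ X)

print(betas.mean(axis=0))  # close to beta
print(np.cov(betas.T))     # close to cov_theory
```

The empirical mean and covariance of the estimates should agree closely with the theoretical values.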

Additionally, $RSS \sim \sigma^{2}\chi^{2}_{n-p}$, where $n$ is the number of observations and $p$ is the number of parameters used in your regression. The proof of this is a bit more involved but still straightforward (see the proof here: Why is RSS distributed chi square times n-p?).
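This claim can also be verified by simulation. Below is a small numpy sketch (the design matrix and parameters are again made up): it computes $RSS/\sigma^{2}$ over many simulated datasets and checks that its mean and variance match those of a $\chi^{2}_{n-p}$ distribution, namely $n-p$ and $2(n-p)$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 30, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([0.5, -1.0, 2.0])  # illustrative true coefficients
sigma = 2.0

# Hat matrix: fitted values are H @ y, residuals are (I - H) @ y
H = X @ np.linalg.inv(X.T @ X) @ X.T

rss_scaled = []
for _ in range(10000):
    y = X @ beta + rng.normal(scale=sigma, size=n)
    resid = y - H @ y
    rss_scaled.append(resid @ resid / sigma**2)
rss_scaled = np.array(rss_scaled)

# chi^2_{n-p} has mean n-p and variance 2(n-p)
print(rss_scaled.mean())  # close to n - p = 27
print(rss_scaled.var())   # close to 2(n - p) = 54
```

Note that $RSS/\sigma^2$ concentrates around $n-p$, not $n$, which is exactly why $s^2 = RSS/(n-p)$ rather than $RSS/n$ is the unbiased variance estimator.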

Up until this point I have considered everything in matrix/vector notation, but for simplicity let's consider a single coefficient $\widehat{\beta}_{i}$. Its normal distribution gives us: \begin{equation} \frac{\widehat{\beta}_{i}-\beta_{i}}{\sigma\sqrt{(X^{T}X)^{-1}_{ii}}} \sim N(0,1) \end{equation}

Additionally, define $s^{2}=\frac{RSS}{n-p}$, which is an unbiased estimator of $\sigma^{2}$. The chi-squared distribution of $RSS$ can then be rewritten as: \begin{equation} \frac{(n-p)s^{2}}{\sigma^{2}} \sim \chi^{2}_{n-p} \end{equation}

This is simply a rearrangement of the chi-squared expression above, and it is independent of the $N(0,1)$ statistic. By the definition of the $t_{n-p}$ distribution, dividing a standard normal by the square root of an independent chi-squared over its degrees of freedom gives you a t-distribution (for the proof see: A normal divided by the $\sqrt{\chi^2(s)/s}$ gives you a t-distribution -- proof), so you get that:

\begin{equation} \frac{\widehat{\beta}_{i}-\beta_{i}}{s\sqrt{(X^{T}X)^{-1}_{ii}}} \sim t_{n-p} \end{equation}

Where $s\sqrt{(X^{T}X)^{-1}_{ii}}=SE(\widehat{\beta}_{i})$.
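Putting the pieces together, here is a numpy sketch that computes the test statistic step by step on one simulated dataset (the data and coefficients are made up for illustration), following exactly the quantities in the derivation: $\widehat{\beta}$, $RSS$, $s^{2}$, $SE(\widehat{\beta}_{i})$, and $T_{0}$ under $H_{0}: \beta_{i}=0$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n)  # illustrative data

p = X.shape[1]
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y                 # OLS estimate
rss = np.sum((y - X @ beta_hat) ** 2)        # residual sum of squares
s2 = rss / (n - p)                           # unbiased estimate of sigma^2
se = np.sqrt(s2 * np.diag(XtX_inv))          # SE(beta_hat_i) = s * sqrt((X'X)^{-1}_{ii})

t0 = beta_hat / se                           # T_0 under H0: beta_i = 0
print(beta_hat, se, t0)
```

Under $H_{0}$, each entry of `t0` follows a $t_{n-p}$ distribution with $n-p = 18$ degrees of freedom, so the p-value comes from that distribution rather than the standard normal.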

Let me know if it makes sense.
