Solved – Prove that the variance of the ridge regression estimator is less than the variance of the OLS estimator

bias, least squares, ridge regression, self-study, variance

Consider the following linear model under classical Gauss-Markov assumptions:
$$Y = X\beta + e$$
where $\mathbb{E}[X'e] = 0$.

Consider the following estimator
$$\tilde\beta = \left(\sum_{i=1}^{N}x_ix_i' + \lambda I_k\right)^{-1}\left(\sum_{i=1}^{N}x_iy_i\right)$$
where $x_i$ is the $k\times 1$ column vector given by the $i$-th row of $X$, $\lambda > 0$ is a scalar, and $\mathbb{E}(x_ie_i) = 0$.

  1. Define bias and show that $\tilde\beta$ is biased.
  2. Define consistency and show that $\tilde\beta$ is consistent.
  3. Define the conditional variance of $\tilde\beta$. Show that the conditional variance of $\tilde\beta$ is smaller than the conditional variance of the OLS estimator $\hat\beta$.
  4. Give two reasons why we might prefer using $\tilde\beta$ instead of $\hat\beta$. (Hint: think of collinearity.)
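
For concreteness, here is a minimal numpy sketch of $\tilde\beta$ next to the OLS estimator $\hat\beta$; the data, dimensions, and `lam` (standing in for $\lambda$) are made up, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N, k = 100, 3
X = rng.normal(size=(N, k))            # rows are the x_i'
beta = np.array([1.0, -2.0, 0.5])
y = X @ beta + rng.normal(size=N)      # y_i = x_i' beta + e_i

lam = 5.0
# Ridge estimator: (X'X + lambda I)^{-1} X'y
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ y)
# OLS estimator: (X'X)^{-1} X'y
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_ridge, beta_ols)
```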

The first two questions are already answered (with the help of Cross Validated).

Define $\left(\sum_{i=1}^{N}x_ix_i' + \lambda I_k\right)^{-1} = (X'X + \lambda I)^{-1} = W$. Also note that under homoskedasticity $Var(\hat\beta\mid X) = \sigma^2(X'X)^{-1}$.
For the third one I have
\begin{equation}
\begin{aligned}
Var(\tilde\beta|X) &= Var(WX'Y|X) \\
& = WX'Var(Y|X)XW \\
& = WX'Var(X\beta + e|X)XW \\
& = WX'Var(e|X)XW \\
\text{(assuming homoskedasticity)} & = WX'(\sigma^2 I)XW \\
& = \sigma^2WX'XW
\end{aligned}
\end{equation}
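
As a quick sanity check on this formula (simulated data, arbitrary $\lambda$ and $\sigma$; only meant to illustrate, not prove anything), the empirical covariance of $\tilde\beta$ over repeated draws of the errors with $X$ held fixed should match $\sigma^2WX'XW$:

```python
import numpy as np

rng = np.random.default_rng(1)
N, k, lam, sigma = 50, 3, 5.0, 1.0
X = rng.normal(size=(N, k))                      # held fixed: we condition on X
beta = np.array([1.0, -2.0, 0.5])
W = np.linalg.inv(X.T @ X + lam * np.eye(k))     # W = (X'X + lambda I)^{-1}

R = 100_000                                      # number of simulated samples
E = sigma * rng.normal(size=(N, R))              # columns are independent error draws
Y = (X @ beta)[:, None] + E                      # each column is one draw of y
B = W @ X.T @ Y                                  # each column is one draw of beta_tilde

print(np.cov(B))                                 # empirical Var(beta_tilde | X)
print(sigma**2 * W @ X.T @ X @ W)                # theoretical sigma^2 W X'X W
```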

Now, to finish question 3, I need to show that $(X'X)^{-1} - WX'XW$ is positive semidefinite. This is where I am stuck. I also have no idea about question 4.
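
(A numerical check of exactly this claim, on simulated data with arbitrary dimensions and $\lambda$, suggests it does hold: the eigenvalues of the difference come out nonnegative.)

```python
import numpy as np

rng = np.random.default_rng(2)
N, k, lam = 100, 3, 5.0
X = rng.normal(size=(N, k))

S = X.T @ X                                  # X'X
W = np.linalg.inv(S + lam * np.eye(k))       # (X'X + lambda I)^{-1}

diff = np.linalg.inv(S) - W @ S @ W          # (X'X)^{-1} - W X'X W
print(np.linalg.eigvalsh(diff))              # all eigenvalues should be >= 0
```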


EDIT: please note that this is a question from last year's exam, which almost surely means it can be solved using basic matrix algebra rather than more advanced techniques like the SVD.

Best Answer

Following cardinal's hint: we want to show that $(X'X)^{-1} - WX'XW$ is psd. Denote $X'X = S$, so that $W^{-1} = S + \lambda I$. Then, since $W$ is symmetric,
$$S^{-1} - WSW = WW^{-1}S^{-1}W^{-1}W - WSW = W(W^{-1}S^{-1}W^{-1} - S)W.$$
Take the expression in the brackets and simplify:
\begin{equation}
\begin{aligned}
W^{-1}S^{-1}W^{-1} - S &= (S+\lambda I)S^{-1}(S+\lambda I) - S \\
& = SS^{-1}S + SS^{-1}\lambda + \lambda S^{-1}S + \lambda^2S^{-1} - S \\
& = 2\lambda I + \lambda^2S^{-1} \\
& = \lambda(2I + \lambda S^{-1}).
\end{aligned}
\end{equation}
Since $S^{-1}$ is psd, the whole expression is a psd matrix.

The original expression can then be written as $$W(W^{-1}S^{-1}W^{-1} - S)W = \lambda W(2I + \lambda S^{-1})W,$$ which must be psd since $W$ is symmetric and the expression in the brackets is a positive semidefinite matrix.
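
A quick numerical check of this identity (again with simulated data and an arbitrary $\lambda$, purely as a sanity check) confirms both the factorization and the positive semidefiniteness:

```python
import numpy as np

rng = np.random.default_rng(3)
N, k, lam = 100, 3, 5.0
X = rng.normal(size=(N, k))

S = X.T @ X
W = np.linalg.inv(S + lam * np.eye(k))
I = np.eye(k)

lhs = np.linalg.inv(S) - W @ S @ W                    # (X'X)^{-1} - W X'X W
rhs = lam * W @ (2 * I + lam * np.linalg.inv(S)) @ W  # lambda W (2I + lambda S^{-1}) W

print(np.allclose(lhs, rhs))                          # the identity holds
print(np.linalg.eigvalsh(lhs))                        # nonnegative eigenvalues -> psd
```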
