Variance of ridge regression estimator


These are the facts as I know them. The ridge regression estimator, $\hat{\beta}_R$, is given as

\begin{equation}
\hat{\beta}_R = \left(X'X + kI \right)^{-1}X'y, \ k \geq 0
\end{equation}

and the variance of $\hat{\beta}_R$ is given as

\begin{equation}
\text{Var} \left( \hat{\beta}_R \right) = \sigma^2 \left(X'X + kI \right)^{-1}X'X\left(X'X + kI \right)^{-1}
\end{equation}
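
For concreteness, here is a minimal numpy sketch of these two formulas on simulated data (the design, noise level $\sigma$, and variable names below are my own assumptions for illustration, not part of the question):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k, sigma = 100, 5, 2.0, 1.0           # sample size, predictors, ridge penalty, noise sd

X = rng.standard_normal((n, p))             # simulated design matrix
beta = rng.standard_normal(p)               # true coefficients (simulation only)
y = X @ beta + sigma * rng.standard_normal(n)

XtX = X.T @ X
A_inv = np.linalg.inv(XtX + k * np.eye(p))  # (X'X + kI)^{-1}

beta_ridge = A_inv @ X.T @ y                # ridge estimator
var_ridge = sigma**2 * A_inv @ XtX @ A_inv  # sandwich covariance matrix
```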

I have seen that the following is true:

\begin{equation}
\sum_{i=1}^{p} \text{Var} \left( \hat{\beta}_{R,i} \right) = \sigma^2 \sum_{i=1}^{p} \frac{\lambda_i}{\left(\lambda_i + k \right)^2}
\end{equation}

where $\lambda_i$ are the eigenvalues of $X'X$.

This was presented without any justification. My question is: is this true? If so, how would I show it?

Best Answer

Let $U\Lambda U'=X'X$ be the eigendecomposition of $X'X$, where $\Lambda$ is diagonal and $U$ is orthogonal. Then $X'X + kI = U(\Lambda + kI)U'$, and thus $$\sigma^2 (X'X + kI)^{-1} X'X (X'X + kI)^{-1} = \sigma^2 U(\Lambda + kI)^{-1} \Lambda (\Lambda + kI)^{-1} U',$$ where we have used the fact that $U$ is orthogonal.
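
A quick numerical check of this matrix identity (a sketch with a random design; `np.linalg.eigh` computes the eigendecomposition of the symmetric matrix $X'X$):

```python
import numpy as np

rng = np.random.default_rng(1)
p, k, sigma = 5, 2.0, 1.0
X = rng.standard_normal((100, p))
XtX = X.T @ X

lam, U = np.linalg.eigh(XtX)                # X'X = U diag(lam) U'

A_inv = np.linalg.inv(XtX + k * np.eye(p))
cov = sigma**2 * A_inv @ XtX @ A_inv        # Var(beta_ridge) computed directly

# same matrix assembled from the eigendecomposition
cov_eig = sigma**2 * U @ np.diag(lam / (lam + k)**2) @ U.T

assert np.allclose(cov, cov_eig)
```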

$\sum_i \text{Var}((\hat{\beta}_R)_i)$ is the trace of the covariance matrix $\text{Var}(\hat{\beta}_R)$. Using the linearity and cyclic properties of trace along with orthogonality of $U$, $$\operatorname{Tr} (\sigma^2 U(\Lambda + kI)^{-1} \Lambda (\Lambda + kI)^{-1} U') = \sigma^2 \operatorname{Tr}((\Lambda + kI)^{-1} \Lambda (\Lambda + kI)^{-1}).$$ The expression inside the trace is a diagonal matrix with entries $\frac{\lambda_i}{(\lambda_i+k)^2}$, so its trace is $\sum_{i=1}^{p} \frac{\lambda_i}{(\lambda_i+k)^2}$, which gives exactly the claimed identity.
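
And the trace claim itself, checked the same way (again only a sketch on simulated data):

```python
import numpy as np

rng = np.random.default_rng(2)
p, k, sigma = 5, 2.0, 1.0
X = rng.standard_normal((100, p))
XtX = X.T @ X
lam = np.linalg.eigvalsh(XtX)               # eigenvalues of X'X

A_inv = np.linalg.inv(XtX + k * np.eye(p))
cov = sigma**2 * A_inv @ XtX @ A_inv

# trace of the covariance matrix equals the eigenvalue sum
assert np.isclose(np.trace(cov), sigma**2 * np.sum(lam / (lam + k)**2))
```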
