I came across the term "multicollinearity" in statistics, particularly in regression analysis. However, I never really understood mathematically why highly correlated (almost linearly dependent) columns in the design matrix $X$ lead to higher variance of the regression coefficients, given that $$\operatorname{Var}(\hat{\beta}) = \sigma^2 (X^TX)^{-1}.$$
Can someone explain the mathematical idea behind this, please?
Best Answer
Let's start with your formula:
$\operatorname{Var}(\hat{\beta})=\sigma^2(X^TX)^{-1}$
First, note that if the columns of $X$ are linearly dependent, then $X^TX$ will not be invertible and will have a determinant of 0. And, while I won't prove this next part, if the columns of $X$ are "close" to being linearly dependent, then the determinant of $X^TX$ will be "close" to 0.
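You can see this numerically; here is a small sketch (the variable names and setup are my own, not from the answer) in which one column is a noisy copy of another, and shrinking the noise drives $\det(X^TX)$ toward 0:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
noise = rng.normal(size=n)

for eps in [1.0, 0.1, 0.01]:
    # Smaller eps makes x2 more nearly collinear with x1.
    x2 = x1 + eps * noise
    X = np.column_stack([x1, x2])
    # det(X^T X) collapses toward 0 as the columns align.
    print(f"eps={eps}: det(X^T X) = {np.linalg.det(X.T @ X):.4f}")
```

Each factor-of-10 drop in `eps` shrinks the determinant by roughly a factor of 100, since the determinant is proportional to the squared length of the part of `x2` orthogonal to `x1`.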
Now, the inverse of a matrix $A$ involves the reciprocal of its determinant:
$A^{-1} = \frac{1}{\det(A)}\operatorname{adj}(A)$
where $\operatorname{adj}(A)$ is the adjugate of $A$.
Therefore, as the columns of $X$ become more nearly linearly dependent, the determinant of $X^TX$ becomes closer to 0, which means the elements of $(X^TX)^{-1}$, and with them the variances of the coefficients, get larger.
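Putting the pieces together, here is a hedged numerical sketch (my own construction, with $\sigma^2$ fixed at 1 for simplicity): the diagonal of $(X^TX)^{-1}$ gives the coefficient variances, and they blow up as the two columns become more collinear.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
noise = rng.normal(size=n)

for eps in [1.0, 0.1, 0.01]:
    # Smaller eps => stronger multicollinearity between x1 and x2.
    x2 = x1 + eps * noise
    X = np.column_stack([x1, x2])
    # With sigma^2 = 1, Var(beta_hat) is just (X^T X)^{-1};
    # its diagonal holds the individual coefficient variances.
    var_beta = np.linalg.inv(X.T @ X)
    print(f"eps={eps}: Var(b1)={var_beta[0, 0]:.4f}, Var(b2)={var_beta[1, 1]:.4f}")
```

The determinant in the denominator shrinks like $\varepsilon^2$ while the adjugate's diagonal entries stay roughly constant, so both variances grow without bound as the collinearity tightens.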