Solved – Correlation between parameter estimates in regression analysis

covariance, regression

I wonder what causes the correlation between parameter estimates in regression analysis.

Does misspecification of the model cause such a correlation? Or does multicollinearity cause it?

When such correlations between parameter estimates exist, do we need to look for the reason and rework the model to reduce them?

I ask this in the sense of simple regression. But in practice, I work with a logit model with categorical dependent variables. And in my situation there is strong correlation between the parameter estimates of the dummy variables (the levels of a categorical variable).

I will be very glad for any help. Thanks a lot.

Best Answer

The source of correlation between parameter estimates is the finite size of the sample, that is, the finite number of rows in the design matrix.

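Consider the OLS parameter covariance matrix estimate: $$\operatorname{Var}[\, \hat\beta \mid X \,] = \sigma^2(X^T X)^{-1}$$ The off-diagonal elements of this matrix are the covariances between the parameter estimates, and they are generally nonzero when the columns of the design matrix are correlated. Design matrix columns are usually correlated. That's normal, and there's nothing wrong with it at all.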
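To make the formula concrete, here is a minimal NumPy sketch that builds $\sigma^2(X^T X)^{-1}$ for a simulated design matrix and rescales it into a correlation matrix. The function name `param_correlation`, the simulated regressors `x1` and `x2`, and the choice $\sigma^2 = 1$ are all assumptions made for illustration, not part of the original question.

```python
import numpy as np

def param_correlation(X, sigma2=1.0):
    """Return (covariance, correlation) of OLS estimates from sigma^2 (X^T X)^{-1}."""
    cov = sigma2 * np.linalg.inv(X.T @ X)
    sd = np.sqrt(np.diag(cov))          # standard errors of the estimates
    corr = cov / np.outer(sd, sd)       # rescale covariance to correlation
    return cov, corr

# Simulated data (assumption): two positively correlated regressors plus an intercept.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])

cov, corr = param_correlation(X)
print(np.round(corr, 3))
```

With an intercept and two regressors, the correlation between the two slope estimates equals minus the sample correlation between the regressors, so positively correlated columns give negatively correlated estimates here.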

The formula is for a finite sample, which is a very important consideration. Why? Because if you collect an infinite number of observations, the diagonal elements of $X^T X$ become infinite, making the uncertainty disappear completely. When there's no uncertainty, the question of correlations is moot.
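The shrinking uncertainty can be checked numerically. The sketch below, under the same simulated setup as an assumption (the helper `estimate_variances`, the regressors, and $\sigma^2 = 1$ are hypothetical), prints the diagonal of $\sigma^2(X^T X)^{-1}$ for increasing sample sizes.

```python
import numpy as np

rng = np.random.default_rng(1)

def estimate_variances(n, sigma2=1.0):
    """Diagonal of sigma^2 (X^T X)^{-1} for a simulated design with n rows."""
    x1 = rng.normal(size=n)
    x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)   # correlated with x1
    X = np.column_stack([np.ones(n), x1, x2])
    return np.diag(sigma2 * np.linalg.inv(X.T @ X))

# Variances of the intercept and slope estimates at growing sample sizes.
for n in (100, 10_000, 1_000_000):
    print(n, estimate_variances(n))
```

The variances fall roughly in proportion to $1/n$, which is the sense in which the uncertainty vanishes as the sample grows.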

Of course, if the design matrix columns are uncorrelated, the parameter estimates will be uncorrelated too, but this is truly a rare situation. Almost all design matrices have correlated variables.
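As an illustration of that rare case, here is a small sketch with a balanced two-factor design coded as $\pm 1$, a hypothetical example chosen because its columns are exactly orthogonal. Taking "uncorrelated" to mean orthogonal columns, $X^T X$ is diagonal, so its inverse is diagonal as well and every covariance between estimates is zero.

```python
import numpy as np

# Hypothetical balanced 2x2 factorial design: intercept plus two +/-1 coded factors.
X = np.array([
    [1, -1, -1],
    [1, -1,  1],
    [1,  1, -1],
    [1,  1,  1],
], dtype=float)

XtX = X.T @ X               # diagonal because the columns are orthogonal
cov = np.linalg.inv(XtX)    # parameter covariance with sigma^2 = 1 (assumption)

print(XtX)
print(cov)
```

Designed experiments can achieve this by construction, while observational data almost never do.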