Solved – Principal Components Regression: Transform 95% CIs back to original space

I have a set of predictors that clearly suffer from some amount of multicollinearity, so I am using PCA to make the columns of X orthogonal. I am also using this as a way to regularize the subsequent regression by removing components that account for ~0% of the variance.

For example, if I run OLS regression on PCA-transformed data with 8 retained components, I am then able to use the eigenvectors from the original PCA transformation to get the beta weights for the original 12 predictors. So far, so good.

However, to be able to evaluate the contributions of these predictors to the model fit, I'd like to transform the 95% confidence intervals back into the original space of the 12 predictors. That way, I can use the overall R^2 and associated p-values for the full model to find regressions that are significant, where specific predictors have non-zero contributions.

It is unclear to me how to transform the 95% confidence intervals. If that's not possible, is there another way to evaluate the significance of specific predictors in the original space?

Thanks

Best Answer

Do I understand correctly:

PCA calculates $p$ scores $\mathbf T^{(n \times p)}$ from data $\mathbf X^{(n \times m)}$ using the transpose of the loadings $\mathbf P^{(p \times m)}$:

$\mathbf T = \mathbf X \mathbf P^T$

then the OLS calculates some $Y^{(n \times 1)}$ using coefficients $\beta^{(p \times 1)}$:

$Y = \mathbf T \beta$, together:

$Y = \mathbf X \mathbf P^T \beta = \mathbf X \mathbf B$ with

$\mathbf B^{(m \times 1)} = \mathbf P^T \beta$

And now you want to have some indication of the variance of $\mathbf B$?
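To make the notation above concrete, here is a minimal NumPy sketch of the back-transformation $\mathbf B = \mathbf P^T \beta$. The synthetic data, the sizes (`n`, `m`, `p`) and the use of an SVD for the PCA step are my assumptions for illustration; they are not taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, p = 200, 12, 8  # samples, original predictors, retained PCs (illustrative values)

# Synthetic collinear predictors: 12 columns driven by 8 latent factors plus small noise
latent = rng.normal(size=(n, p))
X = latent @ rng.normal(size=(p, m)) + 0.01 * rng.normal(size=(n, m))
y = X @ rng.normal(size=m) + rng.normal(size=n)

# Center X and y so that no intercept term is needed
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# PCA via SVD: the rows of Vt are the principal axes; keep the first p as loadings P (p x m)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
P = Vt[:p]

T = Xc @ P.T                                    # scores, shape (n, p)
beta, *_ = np.linalg.lstsq(T, yc, rcond=None)   # OLS on the scores, shape (p,)
B = P.T @ beta                                  # coefficients in the original space, shape (m,)

# Predictions agree whether computed from the scores or from the original predictors
assert np.allclose(T @ beta, Xc @ B)
```

Note that `B` has 12 entries even though only 8 coefficients were estimated, so its entries are linearly dependent by construction.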

First of all, in order to get confidence intervals for $\mathbf B$ you need to consider both the PCA and the regression.

Calculating confidence intervals for $\beta$ alone doesn't make sense: the PCA solution is not unique, i.e. the sign of any axis can flip without notice. In addition, for your PCR model the retained PCs can be rotated within their $p$-dimensional space without affecting the predictions, as long as $\beta$ changes accordingly.
I suspect that not taking care of these equivalence rules (= restrictions/constraints) is what causes the $\pm \infty$ range in @whuber's comment.
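The non-uniqueness can be checked numerically. The sketch below (random data and a helper name, `back_transformed`, that I made up for illustration) flips one axis and then applies an arbitrary orthogonal rotation to the retained loadings: $\beta$ changes in both cases, while $\mathbf B$ stays the same.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, p = 100, 6, 3  # illustrative sizes
X = rng.normal(size=(n, m))
X -= X.mean(axis=0)
y = rng.normal(size=n)
y -= y.mean()

# Loadings from an SVD-based PCA, keeping the first p axes
_, _, Vt = np.linalg.svd(X, full_matrices=False)
P = Vt[:p]

def back_transformed(P, X, y):
    """OLS on the scores X P^T; returns (beta, B) with B = P^T beta."""
    T = X @ P.T
    beta, *_ = np.linalg.lstsq(T, y, rcond=None)
    return beta, P.T @ beta

beta, B = back_transformed(P, X, y)

# 1) Flip the sign of the first principal axis
P_flip = P.copy()
P_flip[0] *= -1
beta_flip, B_flip = back_transformed(P_flip, X, y)

# 2) Rotate the retained axes by a random orthogonal matrix Q
Q, _ = np.linalg.qr(rng.normal(size=(p, p)))
P_rot = Q @ P
beta_rot, B_rot = back_transformed(P_rot, X, y)

print(np.allclose(beta_flip[0], -beta[0]))               # the flipped coefficient changes sign
print(np.allclose(B, B_flip), np.allclose(B, B_rot))     # B is unaffected in both cases
```

So an interval built for a single entry of $\beta$ refers to an axis whose sign and orientation are arbitrary, which is why the interval has to be constructed for $\mathbf B$.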

I think of this as: what happens to my model if I acquire a new data set and fit a new model. The models can be equivalent (having the same $\mathbf B$), but different loadings $\mathbf P$ and regression coefficients $\beta$.

Now I have no idea how to get confidence intervals for the PCA, nor how to combine the two sources of uncertainty given the equivalence constraints. I usually go a much easier way:

I bootstrap $\mathbf B$ during a resampling (out-of-bootstrap) validation.

(So far I don't need confidence intervals for $\mathbf B$, for my purposes the distribution of observed $\mathbf B$s over the bootstrapping is good enough - I need "hard numbers" only on the predictive power)
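A rough sketch of what such a bootstrap could look like, assuming NumPy, synthetic data, and two hypothetical helpers (`fit_pcr`, `bootstrap_B`) written for this illustration. PCA and regression are both refitted on every resample, so the spread of $\mathbf B$ reflects both steps. The percentile interval at the end is one common way to summarize the distribution; it is my addition, not something I compute in practice.

```python
import numpy as np

def fit_pcr(X, y, p):
    """Fit PCA (via SVD) plus OLS on p scores; return B in the original space and an intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:p]                                            # loadings, shape (p, m)
    beta, *_ = np.linalg.lstsq(Xc @ P.T, y - y_mean, rcond=None)
    B = P.T @ beta
    return B, y_mean - x_mean @ B

def bootstrap_B(X, y, p, n_boot=500, seed=0):
    """Refit the whole PCR model on each bootstrap sample; score it on the left-out rows."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Bs, rmse = [], []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)                  # rows drawn with replacement
        oob = np.setdiff1d(np.arange(n), idx)             # out-of-bootstrap rows
        B, b0 = fit_pcr(X[idx], y[idx], p)
        Bs.append(B)
        if oob.size:
            resid = y[oob] - (X[oob] @ B + b0)
            rmse.append(np.sqrt(np.mean(resid ** 2)))
    return np.array(Bs), np.array(rmse)

# Synthetic example with 12 collinear predictors and 8 retained PCs
rng = np.random.default_rng(42)
n, m, p = 150, 12, 8
latent = rng.normal(size=(n, p))
X = latent @ rng.normal(size=(p, m)) + 0.01 * rng.normal(size=(n, m))
y = X @ rng.normal(size=m) + rng.normal(size=n)

Bs, rmse = bootstrap_B(X, y, p, n_boot=200)
lo, hi = np.percentile(Bs, [2.5, 97.5], axis=0)           # percentile interval per predictor
```

Because $\mathbf B$ is invariant to axis flips and rotations of the retained PCs, the bootstrap replicates can be compared directly, which would not be true for $\beta$.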
