Time Series – Understanding PC-Vector Autoregression (PC-VAR)

estimation, forecasting, pca, time series, vector-autoregression

When using a PC-VAR model for forecasting purposes, can we define it in the following manner?

$$Y_t = \phi_0 + \Phi Y_{t-1} + \alpha_t,$$

where $\phi_0$ denotes a $k$-dimensional vector of intercepts, $\Phi$ is a $k \times k$ matrix of coefficients on the lagged PCs and the lagged response variable, and $\{\alpha_t\}$ is a sequence of serially uncorrelated innovation vectors with mean $0$ and covariance matrix $\Sigma$.

The dimension of the predictor variables is first reduced through PCA, and the first three PCs are then used to fit the model that forecasts the response variable.
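A minimal numpy-only sketch of this two-step procedure, using simulated data and a plain least-squares fit of the VAR(1)-style regression (the variable names, the toy data, and the one-lag choice are illustrative assumptions, not part of the question):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: T observations of a response y and 20 predictor variables.
T, p, n_pcs = 200, 20, 3
X = rng.standard_normal((T, p))          # predictor panel (assumed data)
y = X[:, :3] @ np.array([0.5, -0.3, 0.2]) + 0.1 * rng.standard_normal(T)

# Step 1: PCA on the centered predictors via SVD; keep the first 3 PCs.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
pcs = Xc @ Vt[:n_pcs].T                  # PC scores, shape (T, n_pcs)

# Step 2: stack the response and the PCs into Y_t and fit
# Y_t = phi_0 + Phi Y_{t-1} + alpha_t by least squares.
Y = np.column_stack([y, pcs])                    # (T, 1 + n_pcs)
Z = np.column_stack([np.ones(T - 1), Y[:-1]])    # intercept + one lag
B, *_ = np.linalg.lstsq(Z, Y[1:], rcond=None)    # (1 + k, k) coefficients

# One-step-ahead forecast of Y_{T+1}; its first entry forecasts the response.
y_hat = np.concatenate([[1.0], Y[-1]]) @ B
print(y_hat.shape)
```

A dedicated VAR routine (e.g. from statsmodels) would add lag selection and inference, but the least-squares fit above is the same estimator for a fixed lag order.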

Best Answer

The question has two aspects: terminology and validity of the model.

Regarding terminology, searching for PC-VAR I find Morana, "PC-VAR Estimation of Vector Autoregressive Models" (2012). In that paper (and using its notation), PC-VAR amounts to regressing the original variables $x_t$ on lags of some of their principal components $f_{t-j}$ (let us call it model 1) and then recovering the coefficients of a VAR for $x_t$ (let us call that model 2) from the coefficients of model 1. Meanwhile, your PC-VAR is something else. I am not sure how widely accepted Morana's PC-VAR terminology is, but in any case, when presenting your work it may be helpful to note that your approach is not the same as Morana's.

Regarding validity (in a nontechnical sense), I do not see a problem with doing an autoregression for $Y_t$ where $Y_t$ contains one original variable and several PCs of a large number of other original variables. (I would exclude the one original variable of interest when obtaining the PCs, so that they are obtained from all the other original variables. This would be a must if you were using all of the PCs in the model, as otherwise you would face perfect multicollinearity. It is not a must when you only include a few PCs, but it is nice as it yields a more straightforward interpretation.)
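The multicollinearity point can be checked numerically: if the variable of interest is included in the panel before PCA and *all* PCs enter the model, the design matrix is rank deficient, whereas computing the PCs from the other variables only avoids this. A small sketch (toy data and names are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
T, p = 100, 5
data = rng.standard_normal((T, p))   # column 0 is the variable of interest
C = data - data.mean(axis=0)

# All p PCs of the full panel span the same space as the panel's columns,
# so [variable of interest | all p PCs] is perfectly multicollinear.
_, _, Vt = np.linalg.svd(C, full_matrices=False)
all_pcs = C @ Vt.T
full_design = np.column_stack([C[:, 0], all_pcs])
rank_full = np.linalg.matrix_rank(full_design)    # p, not p + 1

# Excluding the variable of interest before the PCA avoids the problem.
C_others = C[:, 1:]
_, _, Vt2 = np.linalg.svd(C_others, full_matrices=False)
other_pcs = C_others @ Vt2.T
clean_design = np.column_stack([C[:, 0], other_pcs])
rank_clean = np.linalg.matrix_rank(clean_design)  # full column rank
print(rank_full, rank_clean)
```

With only a few PCs in the model the rank deficiency disappears either way, matching the answer's point that exclusion is then a matter of cleaner interpretation rather than necessity.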
