I am doing a functional principal component analysis (FPCA) on time series data. I have finished the FPCA on the training data and extracted the PCs. Next, I need to project the test data onto these PCs, but I am stuck on how to carry this out in R.
Here are my steps for handling the time series data with the fda package in R:
- Construct the basis functions with create.bspline.basis.
- Smooth the data with smooth.basis.
- Run FPCA with pca.fd on the training dataset.
Up to step 3, I have obtained the scores and varprop, but I have no idea how to transform the test dataset onto the same PCs as the training data.
Thanks for your help in advance.
Best Answer
Projecting new functional data using an existing FPCA is very similar to what we would do with standard PCA (for multivariate data). The main difference is that, due to the stochastic nature of our sampling procedure, we cannot obtain the corresponding score by standard numerical integration, the direct analogue of what we would do in PCA; instead we use a probabilistic approximation of it (PACE, see reference below).
For the rest of the post I will refer to $\phi$ as the functional PCs, $\xi$ as the associated FPC scores, $\lambda$ as their associated eigenvalues, $\mu$ as the sample mean and $C$ as the sample covariance. I also assume we are dealing with irregularly spaced data across a continuum $s$ and I refer to the test data at hand as $y_{test}$. In short, the prediction for the trajectory $y_i(s)$ using the first $K$ eigenfunctions is: $\hat{y}_i^K(s) = \hat{\mu}(s) + \sum_{k=1}^{K} \hat{\xi}_{i,k}\hat{\phi}_k(s)$.
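To make the contrast between the two ways of obtaining a score concrete, they can be written side by side. This is a sketch rather than a full derivation. The symbols $Y_i$ (the vector of observations of curve $i$), $\hat{\mu}_i$ and $\hat{\phi}_{i,k}$ (the mean and the $k$-th eigenfunction evaluated at the observation points $s_{i,j}$ of curve $i$), $\hat{\sigma}^2$ (the measurement-error variance) and $\hat{\Sigma}_{Y_i}$ are notation introduced here and not used elsewhere in this post. For densely observed curves the score is the usual inner product, approximated by numerical integration; PACE replaces it with a conditional expectation under a joint Gaussian assumption on the scores and the measurement errors:

$$
\begin{aligned}
\text{integration:}\quad \hat{\xi}_{i,k} &= \int \left( y_i(s) - \hat{\mu}(s) \right) \hat{\phi}_k(s)\, ds, \\
\text{PACE:}\quad \hat{\xi}_{i,k} &= \hat{\lambda}_k\, \hat{\phi}_{i,k}^{\top}\, \hat{\Sigma}_{Y_i}^{-1} \left( Y_i - \hat{\mu}_i \right), \qquad \left[ \hat{\Sigma}_{Y_i} \right]_{j,l} = \hat{C}(s_{i,j}, s_{i,l}) + \hat{\sigma}^2 \delta_{j,l}.
\end{aligned}
$$

For a test curve, the same expressions are evaluated with $\hat{\mu}$, $\hat{C}$, $\hat{\lambda}_k$ and $\hat{\phi}_k$ taken from the training fit and only $Y_i$ (or $y_i$) coming from $y_{test}$.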
In order to project new test data on the results of an existing FPCA we would require the following steps:
The package fdapace implements this methodology through the function predict.FPCA. The package fda (most probably) implements this methodology in the function project.basis, but I have not used it.
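For the workflow in the question (fda, curves observed on a common dense grid), a minimal sketch of the projection is given below. It does not use PACE: it assumes the data are dense enough that the plain inner product of the centred test curves with the training harmonics is adequate. The toy data, the grid, the basis size and the helper name project_onto_fpcs are all hypothetical and chosen only for illustration; the fda calls used are create.bspline.basis, smooth.basis, pca.fd, fd and inprod.

```r
library(fda)

set.seed(1)

# Hypothetical toy data: noisy curves on a common, dense grid (one column per curve).
tgrid <- seq(0, 1, length.out = 101)
make_curves <- function(n) {
  sapply(seq_len(n), function(i) {
    rnorm(1) * sin(2 * pi * tgrid) +
      rnorm(1, sd = 0.5) * cos(2 * pi * tgrid) +
      rnorm(length(tgrid), sd = 0.1)
  })
}
y_train <- make_curves(40)
y_test  <- make_curves(10)

# Steps 1-2: build ONE basis and smooth both data sets with it,
# so that train and test coefficients live in the same space.
basis    <- create.bspline.basis(rangeval = c(0, 1), nbasis = 15)
train_fd <- smooth.basis(tgrid, y_train, basis)$fd
test_fd  <- smooth.basis(tgrid, y_test,  basis)$fd

# Step 3: FPCA on the training curves only.
train_pca <- pca.fd(train_fd, nharm = 2, centerfns = TRUE)

# Hypothetical helper: centre new curves with the TRAINING mean, then take
# inner products with the training harmonics (integration-based scores).
project_onto_fpcs <- function(new_fd, pca) {
  centred_coefs <- sweep(new_fd$coefs, 1, as.vector(pca$meanfd$coefs))
  centred_fd    <- fd(centred_coefs, new_fd$basis)
  inprod(centred_fd, pca$harmonics)
}

test_scores <- project_onto_fpcs(test_fd, train_pca)  # one row per test curve
```

A quick sanity check is to pass train_fd through the same helper and compare the result with train_pca$scores; the two should agree up to numerical error. For sparse or irregularly sampled curves this shortcut is not appropriate, and the PACE route described above is the better fit.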