What does “performing PCA on a single time series” mean/do?

pca, time series

Yes, I already looked here, but that's too advanced for my humble mind (and it's not exactly what I'm looking for).

Imagine we have a timecourse with time on the x-axis and some value on the y-axis (e.g. a signal). Now I can sample this timecourse to obtain a vector in a multidimensional vector space.

My question: What does it mean if I perform a PCA on this data? What is the PCA of a single vector (such as the time series), and how can I interpret the resulting eigenvectors?

Best Answer

It isn't meaningful to run PCA on a univariate time series (or, more generally, a single vector). To run PCA on time series data, you'd need to have either a multivariate time series, or multiple univariate time series. There are ways to transform a univariate time series into a multivariate one (e.g. wavelet or time-frequency transforms, time delay embeddings, etc.). For example, the spectrogram of a univariate time series gives you the power at each frequency, for each moment in time.
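To make one of these transforms concrete, here is a minimal sketch of a time delay embedding, assuming NumPy. The helper name `delay_embed`, the parameters `dim` and `lag`, and the toy sine series are made up for illustration and are not part of the original answer. Each row of the result is a short window of the series, so the single series becomes a matrix with several columns that PCA could then be applied to.

```python
import numpy as np

def delay_embed(x, dim, lag=1):
    """Stack lagged copies of a 1-D series into an (n_rows, dim) matrix.

    Hypothetical helper: row i is [x[i], x[i + lag], ..., x[i + (dim - 1) * lag]].
    """
    x = np.asarray(x, dtype=float)
    n_rows = len(x) - (dim - 1) * lag
    if n_rows <= 0:
        raise ValueError("series too short for this dim/lag")
    # Column j holds the series shifted forward by j * lag samples.
    return np.column_stack([x[j * lag : j * lag + n_rows] for j in range(dim)])

# Toy univariate series (illustrative data only).
t = np.arange(200)
x = np.sin(2 * np.pi * t / 25)

X = delay_embed(x, dim=5, lag=2)
print(X.shape)  # (192, 5): 192 observations in a 5-dimensional space
```

The choice of `dim` and `lag` is a modeling decision; the values here are arbitrary.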

Say we have a multivariate time series with $p$ dimensions/variables. Or, we might have a set of $p$ univariate time series, where each time point has some common meaning across time series (e.g. time relative to some event). In both cases, there are $n$ time points. There are a couple of ways to run PCA:

  1. Consider each time point to be an observation. Dimensions correspond to variables of the multivariate time series, or to the different univariate time series. So, there are $n$ points in a $p$-dimensional space. In this case, eigenvectors correspond to instantaneous patterns across the dimensions/time series. At each moment in time, we represent the amplitude across dimensions/time series as a linear combination of these patterns.

  2. Consider each variable of the multivariate time series (or each univariate time series) to be an observation. Dimensions correspond to time points. So, there are $p$ points in an $n$-dimensional space. In this case, the eigenvectors correspond to temporal basis functions, and we're representing each time series as a linear combination of these basis functions.
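The two options differ only in which axis of the data matrix is treated as observations. The sketch below illustrates this with NumPy on made-up data; the helper `pca_directions`, the toy sizes, and the synthetic series are assumptions for illustration. It computes PCA via an SVD of the centered matrix, which is one common way to do it, not necessarily how any particular library does.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 4  # n time points, p series (toy sizes)

# Synthetic data: each series is a noisy mix of two temporal patterns.
t = np.linspace(0, 1, n)
basis = np.column_stack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])  # (n, 2)
mixing = rng.normal(size=(2, p))
X = basis @ mixing + 0.05 * rng.normal(size=(n, p))  # shape (n, p)

def pca_directions(data):
    """Hypothetical helper: rows of `data` are observations.

    Returns the principal directions (as rows) and the variance along each.
    """
    centered = data - data.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    return vt, s**2 / (data.shape[0] - 1)

# Option 1: time points are observations, so directions live in p-dim space
# (instantaneous patterns across the series).
patterns_across_series, var_1 = pca_directions(X)

# Option 2: series are observations, so directions live in n-dim space
# (temporal basis functions).
temporal_basis, var_2 = pca_directions(X.T)

print(patterns_across_series.shape)  # (4, 4)
print(temporal_basis.shape)          # (4, 100)
```

Note that in option 2 there are only $p$ observations, so after centering at most $p - 1$ components can have nonzero variance.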

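Given the above, it's apparent why PCA doesn't make sense for a single univariate time series. Either you have $n$ observations and 1 dimension (in which case there's nothing for PCA to do, since the only possible direction is the single axis itself), or you have a single observation with $n$ dimensions (in which case the problem is completely underdetermined: centering a single observation leaves the zero vector, so every direction captures the same, zero, variance and all solutions are equivalent).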
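Both degenerate cases can be checked numerically. This is a small sketch assuming NumPy and a made-up toy series; the variable names are illustrative only.

```python
import numpy as np

x = np.sin(np.linspace(0, 6, 50))  # toy series with n = 50 samples

# Case A: n observations, 1 dimension.
A = x.reshape(-1, 1)                # shape (50, 1)
A_centered = A - A.mean(axis=0)
_, s_a, vt_a = np.linalg.svd(A_centered, full_matrices=False)
# The only "principal direction" is the single axis: vt_a is [[1.]] or [[-1.]].
print(vt_a.shape)                   # (1, 1)

# Case B: 1 observation, n dimensions.
B = x.reshape(1, -1)                # shape (1, 50)
B_centered = B - B.mean(axis=0)     # subtracting the mean leaves all zeros
_, s_b, _ = np.linalg.svd(B_centered, full_matrices=False)
# The only singular value is zero, so no direction is preferred over another.
print(s_b)
```

In case A the result says nothing beyond the variance of the series; in case B the centered data carry no information at all.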