Solved – Features for time series classification

classification, feature selection, signal processing, time series

I consider the problem of (multiclass) classification based on time series of variable length $T$, that is, to find a function
$$f(X_T) = y \in [1..K]\\
\text{for } X_T = (x_1, \dots, x_T)\\
\text{with } x_t \in \mathbb{R}^d ~,$$
via a global representation of the time series by a set of selected features $v_i$ of fixed size $D$, independent of $T$,
$$\phi(X_T) = (v_1, \dots, v_D) \in \mathbb{R}^D~,$$
and then use standard classification methods on this feature set.
I'm not interested in forecasting, i.e. predicting $x_{T+1}$.
For example, we may analyse the way a person walks to predict the gender of the person.

What are the standard features that I may take into account?
For example, we can obviously use the mean and variance of the series (or higher-order moments), and also look into the frequency domain, e.g. the energy contained in some interval of the Discrete Fourier Transform (or Discrete Wavelet Transform) of the series.
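
A minimal sketch of such a feature map $\phi$, assuming NumPy and a series stored as a `(T, d)` array. The function name `band_energy_features` and the choice of `n_bands` equal-width frequency bands are illustrative assumptions, not part of the question:

```python
import numpy as np

def band_energy_features(X, n_bands=4):
    """Map a (T, d) series to a vector of length d * (2 + n_bands).

    Features per dimension: mean, variance, and the fraction of spectral
    energy falling in each of n_bands contiguous DFT frequency bands.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    var = X.var(axis=0)
    # Power spectrum of each centred dimension (real-input FFT along time).
    power = np.abs(np.fft.rfft(X - mean, axis=0)) ** 2
    # Split the frequency bins into n_bands contiguous groups and sum each.
    bands = np.array_split(power, n_bands, axis=0)
    energy = np.stack([b.sum(axis=0) for b in bands])  # shape (n_bands, d)
    total = energy.sum(axis=0)
    # Normalise to fractions; leave all-zero spectra (constant series) at 0.
    energy = energy / np.where(total > 0, total, 1.0)
    return np.concatenate([mean, var, energy.ravel()])
```

The output length depends only on $d$ and `n_bands`, so series of different lengths $T$ land in the same feature space.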

Best Answer

Simple statistical features

  • Means in each of the $d$ dimensions
  • Standard deviations of the $d$ dimensions
  • Skewness, kurtosis and higher-order moments of the $d$ dimensions
  • Maximum and Minimum values
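
The features in this list can be sketched as follows, assuming NumPy and a `(T, d)` array. The name `statistical_features`, the use of population (biased) moments, and the use of excess kurtosis are choices made for this illustration:

```python
import numpy as np

def statistical_features(X):
    """Per-dimension mean, std, skewness, excess kurtosis, max and min.

    X has shape (T, d); the result has length 6 * d regardless of T.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    # Standardise each dimension; guard against constant columns (std == 0).
    safe_std = np.where(std > 0, std, 1.0)
    Z = (X - mean) / safe_std
    skew = (Z ** 3).mean(axis=0)        # third standardised moment
    kurt = (Z ** 4).mean(axis=0) - 3.0  # fourth standardised moment minus 3
    return np.concatenate([mean, std, skew, kurt, X.max(axis=0), X.min(axis=0)])
```

Higher-order moments follow the same pattern, e.g. `(Z ** 5).mean(axis=0)`, although their estimates become noisy on short series.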

Time series analysis related features

  • The $d \times (d-1)$ cross-correlations between each pair of dimensions and the $d$ auto-correlations
  • Orders of the autoregressive (AR), integrated (I) and moving average (MA) part of an estimated ARIMA model
  • Parameters of the AR part
  • Parameters of the MA part
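
A partial sketch of these features, assuming NumPy and a `(T, d)` array. It computes auto- and cross-correlations at a single lag and estimates AR parameters from the Yule-Walker equations with a fixed order. Estimating the ARIMA orders or the MA parameters is not covered here and would need a dedicated time series library. All function names and the defaults `lag=1`, `ar_order=2` are assumptions of this example:

```python
import numpy as np

def lagged_correlation(a, b, lag):
    """Sample correlation between a[t] and b[t + lag], for lag >= 0."""
    a = np.asarray(a, dtype=float) - np.mean(a)
    b = np.asarray(b, dtype=float) - np.mean(b)
    n = len(a)
    denom = np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))
    return np.sum(a[: n - lag] * b[lag:]) / denom

def ar_coefficients(x, order):
    """AR(order) coefficients obtained by solving the Yule-Walker equations."""
    r = np.array([lagged_correlation(x, x, k) for k in range(order + 1)])
    # Toeplitz matrix of autocorrelations: R[i, j] = r[|i - j|].
    idx = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    return np.linalg.solve(r[idx], r[1:])

def correlation_features(X, lag=1, ar_order=2):
    """Lagged correlation matrix plus AR coefficients, length d*d + d*ar_order."""
    X = np.asarray(X, dtype=float)
    d = X.shape[1]
    # d x d matrix at the given lag: the diagonal holds the d auto-correlations,
    # the d * (d - 1) off-diagonal entries hold the cross-correlations.
    corr = np.array([[lagged_correlation(X[:, i], X[:, j], lag)
                      for j in range(d)] for i in range(d)])
    ar = np.concatenate([ar_coefficients(X[:, i], ar_order) for i in range(d)])
    return np.concatenate([corr.ravel(), ar])
```

Using several lags instead of one simply multiplies the number of correlation features; the vector length still does not depend on $T$.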

Frequency domain related features

See Morchen03 for a study of energy-preserving features on the DFT and DWT

  • Frequencies of the $k$ largest amplitude peaks in the DFTs of the detrended $d$ dimensions
  • $k$-quantiles of these DFTs
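
One possible reading of these two features, sketched with NumPy for a single dimension. Several points are assumptions of this example rather than something the answer specifies: detrending is a least-squares linear fit, "peaks" are taken as the `k` largest DFT bins rather than local maxima, and the quantiles are the frequencies at which the cumulative normalised amplitude reaches `j / n_quantiles`. The names `detrend_linear` and `spectral_features` are hypothetical:

```python
import numpy as np

def detrend_linear(x):
    """Remove a least-squares straight line from a 1-D series."""
    t = np.arange(len(x))
    slope, intercept = np.polyfit(t, x, 1)
    return x - (slope * t + intercept)

def spectral_features(x, k=3, n_quantiles=4, sample_rate=1.0):
    """Peak frequencies and spectral quantiles, length k + n_quantiles - 1.

    Assumes the detrended series is not identically zero.
    """
    x = detrend_linear(np.asarray(x, dtype=float))
    amp = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    # Frequencies of the k largest amplitudes, strongest first.
    top = np.argsort(amp)[::-1][:k]
    peak_freqs = freqs[top]
    # Treat the normalised amplitude spectrum as a distribution over
    # frequency and read off the frequencies at its quantile levels.
    cdf = np.cumsum(amp) / np.sum(amp)
    levels = np.arange(1, n_quantiles) / n_quantiles
    quantile_freqs = freqs[np.searchsorted(cdf, levels)]
    return np.concatenate([peak_freqs, quantile_freqs])
```

For a `(T, d)` array `X`, the per-dimension vectors can be concatenated, e.g. `np.concatenate([spectral_features(X[:, j]) for j in range(X.shape[1])])`.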