Solved – Prediction in support vector regression

libsvm, prediction, regression, svm, time series

Has anyone attempted prediction using support vector regression? I'm using LIBSVM, but I'm not sure how to use SVR for either univariate or multivariate time series.

Say we have stock prices for $N$ days. For training inputs, $y$ are the stock prices for $N$ days, but what will we use for $x$?

  1. The time index? That is, for one-step-ahead prediction, $1, 2, 3, \dots, Z$ for $Z$ days?
  2. (For one-step-ahead prediction) the $y$ values shifted by one day?

To explain more:

matlab> model = svmtrain(training_label_vector, 
                         training_instance_matrix [, 'libsvm_options']);

For univariate: I use the stock prices for $N$ days in training_label_vector as a column vector and want to predict, say, the next 30 days. Which data should I use in training_instance_matrix?

For multivariate: say I have 22 more features (prices of other goods). I use these features as column vectors in training_instance_matrix, but I'm not sure whether this is the correct approach.

Best Answer

A common approach is to construct an autoregressive model (the AR part of an ARMA model). The easiest way to do so is by windowizing the time series with a certain window length $N$: stock prices at times $k-N$ to $k-1$ are used to predict the stock price at time $k$. You can, of course, include additional inputs for prediction.


As an example, suppose we have the following univariate time series $s$:

$$s[1], \dots, s[9] = 1,\ 2,\ 3,\ 4,\ 5,\ 6,\ 7,\ 8,\ 9$$

Windowizing using $N=3$ yields: $$\begin{align} \Big[s[k-3],\ s[k-2],\ s[k-1]\Big] &\rightarrow s[k] \\ \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 3 & 4 & 5 \\ \vdots & \vdots & \vdots \\ 6 & 7 & 8 \end{bmatrix} &\rightarrow \begin{bmatrix} 4 \\ 5 \\ 6 \\ \vdots \\ 9 \end{bmatrix} \end{align}$$
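The windowing step can be sketched in a few lines of plain Python. This is only an illustration: the function name `make_windows` is hypothetical and not part of LIBSVM. The resulting `X` and `y` play the roles of `training_instance_matrix` and `training_label_vector` in the `svmtrain` call from the question.

```python
def make_windows(series, n):
    """Build lagged training data from a univariate series.

    Each row of X holds n consecutive values s[k-n], ..., s[k-1];
    the matching entry of y is the value s[k] that follows them.
    """
    if n < 1 or n >= len(series):
        raise ValueError("window length must satisfy 1 <= n < len(series)")
    X = [list(series[k - n:k]) for k in range(n, len(series))]
    y = [series[k] for k in range(n, len(series))]
    return X, y


# The series from the example above, with window length N = 3.
s = [1, 2, 3, 4, 5, 6, 7, 8, 9]
X, y = make_windows(s, 3)
for row, target in zip(X, y):
    print(row, "->", target)
```

A series of length $T$ yields $T - N$ training pairs, so the nine values above give six rows.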

In this example, for each discrete moment $k$, we obtain 3-dimensional $\mathbf{x}$ vectors to predict $y=s[k]$. The window length $N$ becomes a tuning parameter which must be optimized, for example using cross-validation.
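For the multivariate case, one possible reading of "additional inputs" is to append the lagged values of the other series to each window. The sketch below assumes that only past feature values (times $k-N$ to $k-1$) are available when predicting $s[k]$. The function name `make_multivariate_windows` and the toy data are hypothetical.

```python
def make_multivariate_windows(target, features, n):
    """Build lagged training data from a target series plus extra features.

    target   : list of numbers, one per time step
    features : list of lists; features[t] holds the extra inputs at time t
    n        : window length

    Each row of X is the n lagged target values followed by the n lagged
    feature vectors, flattened in time order; y[k] is the target at time k.
    """
    if len(target) != len(features):
        raise ValueError("target and features must have the same length")
    if n < 1 or n >= len(target):
        raise ValueError("window length must satisfy 1 <= n < len(target)")
    X, y = [], []
    for k in range(n, len(target)):
        row = list(target[k - n:k])      # lagged target values
        for step in features[k - n:k]:   # lagged extra features
            row.extend(step)
        X.append(row)
        y.append(target[k])
    return X, y


# Toy data: one target series and one extra feature per time step.
prices = [1, 2, 3, 4, 5]
extra = [[10], [20], [30], [40], [50]]
X_multi, y_multi = make_multivariate_windows(prices, extra, 2)
```

With $d$ extra features per time step, each row has $N(1+d)$ entries. For the 22 extra features in the question and $N=3$, that is 69 inputs per training example, so the choice of $N$ also controls the input dimension.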