MATLAB: How to use the new Support Vector Machine Regression model to simulate the response of new predictors

Tags: fitrsvm, MATLAB, matlab 2015b, predict, Statistics and Machine Learning Toolbox, support vector machine, support vector machine regression

I’m trying out the new SVM regression capabilities that came with R2015b, following the example from the documentation as closely as possible, but I can’t fully get it to work. I want to train an SVM regression model on historical data, then feed it new predictors and simulate the response of the target variable. What I have tried is:
% Fit an SVM regression model to the data in tbl, where all columns are predictors except 'Target', which is the response variable.
mdl = fitrsvm(tbl,'Target','KernelFunction','gaussian','KernelScale','auto','Standardize',true);
% Check that the model converged:
conv = mdl.ConvergenceInfo.Converged
% Use the trained model to predict the response of given predictor data:
YFit = predict(mdl, tbl);
So far everything works fine, and YFit matches the target data fairly well. However, creating this response is of course pointless, since I already have the target data for this data set. What I want to do is give the model new values for the predictors and simulate the response. But when I try that, using the same command with a table containing more predictor data points than the training case:
YFit = predict(mdl, NEWtbl); %(NEWtbl is a time extension of the original tbl)
The fit only works for the part of the table that was used during fitting; as soon as it reaches predictor values that the model hasn’t already seen, the prediction becomes a horizontal line.
Which commands am I supposed to use to predict the response of unseen data?
Thanks.

Best Answer

This likely means that some variables in the new data have values well outside their ranges in the training data. Think about what the Gaussian kernel means and what response you get from an SVM model when a test point is far away from all points in the training set.
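To spell out the hint above, here is the standard form of a Gaussian-kernel SVM regression prediction. The notation is an assumption for illustration (the $x_i$ are the support vectors, $\alpha_i - \alpha_i^*$ their dual coefficients, $b$ the bias, $s$ the kernel scale), and the exact scaling convention used by fitrsvm may differ:

```latex
f(x) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) \exp\!\left(-\frac{\lVert x - x_i \rVert^2}{s^2}\right) + b

% Every kernel term decays to zero as x moves away from all support vectors:
\min_i \lVert x - x_i \rVert \to \infty
\quad\Longrightarrow\quad
\exp\!\left(-\frac{\lVert x - x_i \rVert^2}{s^2}\right) \to 0 \ \text{for all } i
\quad\Longrightarrow\quad
f(x) \to b
```

So for a test point far from the training data the prediction collapses to the constant $b$, which would appear as a horizontal line. In the trained model the bias should be available as mdl.Bias, so a flat prediction close to that value would be consistent with this explanation.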
You would have more luck with other models, such as a linear SVM, in the sense that you wouldn't get a constant prediction for points outside the support of the training set. Even so, you'd have to be very careful interpreting such predictions. SVMs and similar models generally require that new data have the same support as the training data.
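As a rough way to check this on your own data, the sketch below flags new predictor values that fall outside the training range and compares a Gaussian-kernel fit with a linear-kernel fit. The data are synthetic and the variable names (Time, Season, mdlGauss, mdlLin, outside) are made up for illustration; it also assumes all predictors are numeric.

```matlab
% Synthetic stand-in for the question's data (hypothetical names and values).
rng(0);                                   % reproducible noise
t = (1:200)';
tbl = table(t, sin(t/10), 0.05*t + sin(t/10) + 0.1*randn(200,1), ...
    'VariableNames', {'Time','Season','Target'});

% "Time extension": the first 200 rows repeat the training predictors,
% rows 201-300 lie beyond the training range of Time.
tNew = (1:300)';
NEWtbl = table(tNew, sin(tNew/10), 'VariableNames', {'Time','Season'});

% Gaussian-kernel model as in the question, plus a linear-kernel alternative.
mdlGauss = fitrsvm(tbl, 'Target', 'KernelFunction', 'gaussian', ...
    'KernelScale', 'auto', 'Standardize', true);
mdlLin = fitrsvm(tbl, 'Target', 'KernelFunction', 'linear', ...
    'Standardize', true);

% Flag every new value that falls outside the training range of its predictor.
% bsxfun keeps this compatible with releases without implicit expansion.
predNames = mdlGauss.PredictorNames;      % predictor names in training order
Xtrain = tbl{:, predNames};               % assumes all predictors are numeric
Xnew = NEWtbl{:, predNames};
outside = bsxfun(@lt, Xnew, min(Xtrain, [], 1)) | ...
          bsxfun(@gt, Xnew, max(Xtrain, [], 1));
fprintf('%d of %d new rows extrapolate in at least one predictor\n', ...
    sum(any(outside, 2)), size(Xnew, 1));

% Compare the two models on the extended table.
yGauss = predict(mdlGauss, NEWtbl);
yLin = predict(mdlLin, NEWtbl);
plot(tNew, yGauss, tNew, yLin);
legend('gaussian kernel', 'linear kernel');
```

If many rows are flagged, the new table is asking the model to extrapolate. The linear-kernel curve will keep following a trend in that region, but as noted above, that does not make those predictions trustworthy.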