Solved – How to select hyperparameters for SVM regression after grid search

Tags: machine-learning, regression, scikit-learn, svm

I have a small data set of $150$ points, each with four features. I plan to fit an SVM regression because the $\varepsilon$ parameter lets me define a tolerance value, something that isn't possible with other regression techniques.
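For reference, the tolerance described above corresponds to the standard $\varepsilon$-insensitive loss used in SVM regression. Here $f(x)$ denotes the model prediction, a symbol introduced only for this illustration:

$$L_\varepsilon\bigl(y, f(x)\bigr) = \max\bigl(0,\; |y - f(x)| - \varepsilon\bigr)$$

Residuals smaller than $\varepsilon$ in absolute value incur no penalty, so $\varepsilon$ is expressed in the units of the target $y$.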

I have run a cross-validated grid search over $\gamma$ and $C$ at different values of $\varepsilon$. Different combinations of $\varepsilon$, $\gamma$, and $C$ give similar scores (see the grid search plots and the results below).

Question: What criterion can I use to improve the selection of hyperparameters and arrive at a sensible model for my data set?

Results:

**Epsilon = 0.06**
The best parameters are {'C': 48.939009184774889, 'gamma': 0.03562247890262444} with a score of 0.64 

**Epsilon = 0.09**
The best parameters are {'C': 48.939009184774889, 'gamma': 0.03562247890262444} with a score of 0.64 

**Epsilon = 0.11**
The best parameters are {'C': 48.939009184774889, 'gamma': 0.03562247890262444} with a score of 0.66 

**Epsilon = 0.14**
The best parameters are {'C': 48.939009184774889, 'gamma': 0.03562247890262444} with a score of 0.67 

**Epsilon = 0.17**
The best parameters are {'C': 48.939009184774889, 'gamma': 0.03562247890262444} with a score of 0.66 

**Epsilon = 0.19**
The best parameters are {'C': 48.939009184774889, 'gamma': 0.03562247890262444} with a score of 0.65 

**Epsilon = 0.22**
The best parameters are {'C': 48.939009184774889, 'gamma': 0.03562247890262444} with a score of 0.64 

**Epsilon = 0.25**
The best parameters are {'C': 14873.521072935118, 'gamma': 0.00072789538439831537} with a score of 0.65 

**Epsilon = 0.27**
The best parameters are {'C': 621.0169418915616, 'gamma': 0.0038566204211634724} with a score of 0.65 

**Epsilon = 0.30**
The best parameters are {'C': 4175.3189365604003, 'gamma': 0.0012689610031679235} with a score of 0.66

[Figure: grid search results]

Best Answer

I haven't fully understood the problem, so I am answering based on my understanding of the question.

Have you tried including epsilon in the param_grid dictionary passed to GridSearchCV?

It looks like you have only used C and gamma as the parameters in the param_grid dict.

If you add epsilon as well, I think the grid search would pick the best epsilon for you.

Example:

    import numpy as np
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVR

    # Small random data set, for illustration only
    n_samples, n_features = 10, 5
    np.random.seed(0)
    y = np.random.randn(n_samples)
    X = np.random.randn(n_samples, n_features)

    # Search over epsilon together with kernel, C and gamma
    parameters = {'kernel': ('linear', 'rbf', 'poly'), 'C': [1.5, 10], 'gamma': [1e-7, 1e-4], 'epsilon': [0.1, 0.2, 0.5, 0.3]}
    svr = SVR()
    clf = GridSearchCV(svr, parameters)
    clf.fit(X, y)
    clf.best_params_

output: {'C': 1.5, 'epsilon': 0.1, 'gamma': 1e-07, 'kernel': 'poly'}
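Since the scores in the question are all very close, it may also help to look at how much each score varies across folds, not just at the best mean. The sketch below is only an illustration under stated assumptions: the data is a synthetic stand-in generated with make_regression (not the asker's data), the parameter ranges are arbitrary, and "within one fold-to-fold standard deviation of the best" is just one possible heuristic for deciding which candidates are practically indistinguishable.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.svm import SVR

# Synthetic stand-in for a 150-point, four-feature data set (an assumption).
X, y = make_regression(n_samples=150, n_features=4, noise=10.0, random_state=0)
# Standardize the target so the epsilon values below are on a comparable scale.
y = (y - y.mean()) / y.std()

# Illustrative ranges only; epsilon is searched jointly with C and gamma.
param_grid = {
    "C": np.logspace(0, 3, 4),
    "gamma": np.logspace(-3, 0, 4),
    "epsilon": [0.05, 0.1, 0.2, 0.3],
}
cv = KFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(SVR(kernel="rbf"), param_grid, cv=cv)  # default score: R^2
search.fit(X, y)

results = search.cv_results_
mean = results["mean_test_score"]
std = results["std_test_score"]
best = search.best_index_

# Candidates whose mean score lies within one fold-to-fold standard
# deviation of the best candidate's mean score.
close = np.flatnonzero(mean >= mean[best] - std[best])
print(f"best: {results['params'][best]}  R^2 = {mean[best]:.3f} +/- {std[best]:.3f}")
print(f"{len(close)} of {len(mean)} candidates lie within one std of the best")
```

If many candidates fall inside that band, the differences between them are probably within cross-validation noise, and choosing among them on other grounds (for example, the epsilon that matches the tolerance you actually need) seems reasonable.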