Solved – Parameter Search for SVM on the whole data

cross-validation, machine learning, svm

I am trying to implement an SVM. I ran my parameter selection (grid search) on the whole data set and used the best values of C and gamma from that search to test on the testing data.
Sometimes the cross-validation accuracy I get is 100%. Is this method wrong? What is the correct way of doing it?

How should I go about my parameter selection? Should I split the data (100%) into 10% for the grid search and the remaining 90% for training and testing?

I need some guidance on this. Thanks.

Best Answer

Yes, you need to perform the grid search on cross-validated scores. This is well explained in the libsvm guide.

In the libsvm tools/ folder there is a grid.py script to help automate that. Alternatively, if you use scikit-learn, you should use the GridSearchCV tool, which does a similar job:

http://scikit-learn.org/dev/modules/grid_search.html
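
As a rough sketch of what this could look like with scikit-learn: hold out a test set first, run the cross-validated grid search on the training portion only, and touch the test set once at the end. The iris data set, the 75/25 split, the 5 folds, the feature scaling step, and the exponentially spaced grid for C and gamma are all assumptions chosen for illustration, not values from the question.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data set (assumption); replace with your own X and y.
X, y = load_iris(return_X_y=True)

# Hold out the test set BEFORE any parameter selection, so the search never sees it.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so it is refit on each cross-validation fold.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Illustrative exponentially spaced grid (assumption); widen or refine as needed.
param_grid = {
    "svc__C": [2.0**k for k in range(-5, 16, 4)],
    "svc__gamma": [2.0**k for k in range(-15, 4, 4)],
}

# 5-fold cross-validated grid search on the training portion only.
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

best_params = search.best_params_
cv_accuracy = search.best_score_              # mean CV accuracy of the best setting
test_accuracy = search.score(X_test, y_test)  # one final check on unseen data

print("best parameters:", best_params)
print("cross-validation accuracy:", cv_accuracy)
print("held-out test accuracy:", test_accuracy)
```

The cross-validation accuracy is only used to pick C and gamma; the held-out test accuracy is the number to report. If the search is run on the whole data, including the test samples, the reported accuracy tends to be optimistic, which is one way to end up with suspicious scores such as 100%.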