I am trying to run an SVM on an imbalanced dataset (class 0: 90%, class 1: 10%) using the e1071 package, with the radial kernel. I am using cross-validation to select the best gamma and cost. Additionally, I want to use class weights ("0"=1, "1"=10) for every model.
This is the code I am using (similar to the one used in ISLR, only with class weights) with 5 gamma values and 5 cost parameters. Instead of getting 25 models in the output, I am getting 5. The cost parameter is not getting accounted for:
The best model output is the following:
What is the best way to tune the parameters (gamma and cost), including the class weights?
This is my first time running an SVM. This code took more than 2 days to run. Where am I going wrong?
Best Answer
The call is ignoring the `cost` parameter because it isn't part of the list you passed to `ranges`. Your call should put both parameters in that list, i.e. `ranges = list(cost = ..., gamma = ...)` with your candidate values filled in. A similar example is shown in the documentation (`?tune`) with the `iris` dataset.
As for why it is taking so long, I don't know how large your dataset is (it may just take a while to process it all), but a `cost` of 1000 is really high. Increasing the cost parameter makes the model more computationally expensive to fit and also increases the risk of overfitting, so the model may generalize worse. I would start with a lower sequence of cost values and keep checking whether performance continues to improve as cost increases, making sure to evaluate the model on an independent test set.
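As a rough illustration of the points above, here is a minimal sketch of a `tune()` call that searches over both `cost` and `gamma` with class weights, then checks the chosen model on a held-out test set. The toy data, the variable names (`dat`, `train`, `test`, `tuned`) and the small 3 x 3 grid are all assumptions made for the example; they are not the asker's data or original grid.

```r
library(e1071)

set.seed(1)

# Hypothetical imbalanced toy data: 270 rows of class "0", 30 rows of class "1"
y   <- factor(rep(c("0", "1"), times = c(270, 30)), levels = c("0", "1"))
dat <- data.frame(
  x1 = rnorm(300, mean = ifelse(y == "1", 1.5, 0)),
  x2 = rnorm(300, mean = ifelse(y == "1", 1.5, 0)),
  y  = y
)

# Hold out an independent test set before any tuning
test_idx <- sample(nrow(dat), 100)
train    <- dat[-test_idx, ]
test     <- dat[test_idx, ]

# Both cost and gamma go inside `ranges`; class.weights is passed on to svm()
tuned <- tune(
  svm, y ~ ., data = train,
  kernel        = "radial",
  class.weights = c("0" = 1, "1" = 10),
  ranges        = list(cost = c(0.1, 1, 10), gamma = c(0.5, 1, 2)),
  tunecontrol   = tune.control(sampling = "cross", cross = 5)
)

# One row per (cost, gamma) combination: 3 x 3 = 9 here
nrow(tuned$performances)
tuned$best.parameters

# Evaluate the selected model on the held-out data
pred <- predict(tuned$best.model, newdata = test)
table(predicted = pred, actual = test$y)
```

With a 5 x 5 grid the same call would produce 25 rows in `tuned$performances`. Starting from a modest cost range like this one and widening it only if the cross-validation error is still falling at the upper end keeps the run time manageable.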