I have a highly imbalanced binary dependent variable (cases of '1' make up less than 5% of the data). I am trying to apply the SMOTE algorithm using the R package DMwR. In general, how do we choose parameters such as perc.over and perc.under, which control how much to oversample the minority class and undersample the majority class, respectively?
Solved – SMOTE algorithm: how to select the over- and under-sampling percentages
machine-learning, r, unbalanced-classes
Best Answer
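Write a loop that tries different values of the percentage (e.g. 100%, 200%, ... for perc.over) and see which one gives you the best accuracy or F-score. With fewer than 5% positives, the F-score is usually the more informative of the two, since a model that always predicts the majority class already exceeds 95% accuracy. For perc.under, you can use the majority-to-minority ratio multiplied by the initial oversampling percentage.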
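To make the search grid concrete, here is a minimal sketch of the arithmetic behind the two parameters. It is written in Python only to keep it self-contained, and it does not call DMwR. It assumes the meaning given in the DMwR documentation: perc.over = 200 generates two synthetic cases per original minority case, and perc.under = 200 selects two majority cases per synthetic case generated. The helper names `smote_counts` and `balancing_perc_under` and the 50-case minority class are hypothetical, chosen for illustration.

```python
def smote_counts(n_min, perc_over, perc_under):
    """Class sizes after SMOTE under the assumed perc.over / perc.under semantics."""
    synthetic = n_min * perc_over // 100          # new minority cases generated
    minority_total = n_min + synthetic            # originals are kept as well
    majority_kept = synthetic * perc_under // 100 # majority cases sampled per synthetic case
    return minority_total, majority_kept


def balancing_perc_under(perc_over):
    """perc.under that makes both classes the same size for a given perc.over."""
    return 100 * (100 + perc_over) / perc_over


n_min = 50  # hypothetical: 5% minority class in a 1000-row data set

# Enumerate candidate settings; in practice each one would be used to
# resample the training fold and then scored on untouched validation data.
grid = []
for perc_over in (100, 200, 300, 400, 500):
    for perc_under in (100, 150, 200):
        minority_total, majority_kept = smote_counts(n_min, perc_over, perc_under)
        share = minority_total / (minority_total + majority_kept)
        grid.append((perc_over, perc_under, minority_total, majority_kept, round(share, 3)))

for row in grid:
    print(row)
```

Under these assumptions, perc.over = 200 with perc.under = 150 yields 150 minority and 150 majority cases. When scoring each setting, apply SMOTE to the training data only, so that the accuracy or F-score is measured on the original class distribution.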