Solved – XGBoost implementation for unbalanced data using scale_pos_weight parameter

Tags: boosting, mathematical-statistics, unbalanced-classes

I am confused about how a cost-sensitive custom metric can be used when training on an unbalanced dataset (two classes, 0 and 1) in XGBoost.

Metric: Cost = 10 * (# of false positives) + 500 * (# of false negatives)
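
For concreteness, here is a minimal sketch (not part of the original question; the function name and the use of hard 0/1 predictions are illustrative assumptions) of that cost as a plain Python function:

    import numpy as np

    def misclassification_cost(y_true, y_pred):
        # Cost = 10 * (# false positives) + 500 * (# false negatives)
        y_true = np.asarray(y_true)
        y_pred = np.asarray(y_pred)
        fp = np.sum((y_pred == 1) & (y_true == 0))
        fn = np.sum((y_pred == 0) & (y_true == 1))
        return 10 * fp + 500 * fn

    # e.g. misclassification_cost([1, 1, 0, 0], [0, 1, 1, 0]) -> 500 + 10 = 510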

Can anyone help me understand how exactly the parameter 'scale_pos_weight' is used while training in XGBoost?

Following is my interpretation. Please correct me if I'm wrong.

objective function: binary:logistic

Case 1: when scale_pos_weight = 1 (the default)
In this case both classes 0 and 1 are treated equally, so while updating the model during training the gradient contributions are the same for both classes.

Case 2: when scale_pos_weight = 60
In this case the weight for class 1 is 60 times that for class 0, so while updating the model the gradient contributions from class 1 examples are correspondingly larger than those from class 0 examples.
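
As a rough illustration of case 2 (my own sketch, not from the question), this is how a positive-class weight w scales the gradient and hessian of the logistic loss, which is effectively what scale_pos_weight does inside binary:logistic:

    import numpy as np

    def weighted_logistic_grad_hess(margin, label, w=60.0):
        # margin: raw model output before the sigmoid; label: array of 0s and 1s
        p = 1.0 / (1.0 + np.exp(-margin))        # predicted probability
        weight = np.where(label == 1, w, 1.0)    # class-1 rows get w times the weight
        grad = weight * (p - label)              # first-order term used to grow trees
        hess = weight * p * (1.0 - p)            # second-order term
        return grad, hess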

Since eval_metric does not contribute to training, even if I use a class-sensitive cost as the evaluation metric it will not help me unless I also use the parameter 'scale_pos_weight'.
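
For reference, a small sketch (again not from the original post) of two common ways the value of scale_pos_weight could be chosen here: the sum(negative) / sum(positive) heuristic mentioned in the XGBoost documentation, or the ratio of the two misclassification costs; which is appropriate depends on the application:

    import numpy as np

    np.random.seed(0)
    y_train = np.random.binomial(1, 0.05, size=1000)   # toy labels, roughly 5% positives

    w_class_ratio = np.sum(y_train == 0) / np.sum(y_train == 1)   # class-imbalance heuristic
    w_cost_ratio = 500 / 10                                        # FN cost / FP cost = 50
    print(w_class_ratio, w_cost_ratio)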

Is my interpretation correct?

Best Answer

The eval_metric and eval_set parameters only control the early stopping behaviour, i.e. the number of trees grown.

You may find this helpful on how XGBoost handles weights.
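
To make that distinction concrete, here is a minimal sketch (an assumption on my part, written against the native xgboost >= 1.6 API rather than anything stated in the original answer) in which scale_pos_weight enters the training parameters and therefore the trees themselves, while the custom cost is passed only as an evaluation metric and so only decides when early stopping halts:

    import numpy as np
    import xgboost as xgb

    rng = np.random.default_rng(0)
    X = rng.random((1000, 5))
    y = (rng.random(1000) < 0.05).astype(int)          # toy data, roughly 5% positives
    X_tr, y_tr = X[:800], y[:800]
    X_va, y_va = X[800:], y[800:]

    def cost_metric(preds, dmat):
        # preds are predicted probabilities for binary:logistic; the 0.5 threshold is an assumption
        labels = dmat.get_label()
        pred = (preds > 0.5).astype(int)
        fp = np.sum((pred == 1) & (labels == 0))
        fn = np.sum((pred == 0) & (labels == 1))
        return "cost", float(10 * fp + 500 * fn)

    params = {
        "objective": "binary:logistic",
        "scale_pos_weight": 50,   # e.g. the 500/10 cost ratio; scales positive-class gradients
        "eta": 0.1,
    }
    booster = xgb.train(
        params,
        xgb.DMatrix(X_tr, label=y_tr),
        num_boost_round=500,
        evals=[(xgb.DMatrix(X_va, label=y_va), "valid")],
        custom_metric=cost_metric,      # only monitored to decide when to stop
        early_stopping_rounds=20,
        verbose_eval=False,
    )

In this setup, changing cost_metric only changes the stopping point (the number of trees kept), whereas changing scale_pos_weight changes the gradients used to grow every tree.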
