I'm using R to model votes. I've found that the extreme gradient boosting (xgbTree) algorithm gives nice results. I'm a newbie and don't really know how to interpret the model. My understanding is that xgbTree is caret's interface to the xgboost package, a fast implementation of gradient boosting.
If I use the caret package, I know there is a nice varImp()
function that shows me the relative importance of features.
I can also compute the MSE on the test set to judge the prediction results. Other than that, I have no idea how to interpret the model or its predictions, and I was hoping someone could help me understand this in a general sense.
library(caret)
# Fit with caret's default xgbTree tuning grid, then predict on the test set
m1 <- train(votes ~ ., data = trainset, method = "xgbTree")
p1 <- predict(m1, newdata = testset)
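For completeness, here is roughly how I look at importance and error so far (a minimal sketch; I'm assuming the outcome column in testset is also named votes):

# Relative importance of the predictors, scaled 0-100 by caret
varImp(m1)
plot(varImp(m1))

# Test-set mean squared error
mse <- mean((testset$votes - p1)^2)
mse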
I found the following information about the model in R, but I'm not sure how to interpret it from a high-level perspective.
> summary(m1)
            Length Class              Mode
handle          1  xgb.Booster.handle externalptr
raw         68304  -none-             raw
xNames         11  -none-             character
problemType     1  -none-             character
tuneValue       3  data.frame         list
obsLevels       1  -none-             logical
> m1$bestTune
  nrounds max_depth eta
8     100         3 0.3
> print(m1)
eXtreme Gradient Boosting
70001 samples
11 predictor
No pre-processing
Resampling: Bootstrapped (25 reps)
Summary of sample sizes: 70001, 70001, 70001, 70001, 70001, 70001, ...
Resampling results across tuning parameters:
  max_depth  nrounds  RMSE       Rsquared   RMSE SD      Rsquared SD
  1           50      1.0130456  0.7681330  0.010918047  0.007353824
  1          100      1.0038036  0.7723058  0.010863621  0.007558934
  1          150      0.9989381  0.7744915  0.010900582  0.007602353
  2           50      0.9851318  0.7806992  0.010356168  0.007234972
  2          100      0.9752206  0.7850671  0.010193392  0.007342830
  2          150      0.9711467  0.7868439  0.010170191  0.007324614
  3           50      0.9723670  0.7863109  0.010728093  0.007085609
  3          100      0.9676703  0.7883940  0.010288853  0.006705919
  3          150      0.9674846  0.7884925  0.009959612  0.006718457
Tuning parameter 'eta' was held constant at a value of 0.3
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 150, max_depth = 3 and eta = 0.3.
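For reference, I believe the grid above is just caret's default for xgbTree, with the best combination picked by bootstrap resampling. If I wanted to control the search myself, my understanding is it would look something like this (an untested sketch; newer caret versions also expect gamma, colsample_bytree, min_child_weight and subsample columns in the grid, so this may need adjusting):

grid <- expand.grid(nrounds   = c(100, 150, 200),
                    max_depth = c(2, 3, 4),
                    eta       = c(0.1, 0.3))
ctrl <- trainControl(method = "cv", number = 5)   # 5-fold CV instead of 25 bootstraps
m2   <- train(votes ~ ., data = trainset, method = "xgbTree",
              trControl = ctrl, tuneGrid = grid)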
What's the best way to analyze the prediction results besides MSE? Can we compute confidence intervals, for example?
Best Answer
Ch. 8 of "An Introduction to Statistical Learning with Applications in R" not only discusses gradient boosting but also includes actual code for using the gbm package.
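Beyond MSE, caret's postResample() reports RMSE and R-squared (and MAE in recent versions) on a held-out set, and simple residual plots show structure the model missed. Note that xgboost predictions are point estimates, so there is no built-in confidence interval; a rough prediction interval can be approximated by bootstrapping the training data and refitting. A minimal sketch (assuming the outcome column in testset is named votes):

# Test-set summary metrics
postResample(pred = p1, obs = testset$votes)

# Residual diagnostics: look for patterns the model failed to capture
res <- testset$votes - p1
plot(p1, res, xlab = "Predicted votes", ylab = "Residual")
abline(h = 0, lty = 2)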