Solved – How to evaluate the predicted values using Scikit-Learn

prediction, predictive-models, scikit-learn

I am using an AdaBoost classifier to predict values. How can I evaluate the accuracy of the prediction model? I'd like to see how accurate the predicted values are.

You can check an example here: http://scikit-learn.org/stable/modules/ensemble.html#usage

I found two options: using a confusion matrix

from sklearn.metrics import confusion_matrix
cm = confusion_matrix(expected, y_1)

or using cross_val_score

from sklearn.model_selection import cross_val_score
scores = cross_val_score(clf_1, X_train, y_train)
print(scores.mean())

There are also AdaBoostClassifier.staged_score(X, y) and AdaBoostClassifier.score(X, y).

So I am a little bit confused.

One last question: should I use predict() or predict_proba()?

Best Answer

To get the accuracy of the predictions, you can do:

from sklearn.metrics import accuracy_score
print(accuracy_score(expected, y_1))

If you want several metrics, such as precision, recall, and F1-score, you can get a classification report:

from sklearn.metrics import classification_report
print(classification_report(expected, y_1))

A confusion matrix shows, for each true label, how many samples were assigned to each predicted label. This tells you whether your classifier confuses some categories.
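To make the confusion matrix concrete, here is a small sketch. The label lists are made up for illustration and are not from the question. In scikit-learn's confusion_matrix, rows correspond to the true labels and columns to the predicted labels, in the order given by the labels argument.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical true and predicted labels, for illustration only.
expected = ["cat", "cat", "dog", "dog", "dog", "bird"]
predicted = ["cat", "dog", "dog", "dog", "cat", "bird"]

# Fix the label order so rows/columns are easy to read.
labels = ["bird", "cat", "dog"]
cm = confusion_matrix(expected, predicted, labels=labels)

# Row i = true label labels[i], column j = predicted label labels[j].
print(cm)
```

Off-diagonal entries are the mistakes: here one true "cat" was predicted as "dog" and one true "dog" was predicted as "cat", so the classifier confuses those two categories but not "bird".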

These metric functions are independent of the classification model you are using, so you can just as easily evaluate an SVM, for example.
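As a rough sketch of that model independence, the same accuracy_score call can be applied to the predictions of two different classifiers. The synthetic data from make_classification, the train/test split, and the variable names below are assumptions chosen for illustration, not part of the original question.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic data, used only so the example is self-contained.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

results = {}
for name, clf in [("AdaBoost", AdaBoostClassifier(random_state=0)), ("SVM", SVC())]:
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    # The metric only needs true and predicted labels, not the model itself.
    results[name] = accuracy_score(y_test, y_pred)
    print(name, results[name])
```

For classifiers, clf.score(X_test, y_test) returns the same mean accuracy, so it is a shortcut for predicting and then calling accuracy_score.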

You should use predict(), since it returns the predicted labels of the samples. predict_proba() returns the probability of a sample belonging to each category.
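The difference between the two methods can be seen in a short sketch. The synthetic dataset and variable names are assumptions for illustration. predict() returns one label per sample, while predict_proba() returns one row per sample with one column per class, in the order of clf.classes_.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Synthetic two-class data, used only so the example is self-contained.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
clf = AdaBoostClassifier(random_state=0).fit(X, y)

labels = clf.predict(X[:5])        # shape (5,): one class label per sample
proba = clf.predict_proba(X[:5])   # shape (5, 2): one probability per class

print(clf.classes_)
print(labels)
print(proba)
```

Metrics such as accuracy_score, classification_report, and confusion_matrix expect labels, which is why predict() is the one to use with them.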

I recommend reading a few of the scikit-learn documentation pages on these metrics.
