The ordering of the classes in the value parameter should be deterministic and independent of the ordering of the samples in the training set. Either the unique string name or integer identifier for each class is stored in the classes_ attribute of the DecisionTreeClassifier instance after a call to the fit method.
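A minimal sketch of this behavior (the toy data is illustrative): after fit, classes_ holds the sorted unique labels, regardless of the order they appear in y.

```python
from sklearn.tree import DecisionTreeClassifier

# Toy training set with string labels (illustrative data)
X = [[0, 0], [1, 1], [2, 2], [3, 3]]
y = ["dog", "cat", "dog", "cat"]

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, y)

# classes_ is sorted and independent of the order labels appear in y
print(clf.classes_)  # ['cat' 'dog']
```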
To get the accuracy of the prediction you can do:
print(accuracy_score(expected, y_1))
If you want a few more metrics, such as precision, recall, and f1-score, you can get a classification report:
print(classification_report(expected, y_1))
A confusion matrix shows, for each true label, how many samples were assigned to each predicted label. This will tell you whether your classifier confuses certain categories.
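For example, with sklearn.metrics.confusion_matrix (the label values are illustrative):

```python
from sklearn.metrics import confusion_matrix

# expected: true labels, y_1: predicted labels (illustrative values)
expected = [0, 0, 1, 1, 2, 2]
y_1 = [0, 1, 1, 1, 2, 0]

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(expected, y_1)
print(cm)
```

Off-diagonal counts reveal which categories get confused, e.g. the 1 in row 0, column 1 means one true-0 sample was predicted as class 1.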
The functions to get these metrics are independent of the classification model you are using (so you can easily test an SVM, for example).
You should use predict(), since this will give the labels of the classified samples. predict_proba will give the probability of a sample belonging to each category.
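The difference between the two (a minimal sketch on made-up, perfectly separable data, so the tree's leaf probabilities come out as 0 or 1):

```python
from sklearn.tree import DecisionTreeClassifier

# Illustrative, perfectly separable data
X = [[0], [1], [2], [3]]
y = [0, 0, 1, 1]

clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# predict() returns one label per sample
labels = clf.predict([[0], [3]])
print(labels)  # [0 1]

# predict_proba() returns per-class probabilities;
# columns are ordered as in clf.classes_
probs = clf.predict_proba([[0], [3]])
print(probs)  # [[1. 0.] [0. 1.]]
```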
I recommend reading a few of the documentation pages:
Here is an example of the effect of varying eps and tol on LassoCV's MSE (using the diabetes dataset), for various $\alpha$'s. Note that this is the average MSE (each CV run will have a different MSE):

[plot: average MSE vs. $\alpha$ for different eps and tol settings]

It appears that eps has a significant impact for some penalty parameters, but with a large enough penalty it doesn't matter. tol doesn't seem to play a large role (at least as far as scikit-learn has implemented LassoCV). See below for code.
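The comparison above can be sketched along these lines (a minimal sketch, not the author's original code; the eps and tol values are illustrative):

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LassoCV

X, y = load_diabetes(return_X_y=True)

# eps controls the ratio alpha_min / alpha_max of the regularization path;
# tol is the stopping tolerance of the coordinate-descent solver.
for eps in (1e-3, 1e-1):
    model = LassoCV(eps=eps, tol=1e-4, cv=5, random_state=0).fit(X, y)
    # mse_path_ holds the per-fold MSE for every alpha on the path;
    # averaging over folds gives the curve plotted above
    avg_mse = model.mse_path_.mean(axis=1)
    print(f"eps={eps}: chosen alpha={model.alpha_:.4f}, "
          f"min avg MSE={avg_mse.min():.1f}")
```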