Solved – Summary of correctly classified instances in WEKA for a 10-fold cross-validation

cross-validation, weka

I ran a 10-fold cross-validation with BayesNet (but it could be any method in WEKA), and the output I got was (among other things):

...
=== Stratified cross-validation ===
=== Summary ===

Correctly Classified Instances         142               88.75   %
Incorrectly Classified Instances        18               11.25   %
Kappa statistic                          0.8594
K&B Relative Info Score              13968.9989 %
K&B Information Score                  324.3318 bits      2.0271 bits/instance
Class complexity | order 0             371.9209 bits      2.3245 bits/instance
Class complexity | scheme            13057.6688 bits     81.6104 bits/instance
Complexity improvement     (Sf)     -12685.748  bits    -79.2859 bits/instance
Mean absolute error                      0.045 
Root mean squared error                  0.212 
Relative absolute error                 14.0471 %
Root relative squared error             52.9716 %
Total Number of Instances              160     
...

My question is about the Correctly Classified Instances. In this case I got 142 out of 160, but on which run? To my knowledge, a 10-fold cross-validation runs several times. Which run does this statistic refer to?

The best?

The last?

For all of them?

And if it were an average over all runs, don't you think it's a little bit convenient that the number is a nice integer? (I ran a lot of classifications and it's always an integer.)

Best Answer

Cross-validation "runs several times", but it predicts each case only once.

In your example of 10-fold cross-validation on 160 cases, each of the 10 runs (folds) leaves out 10% of the cases (let's say cases #1-16) to be tested on while training on the remaining 90% (cases #17-160). The trained model is tested on the 16 cases in the hold-out sample, and then the process is repeated on a new hold-out sample (e.g. cases #17-32). This process is repeated until each case has been predicted exactly once. The reported count is therefore the total over all 10 folds (142 of the 160 hold-out predictions were correct), not an average or the result of a single run, which is why it is always an integer.
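The fold procedure described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not WEKA's implementation: the helper names (`k_fold_indices`, `pooled_correct_count`, `majority_class`) are hypothetical, the toy classifier simply predicts the majority class of the training fold, the labels are made up, and the split is a plain shuffle rather than WEKA's stratified one. The point is only that every case lands in exactly one hold-out fold, so the correct predictions are counted once each and summed.

```python
import random

def k_fold_indices(n_cases, k, seed=0):
    """Split case indices 0..n_cases-1 into k disjoint hold-out folds."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]

def pooled_correct_count(labels, k, train_and_predict, seed=0):
    """Run k-fold cross-validation and sum the correct hold-out predictions."""
    folds = k_fold_indices(len(labels), k, seed)
    times_predicted = [0] * len(labels)
    correct = 0
    for held_out in folds:
        held_out_set = set(held_out)
        # Train on everything that is NOT in the current hold-out fold.
        train = [i for i in range(len(labels)) if i not in held_out_set]
        predictions = train_and_predict(train, held_out)
        for i, pred in zip(held_out, predictions):
            times_predicted[i] += 1
            correct += (pred == labels[i])
    return correct, times_predicted

def majority_class(labels):
    """Hypothetical stand-in classifier: always predict the training majority."""
    def train_and_predict(train, held_out):
        train_labels = [labels[i] for i in train]
        majority = max(sorted(set(train_labels)), key=train_labels.count)
        return [majority] * len(held_out)
    return train_and_predict

# Made-up data: 160 cases, 100 of class "a" and 60 of class "b".
labels = ["a"] * 100 + ["b"] * 60
correct, times_predicted = pooled_correct_count(labels, 10, majority_class(labels))
print(f"Correctly classified: {correct} / {len(labels)} "
      f"({100 * correct / len(labels):.2f} %)")
```

Because each of the 160 cases is predicted exactly once, `correct` is a plain count over all folds, and the percentage is that count divided by 160. The same arithmetic matches the output in the question: 142 / 160 = 88.75 %.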

The idea is to never use the same case for both the training and testing phase, which can help with problems associated with over-fitting.