Solved – Understanding factors returned by RPart classification

r, rpart

This is a very basic question on using R for classification. I'm trying to use rpart for a classification task and would like to get a class label as the result, so I pass type="class" to the predict method:

predict(tree, data, type = "class")

The class labels are "0" and "1". When I try to classify a single observation I get something like this:

1 
0 
Levels: 0 1

As I understand it, this is a factor with 2 levels: 0 and 1. However, why are there 2 numbers reported, and which class exactly is assigned to this observation?

Best Answer

You didn't provide a dataset to reproduce this, but a basic illustration shows that type = "class" returns exactly one class per observation. In your case, 1 is most likely the name (row index) of the observation, 0 is the predicted class, and the Levels line lists the two possible factor levels {0, 1}.

   > predict(fit, type = "class")  # factor
          1       2       3       4       5       6       7       8       9      10 
    present  absent present present  absent  absent  absent  absent  absent present 
         11      12      13      14      15      16      17      18      19      20 
    present  absent present  absent  absent  absent  absent  absent  absent  absent 
     21      22      23      24      25      26      27      28      29      30 

Levels: absent present
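The two-number printout from the question can be reproduced without fitting any model, which makes it easier to see which number is which. The following is a minimal sketch in base R; the object `one` is hypothetical and built by hand to mimic a single prediction, assuming the observation's row name is "1".

```r
# Hand-built stand-in for a single prediction (hypothetical, not rpart output)
one <- factor("0", levels = c("0", "1"))
names(one) <- "1"      # the row name of the observation

print(one)             # the name is printed above the predicted class

obs_name  <- names(one)          # observation name
pred_lab  <- as.character(one)   # predicted class label as a string
pred_bare <- unname(one)         # same factor with the name removed
```

If only the label is needed, `as.character()` or `unname()` on the result of `predict()` should remove the extra line from the printout.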

You can also store the result in a variable and inspect it with str() to confirm what the object contains. The output shows the factor's levels, the integer codes of the predictions (1 = absent, 2 = present), and a names attribute holding the observation indices.

   > p <- predict(fit, type = "class")
   > str(p)
    Factor w/ 2 levels "absent","present": 2 1 2 2 1 1 1 1 1 2 ...
    - attr(*, "names")= chr [1:81] "1" "2" "3" "4" ...
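The integers that str() displays are internal level codes, not the labels themselves. This matters when the labels are "0" and "1", as in the question, because the codes are then 1 and 2. A small sketch, assuming a hand-built factor `p01` with hypothetical values standing in for the predictions:

```r
# Hypothetical predictions with labels "0" and "1" (built by hand for illustration)
p01 <- factor(c("1", "0", "1"), levels = c("0", "1"))
names(p01) <- c("1", "2", "3")   # observation names

codes     <- as.integer(p01)                 # internal level codes: 1 for "0", 2 for "1"
label_chr <- as.character(p01)               # class labels as strings
label_num <- as.numeric(as.character(p01))   # class labels converted to numbers

print(codes)
print(label_chr)
print(label_num)
```

Converting through `as.character()` first is the usual way to recover numeric labels from a factor; calling `as.numeric()` directly on the factor would return the level codes instead.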