I have created random forest for the unsupervised case.
g = randomForest(iris[,-5],keep.forest=TRUE)
Now I need to know the class probabilities for each entry (with respect to iris$Species). In case of a supervised case, then I would use this code:
p = predict(g, iris, type = "prob")
However, for the unsupervised case it says:
Can't predict unsupervised forest.
So, how can I extract the class probabilities?
Best Answer
In unsupervised case randomForest produces a proximity matrix that you can use for clustering.
MDS stands for Multi-dimensional Scaling.
Of course the clusters won't one-on-one map to original classes (that's why I deliberately didn't remap clusters - so it's not a confusion matrix:
Two dimensional MDS plot:
Then you can use your clusters as classes to train a supervised model:
For the sake of our example and because Iris dataset is so short, we generate a simulated Iris dataset:
Now we can predict on the new dataset and inspect how it is in agreement with the simulated dataset's species class:
To predict probabilities: