Each point on the partial dependence plot is the average vote percentage in favor of the "Yes trees" class across all observations, given a fixed level of TRI.
It is not a probability of correct classification, and it has nothing to do with accuracy, true negatives, or true positives.
The phrase

Values greater than TRI 30 begin to have a positive influence for classification in your model

is a puffed-up way of saying
Values greater than TRI 30 begin to predict "Yes trees" more strongly than values lower than TRI 30
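As a concrete illustration, here is a minimal sketch of computing one such point by hand; rf (a fitted randomForest classifier) and dat (a data frame of the observations, containing a TRI column) are hypothetical stand-ins for the original model and data:

library(randomForest)

# One partial-dependence point: the average predicted vote fraction for
# "Yes trees" with TRI fixed at a given value for every observation
pd_point <- function(tri_value, rf, dat) {
  tmp <- dat
  tmp$TRI <- tri_value                      # fix TRI for all observations
  votes <- predict(rf, tmp, type = "prob")  # per-observation vote fractions
  mean(votes[, "Yes trees"])                # average across observations
}

tri_grid <- seq(min(dat$TRI), max(dat$TRI), length.out = 50)
pd <- vapply(tri_grid, pd_point, numeric(1), rf = rf, dat = dat)
plot(tri_grid, pd, type = "l", xlab = "TRI",
     ylab = "Mean vote fraction for 'Yes trees'")

(randomForest's own partialPlot() does the same averaging but, for classification, reports it on a centered log-probability scale; the idea is the same.)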
Thanks to Wilco Emons for the following solution to the problem:
In polr, the cumulative link model is parameterized a bit differently than in the Agresti book referred to: the linear predictor is subtracted from the thresholds rather than added, i.e. logit P(Y <= k) = zeta_k - x'beta. The problem can be solved by changing the code line:
probLALR[,k] <- inv.logit(b[k] + a[1]*0 + a[2]*0 + a[3]*Pred + a[4]*0*0 +
                          a[5]*Pred*0 + a[6]*Pred*0)
into
probLALR[,k] <- inv.logit(b[k] - (a[1]*0 + a[2]*0 + a[3]*Pred + a[4]*0*0 +
                                  a[5]*Pred*0 + a[6]*Pred*0))
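To see this sign convention concretely, here is a small check on the housing data that ships with MASS (a stand-in for the original model m2). It rebuilds the fitted category probabilities from the thresholds zeta and the linear predictor lp, using plogis(), base R's equivalent of inv.logit():

library(MASS)

# Toy proportional-odds fit with 3 outcome categories: Low, Medium, High
m <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)

# polr parameterizes the model as logit P(Y <= k) = zeta_k - eta,
# i.e. the linear predictor is subtracted, not added
cum1 <- plogis(m$zeta[1] - m$lp)   # P(Y <= Low)
cum2 <- plogis(m$zeta[2] - m$lp)   # P(Y <= Medium)
probs <- cbind(cum1, cum2 - cum1, 1 - cum2)

all.equal(unname(probs), unname(m$fitted.values))  # TRUE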
Thanks also to Achim Zeileis for his suggestion to use predict(m2, type="prob")!
Below is a way to calculate the probabilities by means of the predict() function:
# Grid of (standardized) predictor values at which to evaluate the model
Pred <- seq(-3, 3, by = 0.01)

# New data: hold the factors f.adm and f.riv at level "0" and vary RIV.st
PRED.LALR <- data.frame(f.adm  = factor(rep(0, length(Pred))),
                        f.riv  = factor(rep(0, length(Pred))),
                        RIV.st = Pred)

# Fitted probabilities for the six outcome categories from the polr model m2
prob.LALR <- predict(m2, PRED.LALR, type = "prob")

# Expected category score: weight each category's probability by its rank
scoreLALR <- prob.LALR[, 1]*1 + prob.LALR[, 2]*2 + prob.LALR[, 3]*3 +
             prob.LALR[, 4]*4 + prob.LALR[, 5]*5 + prob.LALR[, 6]*6

plot(Pred, scoreLALR, col = "green", ylim = c(1, 6))
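The weighted sum above is just the expected category score under the fitted probabilities, so the same quantity can be written as a single matrix product:

# Equivalent to the sum above: E[score] = sum over k of k * P(Y = k)
scoreLALR <- as.vector(prob.LALR %*% (1:6))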
Best Answer
Partial plots don't have to point in the same direction as the univariate relationship in the data; in fact, this is what makes them useful. Partial plots show you the marginal effect of just this variable. It is likely that there are predictors in your dataset heavily correlated with Education=10th and Education=Doctorate that already account for the univariate effect. Once that effect is controlled for, Education=Doctorate really does reduce your propensity toward whatever your outcome is.

Here's a contrived example. Imagine we're trying to predict drinks_coffee from education and a likes_coffee indicator (a sketch of such data follows below). Univariately, education=Doctorate seems to imply a greater propensity to drink coffee. However, if we include likes_coffee in a model, the effect of having education=Doctorate actually decreases your propensity to drink coffee. likes_coffee soaks up the overwhelming majority of the signal, but it's only possible to like coffee and not drink it if you have a Doctorate.
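Here is a minimal simulation of that pattern, using a logistic regression as a stand-in for the partial-dependence comparison; the probabilities below are invented purely to reproduce the described effect and are not the original example's table:

# Contrived data: invented probabilities, chosen only so that the
# univariate and controlled effects of education point in opposite ways
set.seed(1)
n <- 1000
education <- factor(sample(c("10th", "Doctorate"), n, replace = TRUE))
# Doctorates are much more likely to like coffee...
likes_coffee <- rbinom(n, 1, ifelse(education == "Doctorate", 0.9, 0.4))
# ...but among coffee-likers, Doctorates drink slightly less often,
# and almost nobody drinks coffee without liking it
drinks_coffee <- rbinom(n, 1,
                        ifelse(likes_coffee == 1,
                               ifelse(education == "Doctorate", 0.70, 0.95),
                               0.05))

# Univariately, Doctorate looks like it increases coffee drinking:
coef(glm(drinks_coffee ~ education, family = binomial))["educationDoctorate"]
# Controlling for likes_coffee, the Doctorate effect flips negative:
coef(glm(drinks_coffee ~ education + likes_coffee,
         family = binomial))["educationDoctorate"]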
Does education come high in relative influence? Are there other big predictors that could be explaining the massive univariate difference? Of course, it's always possible your model has a bug in it.