Getting to predicted values using cv.glmnet

glmnet, prediction, r

I'm a little confused by the predict function with a cv.glmnet object.

I'm running these two lines:

cvFit <- cv.glmnet(x = as.matrix(imputedTrainingData[,2:33]), y = imputedTrainingData[,1], family = "binomial", type.measure = "class" )

response<-predict(cvFit, as.matrix(imputedTestData[,2:33]), s= "lambda.min")

The y variable is a 2-level factor.

Why does the predict statement give a numeric vector and not the predicted class?
I thought for a moment that perhaps it gives the probability of being in one class or the other, but the max value of response is just above .35 in my data and the min is -.42.

Thanks!

Best Answer

Note that calling predict on a cv.glmnet object, as you did, uses the predict.cv.glmnet method. The help for this function is a bit counterintuitive, but any extra arguments you supply are passed via the ... argument to the predict.glmnet method, which does the actual predictions.

Hence you probably want

response <- predict(cvFit, as.matrix(imputedTestData[,2:33]),
                    s = "lambda.min",
                    type = "class")

where type = "class" has the following meaning:

  Type ‘"class"’ applies only to
  ‘"binomial"’ or ‘"multinomial"’ models, and produces the
  class label corresponding to the maximum probability.

(from ?predict.glmnet)

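What you were seeing was the predicted values on the scale of the linear predictor (type = "link", the default), i.e. the log-odds before the inverse of the logit function has been applied to yield the probability of the second level of the factor. That is why some of the values are negative. If you want the probabilities themselves, use type = "response". Defaulting to the link scale is fairly typical in R, and just as typically this behaviour can be controlled via a type argument.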
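To make the relationship between the three scales concrete, here is a minimal sketch on simulated data. The data, the variable names and the "no"/"yes" labels are made up for illustration and stand in for imputedTrainingData; only the cv.glmnet and predict calls mirror the question.

```r
library(glmnet)

set.seed(1)
# Hypothetical stand-in data: 200 rows, 5 predictors, 2-level factor outcome
n <- 200
x <- matrix(rnorm(n * 5), nrow = n)
y <- factor(ifelse(x[, 1] - x[, 2] + rnorm(n) > 0, "yes", "no"))

cvFit <- cv.glmnet(x, y, family = "binomial", type.measure = "class")

# Default (type = "link"): log-odds, which can be negative
link <- predict(cvFit, newx = x, s = "lambda.min")
# type = "response": fitted probability of the second factor level ("yes")
prob <- predict(cvFit, newx = x, s = "lambda.min", type = "response")
# type = "class": predicted class labels
cls <- predict(cvFit, newx = x, s = "lambda.min", type = "class")

# Applying the inverse logit to the link values should reproduce the probabilities
all.equal(as.numeric(plogis(link)), as.numeric(prob))

# The class label should be the second level whenever the probability exceeds 0.5
manual <- ifelse(prob > 0.5, levels(y)[2], levels(y)[1])
all(manual == cls)
```

Both checks should come back TRUE, which is a quick way to confirm which scale a given predict call is returning.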
