Solved – Why glmnet gives different coefficient estimates with same data

I have been trying to fit a lasso model using cv.glmnet. I implemented four different models (three using cv.glmnet and one using caret::train) that differ in how the predictors are standardized. All four models give very different coefficient estimates, and I can't figure out why.

Here is fully reproducible code:

library(glmnet) # cv.glmnet
library(caret)  # train, trainControl

data(iris)
dat <- iris[iris$Species %in% c("setosa","versicolor"),]
X <- as.matrix(dat[,1:4])
Y <- as.factor(as.character(dat$Species))

model1 <- cv.glmnet(x = X,
                    y = Y,
                    family = "binomial",
                    standardize = FALSE,
                    alpha = 1,
                    lambda = rev(seq(0,1,length=100)),
                    nfolds = 3)

model2 <- cv.glmnet(x = scale(X, center = TRUE, scale = TRUE),
                    y = Y,
                    family = "binomial",
                    standardize = FALSE,
                    alpha = 1,
                    lambda = rev(seq(0,1,length=100)),
                    nfolds = 3)

model3 <- cv.glmnet(x = X,
                    y = Y,
                    family = "binomial",
                    standardize = TRUE,
                    alpha = 1,
                    lambda = rev(seq(0,1,length=100)),
                    nfolds = 3)

##Using caret

lambda.grid <- rev(seq(0,1,length=100)) #set of lambda values for cross-validation
alpha.grid <- 1 #alpha
trainControl <- trainControl(method ="cv",
                             number=3) #3-fold cross-validation
tuneGrid <- expand.grid(.alpha=alpha.grid, .lambda=lambda.grid) #these are tuning parameters to be passed into the train function below

model4 <- train(x = X,
                y = Y,
                method = "glmnet", # required: the tuning grid and lambdaOpt below are glmnet-specific
                standardize = FALSE,
                trControl = trainControl,
                tuneGrid = tuneGrid)

c1 <- coef(model1, s=model1$lambda.min)
c2 <- coef(model2, s=model2$lambda.min)
c3 <- coef(model3, s=model3$lambda.min)
c4 <- coef(model4$finalModel, s=model4$finalModel$lambdaOpt)
c1 <- as.matrix(c1)
c2 <- as.matrix(c2)
c3 <- as.matrix(c3)
c4 <- as.matrix(c4)

model2 scales the independent variables (matrix X) beforehand, while model3 does so by setting standardize = TRUE. So at least these two models should return identical results, but they do not.

The lambda.min values obtained from the four models are:
model1 = 0
model2 = 0
model3 = 0
model4 = 0.6565657

The coefficient estimates also differ drastically between the models. Why would this be occurring?

Best Answer

You might find this link helpful.


  • c1 are coefficients fitted to unstandardized data (in the original units of the data)
  • c2 are coefficients fitted to data you standardized yourself (in z-score units)
  • c3 are coefficients fitted to data that glmnet standardized internally (reported back in the original units of the data)
  • c4 is because of this
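
To make the difference between z-score units and original units concrete, here is a minimal sketch in base R of how coefficients on standardized predictors map back to the original scale. The coefficient values b0_std and b_std are made up for illustration (they are not fitted by glmnet), and the names mu, sdev and Z are my own. The point is only that both parameterizations give the same linear predictor once the conversion is applied.

```r
# Same subset of iris as in the question
dat <- iris[iris$Species %in% c("setosa", "versicolor"), ]
X <- as.matrix(dat[, 1:4])

mu   <- colMeans(X)
sdev <- apply(X, 2, sd)                  # n-1 standard deviation, as used by scale()
Z    <- scale(X, center = mu, scale = sdev)

# Hypothetical coefficients on the z-score scale (illustrative values only)
b0_std <- 0.5
b_std  <- c(Sepal.Length = 0.2, Sepal.Width = -1.1,
            Petal.Length = 2.3, Petal.Width = 1.7)

# Convert to the original units of the predictors
b_orig  <- b_std / sdev
b0_orig <- b0_std - sum(b_std * mu / sdev)

# The linear predictor is identical under both parameterizations
eta_std  <- drop(b0_std + Z %*% b_std)
eta_orig <- drop(b0_orig + X %*% b_orig)
all.equal(eta_std, eta_orig)
```

Converting c2 this way will not necessarily reproduce c3 exactly. As far as I know, glmnet's internal standardization uses the 1/n variance formula while scale() uses 1/(n-1), and with a nonzero penalty the lasso shrinks coefficients on whichever scale it is given. These two classes are also perfectly separable, so the fits at lambda = 0 are unstable to begin with.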