Solved – How to fit the coefficients for glmnet in multinomial logistic regression using lasso in R

glmnet, lasso, multinomial-distribution

I am using the 'glmnet' package to fit a multinomial logistic regression with the lasso in R.
My response has 4 categories: 'SMA', 'SMK', 'MA' and 'Tidak Melanjutkan'. I want 'SMA' as the reference category. When I use the 'coef' function, I expect coefficients for only 3 categories ('SMK', 'MA' and 'Tidak Melanjutkan'), but instead I get all 4 (including 'SMA' itself) in my output.

This is my R code for the lasso coefficients:

library(glmnet)
cvfit <- cv.glmnet(X, Y, family = "multinomial")  # cross-validated multinomial lasso
coef(cvfit, s = "lambda.1se")                     # coefficients at lambda.1se

I don't think there is a problem with the cross-validation or with my chosen value of lambda. What seems strange to me is the output, which is:

$SMA
34 x 1 sparse Matrix of class "dgCMatrix"
                                       1
(Intercept)                      -2.31541796
X1                                0.05076343
X2SMP                             0.17475872
X3Petani.Nelayan                 -0.13696304
X3Wiraswasta                      .         
X3PNS                             .         
X4Petani.Nelayan                  .         
X4Wiraswasta                      .         
X4PNS                             0.50508203
X4Ibu.Rumah.Tangga                .         
X5SD                             -0.07704557
X5SMP.Sederajat                  -0.05454831
X5SMA.Sederajat                   .         
X6SD                             -0.12731086
X6SMP.Sederajat                  -0.19367530
X6SMA.Sederajat                   .         
X7Tidak.Berpenghasilan            .         
X7.2.juta                         .         
X72.5.juta                        0.07219772
X8Tidak.Berpenghasilan            .         
X8.2.juta                         .         
X82.5.juta                        0.34388895
X91.anak                          .         
X92.anak                          .         
X93.anak                          .         
X10Melanjutkan.Pendidikan.Tinggi  0.56931042
X10Langsung.bekerja              -0.58191964
X11Biaya.yang.Murah               .         
X11Banyak.teman.yang.dikenal      .         
X11Fasilitas.yang.baik            .         
X11Lokasi.dekat.dengan.rumah      .         
X12Tidak.Ada.Diri.Sendiri        -0.05605108
X12Keluarga                       .         
X12Teman                          .         

$SMK
34 x 1 sparse Matrix of class "dgCMatrix"
                                       1
(Intercept)                       1.93666040
X1                                .         
X2SMP                             .         
X3Petani.Nelayan                  .         
X3Wiraswasta                      0.25357207
X3PNS                            -0.15172475
X4Petani.Nelayan                  .         
X4Wiraswasta                      .         
X4PNS                             .         
X4Ibu.Rumah.Tangga                .         
X5SD                              .         
X5SMP.Sederajat                   .         
X5SMA.Sederajat                   0.17115118
X6SD                              .         
X6SMP.Sederajat                   0.04149902
X6SMA.Sederajat                   .         
X7Tidak.Berpenghasilan            .         
X7.2.juta                         .         
X72.5.juta                        .         
X8Tidak.Berpenghasilan            .         
X8.2.juta                         .         
X82.5.juta                        .         
X91.anak                          .         
X92.anak                          .         
X93.anak                         -0.08781950
X10Melanjutkan.Pendidikan.Tinggi -0.39001009
X10Langsung.bekerja               0.77684899
X11Biaya.yang.Murah               .         
X11Banyak.teman.yang.dikenal     -0.40171656
X11Fasilitas.yang.baik            0.26320761
X11Lokasi.dekat.dengan.rumah     -0.72542746
X12Tidak.Ada.Diri.Sendiri         .         
X12Keluarga                       .         
X12Teman                          .         

$MA
34 x 1 sparse Matrix of class "dgCMatrix"
                                        1
(Intercept)                       1.588636804
X1                                .          
X2SMP                            -2.356876639
X3Petani.Nelayan                  0.188907523
X3Wiraswasta                     -0.068776196
X3PNS                             .          
X4Petani.Nelayan                  .          
X4Wiraswasta                      .          
X4PNS                             .          
X4Ibu.Rumah.Tangga                0.008251606
X5SD                              .          
X5SMP.Sederajat                   .          
X5SMA.Sederajat                   .          
X6SD                              .          
X6SMP.Sederajat                   .          
X6SMA.Sederajat                   .          
X7Tidak.Berpenghasilan            .          
X7.2.juta                         .          
X72.5.juta                        .          
X8Tidak.Berpenghasilan            .          
X8.2.juta                         .          
X82.5.juta                        .          
X91.anak                          .          
X92.anak                          .          
X93.anak                          .          
X10Melanjutkan.Pendidikan.Tinggi  0.102615382
X10Langsung.bekerja              -0.034986628
X11Biaya.yang.Murah               .          
X11Banyak.teman.yang.dikenal      .          
X11Fasilitas.yang.baik            .          
X11Lokasi.dekat.dengan.rumah      .          
X12Tidak.Ada.Diri.Sendiri         .          
X12Keluarga                       .          
X12Teman                          .          

$`Tidak Melanjutkan`
34 x 1 sparse Matrix of class "dgCMatrix"
                                       1
(Intercept)                      -1.20987925
X1                                .         
X2SMP                             .         
X3Petani.Nelayan                  .         
X3Wiraswasta                      .         
X3PNS                             .         
X4Petani.Nelayan                  .         
X4Wiraswasta                      .         
X4PNS                             .         
X4Ibu.Rumah.Tangga                .         
X5SD                              .         
X5SMP.Sederajat                   .         
X5SMA.Sederajat                   .         
X6SD                              .         
X6SMP.Sederajat                   .         
X6SMA.Sederajat                   .         
X7Tidak.Berpenghasilan            .         
X7.2.juta                         .         
X72.5.juta                        .         
X8Tidak.Berpenghasilan            .         
X8.2.juta                         .         
X82.5.juta                        .         
X91.anak                          .         
X92.anak                          .         
X93.anak                          .         
X10Melanjutkan.Pendidikan.Tinggi -0.10261538
X10Langsung.bekerja               0.03498663
X11Biaya.yang.Murah               1.27821100
X11Banyak.teman.yang.dikenal      0.08722154
X11Fasilitas.yang.baik            .         
X11Lokasi.dekat.dengan.rumah      .         
X12Tidak.Ada.Diri.Sendiri         .         
X12Keluarga                       .         
X12Teman                          .      

That output confused me. Why are there still 4 categories? What is the reference for my first category 'SMA', given that it has coefficients of its own?

Meanwhile, I tested the 'nnet' package without lasso regularization.

library(nnet)
data$Y2 <- relevel(data$Y, ref = "SMA")  # make 'SMA' the reference level
test <- multinom(Y2 ~ X, data = data)
summary(test)

The coefficient output (leaving aside the standard errors and significance values) was as I expected: there were only 3 logits ('SMK', 'MA' and 'Tidak Melanjutkan'). I can't post the result because it is too messy to display.

The thing is, I had already analyzed the multinomial logistic regression using SPSS. I want to compare it with lasso regularization, which as far as I know is only available in R through glmnet. My problem is the coefficients of each logit in glmnet.

Forgive me for the length and the language; I'm from Indonesia. I need to complete my minor thesis on this analysis.

Could anyone give me a hand?
Any help would be highly appreciated. Thank you very much.

Best Answer

The glmnet package uses an alternative parametrization of multinomial regression, called the symmetric parametrization, in which every class gets its own coefficient vector and no class is treated as the reference. You can find a full explanation in Section 4 of the following paper:

@article{friedman2010regularization,
  title={Regularization paths for generalized linear models via coordinate descent},
  author={Friedman, Jerome and Hastie, Trevor and Tibshirani, Rob},
  journal={Journal of Statistical Software},
  volume={33},
  number={1},
  pages={1--22},
  year={2010},
  publisher={NIH Public Access}
}
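
As a rough sketch of what the symmetric parametrization means for the output in the question, the model can be written in the usual softmax form. The index r for a chosen reference class (here 'SMA') is notation introduced for this illustration.

```latex
\Pr(Y = k \mid x) = \frac{\exp(\beta_{0k} + x^\top \beta_k)}{\sum_{\ell=1}^{K} \exp(\beta_{0\ell} + x^\top \beta_\ell)}, \qquad k = 1, \dots, K

\log \frac{\Pr(Y = k \mid x)}{\Pr(Y = r \mid x)} = (\beta_{0k} - \beta_{0r}) + x^\top (\beta_k - \beta_r)
```

Adding the same constant vector to every class's coefficients leaves these probabilities unchanged, so without a penalty the K coefficient sets are not identifiable. That is why unpenalized software such as nnet fixes one class at zero. With the lasso penalty the ambiguity in the slopes is resolved by the penalty itself, so all K classes, including 'SMA', get coefficients. The intercepts appear to be centred: the four intercepts posted in the question sum to zero up to rounding.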

The motivation for this approach seems to be mainly computational: this formulation allows a simpler algorithm than the one needed to solve the traditional K-1 parametrization.
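
If contrasts against 'SMA' are wanted for comparison with nnet or SPSS, one possible approach is to subtract the 'SMA' column from the others. The sketch below is illustrative only: `to_reference` and `as_col` are hypothetical helper names, and the small coefficient list uses just three rows copied from the output in the question. With the real fit, the list would presumably be `coef(cvfit, s = "lambda.1se")`.

```r
# Hypothetical helper: turn a list of per-class coefficient columns (the
# shape coef() returns for a multinomial glmnet fit) into contrasts against
# one reference class.
to_reference <- function(coef_list, ref) {
  stopifnot(ref %in% names(coef_list))
  # One dense column per class; as.matrix() is meant to cover both ordinary
  # matrices and the sparse matrices glmnet returns (with glmnet loaded).
  B <- sapply(coef_list, function(m) as.matrix(m)[, 1])
  # Subtract the reference column from every other column.
  B[, colnames(B) != ref, drop = FALSE] - B[, ref]
}

# Three rows taken from the output posted in the question (illustration only).
rows <- c("(Intercept)", "X10Melanjutkan.Pendidikan.Tinggi", "X10Langsung.bekerja")
as_col <- function(v) matrix(v, ncol = 1, dimnames = list(rows, "1"))
cf <- list(
  SMA                 = as_col(c(-2.31541796,  0.56931042,  -0.58191964)),
  SMK                 = as_col(c( 1.93666040, -0.39001009,   0.77684899)),
  MA                  = as_col(c( 1.588636804, 0.102615382, -0.034986628)),
  `Tidak Melanjutkan` = as_col(c(-1.20987925, -0.10261538,   0.03498663))
)

contrasts_vs_sma <- to_reference(cf, ref = "SMA")
print(round(contrasts_vs_sma, 4))

# Check that both parametrizations give the same class probabilities for a
# made-up respondent: intercept = 1, first dummy = 1, second dummy = 0.
softmax <- function(eta) exp(eta) / sum(exp(eta))
x <- c(1, 1, 0)
B_sym <- sapply(cf, function(m) m[, 1])
p_sym <- softmax(drop(x %*% B_sym))
p_ref <- softmax(c(SMA = 0, drop(x %*% contrasts_vs_sma)))
print(all.equal(p_sym, p_ref))
```

Note that these differences are still penalized estimates. They describe the same fitted probabilities as the glmnet output, but they are shrunk, so they should not be expected to match the unpenalized nnet or SPSS coefficients. A difference can also be nonzero even when one of the two coefficients was set to zero by the lasso.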