Logistic Regression vs Multinomial Regression – Differences and Use Cases Explained

logistic, multinomial-distribution, r

Is it viable to fit several binary logistic regressions instead of one multinomial regression? From the question "Multinomial logistic regression vs one-vs-rest binary logistic regression" I see that the multinomial regression might have lower standard errors.

However, the package I would like to utilize has not been generalized to multinomial regression (ncvreg: http://cran.r-project.org/web/packages/ncvreg/ncvreg.pdf) and so I was wondering if I could simply do several binary logistic regressions instead.

Best Answer

With a multinomial logit model you impose the constraint that all the predicted probabilities add up to 1. When you use separate binary logit models you can no longer impose that constraint, because each one is estimated as a separate model. That is the main difference between the two approaches.
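To make the constraint concrete, here is a short derivation for three outcome categories with category 3 as the base. The notation (covariate vector x, coefficient vectors beta_1 and beta_2) is an assumption introduced for illustration and does not appear in the original answer.

```latex
\begin{align*}
P(y = j \mid x) &= \frac{\exp(x'\beta_j)}{1 + \sum_{k=1}^{2} \exp(x'\beta_k)}, \qquad j = 1, 2, \\
P(y = 3 \mid x) &= \frac{1}{1 + \sum_{k=1}^{2} \exp(x'\beta_k)}, \\
\sum_{j=1}^{3} P(y = j \mid x) &= 1, \\
P(y = j \mid y \in \{j, 3\},\, x) &= \frac{\exp(x'\beta_j)}{1 + \exp(x'\beta_j)}, \qquad j = 1, 2.
\end{align*}
```

The last line has the form of a binary logit with the same beta_j. This is one way to see why a binary logit fitted only to the observations in category j or the base category tends to give coefficients close to, but not identical to, those of the multinomial model: the separate fits use less of the data and nothing ties them together.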

As you can see in the example below (in Stata, as that is the program I know best), the models tend to be similar but not the same. I would be especially careful about extrapolating predicted probabilities.

// some data preparation
. sysuse nlsw88, clear                                                               
(NLSW, 1988 extract)                                                                 

.                                                                                    
. gen byte occat = cond(occupation < 3                 , 1,      ///                 
>                  cond(inlist(occupation, 5, 6, 8, 13), 2, 3))  ///                 
>                  if !missing(occupation)                                           
(9 missing values generated)                                                         

. label variable occat "occupation in categories"                                    

. label define occat 1 "high"   ///                                                  
>                    2 "middle" ///                                                  
>                    3 "low"                                                         

. label value occat occat                                                            

.                                                                                    
. gen byte middle = (occat == 2) if occat !=1 & !missing(occat)                      
(590 missing values generated)                                                       

. gen byte high   = (occat == 1) if occat !=2 & !missing(occat)                      
(781 missing values generated)                                                       


// a multinomial logit model
. mlogit occat i.race i.collgrad , base(3) nolog                                     

Multinomial logistic regression                   Number of obs   =       2237       
                                                  LR chi2(6)      =     218.82       
                                                  Prob > chi2     =     0.0000       
Log likelihood = -2315.9312                       Pseudo R2       =     0.0451       

-------------------------------------------------------------------------------      
        occat |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]      
--------------+----------------------------------------------------------------      
high          |                                                                      
         race |                                                                      
       black  |  -.4005801   .1421777    -2.82   0.005    -.6792433    -.121917      
       other  |   .4588831   .4962591     0.92   0.355    -.5137668    1.431533      
              |                                                                      
     collgrad |                                                                      
college grad  |   1.495019   .1341625    11.14   0.000     1.232065    1.757972      
        _cons |  -.7010308   .0705042    -9.94   0.000    -.8392165   -.5628451      
--------------+----------------------------------------------------------------      
middle        |                                                                      
         race |                                                                      
       black  |   .6728568   .1106792     6.08   0.000     .4559296     .889784      
       other  |   .2678372    .509735     0.53   0.599    -.7312251    1.266899      
              |                                                                      
     collgrad |                                                                      
college grad  |    .976244   .1334458     7.32   0.000      .714695    1.237793      
        _cons |   -.517313   .0662238    -7.81   0.000    -.6471092   -.3875168      
--------------+----------------------------------------------------------------      
low           |  (base outcome)                                                      
-------------------------------------------------------------------------------      

// separate logits:
. logit high   i.race i.collgrad , nolog                                             

Logistic regression                               Number of obs   =       1465       
                                                  LR chi2(3)      =     154.21       
                                                  Prob > chi2     =     0.0000       
Log likelihood = -906.79453                       Pseudo R2       =     0.0784       

-------------------------------------------------------------------------------      
         high |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]      
--------------+----------------------------------------------------------------      
         race |                                                                      
       black  |  -.5309439   .1463507    -3.63   0.000     -.817786   -.2441017      
       other  |   .2670161   .5116686     0.52   0.602     -.735836    1.269868      
              |                                                                      
     collgrad |                                                                      
college grad  |   1.525834   .1347081    11.33   0.000     1.261811    1.789857      
        _cons |  -.6808361   .0694323    -9.81   0.000     -.816921   -.5447512      
-------------------------------------------------------------------------------      

. logit middle i.race i.collgrad , nolog                                             

Logistic regression                               Number of obs   =       1656       
                                                  LR chi2(3)      =      90.13       
                                                  Prob > chi2     =     0.0000       
Log likelihood = -1098.9988                       Pseudo R2       =     0.0394       

-------------------------------------------------------------------------------      
       middle |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]      
--------------+----------------------------------------------------------------      
         race |                                                                      
       black  |   .6942945   .1114418     6.23   0.000     .4758725    .9127164      
       other  |   .3492788   .5125802     0.68   0.496    -.6553598    1.353918      
              |                                                                      
     collgrad |                                                                      
college grad  |   .9979952   .1341664     7.44   0.000     .7350339    1.260957      
        _cons |  -.5287625   .0669093    -7.90   0.000    -.6599023   -.3976226      
-------------------------------------------------------------------------------
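
As a rough illustration of how close the two sets of estimates are, the intercepts reported above can be turned into predicted probabilities for the reference person (race = white, not a college graduate). This is only a sketch in base R: `to_probs()` is a hypothetical helper, and normalising the odds from the separate logits is one possible way to get probabilities that sum to 1, not something the separate models impose themselves.

```r
# Intercepts copied from the Stata output above; they describe the reference
# person (race = white, not a college graduate).
mlogit_cons <- c(high = -0.7010308, middle = -0.517313)   # mlogit, base = low
logit_cons  <- c(high = -0.6808361, middle = -0.5287625)  # separate logits

# Turn linear predictors relative to the base category into probabilities.
# to_probs() is a hypothetical helper, not part of any package.
to_probs <- function(eta) {
  odds <- exp(eta)                      # odds of each category versus "low"
  c(odds, low = 1) / (1 + sum(odds))    # normalise so the three sum to 1
}

p_mlogit   <- to_probs(mlogit_cons)
p_separate <- to_probs(logit_cons)

print(round(rbind(mlogit = p_mlogit, separate = p_separate), 3))

# Each binary logit on its own only gives a conditional probability,
# e.g. P(high | high or low), which can be compared with the mlogit analogue.
cond_high_separate <- plogis(logit_cons[["high"]])
cond_high_mlogit   <- plogis(mlogit_cons[["high"]])
```

For covariate patterns with few observations (for example race = other), the coefficients differ more between the two approaches, so the predicted probabilities can be expected to diverge more as well.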

Related Solutions

Solved – Meaning of intercept in multinomial regression with binary predictors

In a multinomial logistic regression with 3 levels of the DV there ought to be two intercepts. How exactly these are defined depends on which level is the reference. Each intercept is the value of the corresponding logit when the independent variables are all 0; in your case, when risk is high.

I wrote a presentation on multinomial and ordinal logistic regression; it concentrates somewhat on SAS, but some of it may be useful even if you are using another package.

Solved – KNN imputation R packages

You could also try the following package: DMwR.

It failed for k = 3 nearest neighbours, giving 'Error in knnImputation(x, k = 3) : Not sufficient complete cases for computing neighbors.'

However, trying k = 2 gives:

> knnImputation(x,k=2)
             [,1]       [,2]       [,3]       [,4]       [,5]        [,6]
 [1,] -0.59091360 -1.2698175  0.5556009 -0.1327224 -0.8325065  0.71664000
 [2,] -1.27255074 -0.7853602  0.7261897  0.2969900  0.2969556 -0.44612831
 [3,]  0.55473981  0.4748735  0.5158498 -0.9493917 -1.5187722 -0.99377854
 [4,] -0.47797654  0.1647818  0.6167311 -0.5149731  0.5240514 -0.46027809
 [5,] -1.08767831 -0.3785608  0.6659499 -0.7223724 -0.9512409 -1.60547053
 [6,] -0.06153279  0.9486815 -0.5464601  0.1544475  0.2835521 -0.82250221
 [7,] -0.82536029 -0.2906253 -3.0284281 -0.8473210  0.7985286 -0.09751927
 [8,] -1.15366189  0.5341000 -1.0109258 -1.5900281  0.2742328  0.29039928
 [9,] -1.49504465 -0.5419533  0.5766574 -1.2412777 -1.4089572 -0.71069839
[10,] -0.35935440 -0.2622265  0.4048126 -2.0869817  0.2682486  0.16904559
             [,7]       [,8]        [,9]      [,10]
 [1,]  0.58027159 -1.0669137  0.48670802  0.5824858
 [2,] -0.48314440 -1.0532693 -0.34030385 -1.1041681
 [3,] -2.81996446  0.3191438 -0.48117020 -0.0352633
 [4,] -0.55080515 -1.0620243 -0.51383557  0.3161907
 [5,] -0.56808769 -0.3696951  0.35549191  0.3202675
 [6,] -0.25043479 -1.0389393  0.07810902  0.5251606
 [7,] -0.41667318  0.8809541 -0.04613332 -1.1586756
 [8,] -0.06898363 -1.0736161  0.62698065 -1.0373835
 [9,]  0.30051583 -0.2936140  0.31417921 -1.4155193
[10,] -0.68180034 -1.0789745  0.58290920 -1.0197956

You can test for sufficient observations using complete.cases(x): the number of complete rows, sum(complete.cases(x)), must be at least k.
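
The check described above can be sketched in base R as follows. `has_enough_complete_rows()` is a hypothetical helper written for this illustration (it is not part of DMwR), and the seed and the masking threshold are arbitrary assumptions.

```r
set.seed(1)                           # arbitrary seed, for reproducibility only
x <- matrix(rnorm(100), 10, 10)
x[x > 1] <- NA                        # mask large values as missing

# has_enough_complete_rows() is a hypothetical helper, not part of DMwR.
# It returns TRUE when at least k rows of x contain no missing values.
has_enough_complete_rows <- function(x, k) {
  sum(complete.cases(x)) >= k
}

n_complete <- sum(complete.cases(x))
k <- 3
if (has_enough_complete_rows(x, k)) {
  message("OK: ", n_complete, " complete rows for k = ", k)
} else {
  message("Too few complete rows (", n_complete, ") for k = ", k)
}
```

Running a check like this before calling knnImputation(x, k = k) lets you pick a smaller k or gather more data instead of hitting the error.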

One way to overcome this problem is to make sure more rows are complete, either by 1) raising the threshold above which values are set to NA, so that fewer rows are incomplete, or 2) increasing the number of observations.

Here is the first:

> x = matrix(rnorm(100),10,10)
> x.missing = x > 2
> x[x.missing] = NA
> complete.cases(x)
 [1]  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
> knnImputation(x,k=3)
             [,1]       [,2]       [,3]       [,4]        [,5]       [,6]       [,7]        [,8]        [,9]       [,10]
 [1,]  0.86882569 -0.2409922  0.3859031  0.5818927 -1.50310330  0.8752261 -0.5173105 -2.18244988 -0.28817656 -0.63941237
 [2,]  1.54114079  0.7227511  0.7856277  0.8512048 -1.32442954 -2.1668744  0.7017532 -0.40086348 -0.41251883  0.42924986
 [3,]  0.60062917 -0.5955623  0.6192783 -0.3836310  0.06871570  1.7804657  0.5965411 -1.62625036  1.27706937  0.72860273
 [4,] -0.07328279 -0.1738157  1.4965579 -1.1686115 -0.06954318 -1.0171604 -0.3283916  0.63493884  0.72039689 -0.20889111
 [5,]  0.78747874 -0.8607320  0.4828322  0.6558960 -0.22064430  0.2001473  0.7725701  0.06155196  0.09011719 -1.01902968
 [6,]  0.17988720 -0.8520000 -0.5911523  1.8100573 -0.56108621  0.0151522 -0.2484345 -0.80695513 -0.18532984 -1.75115335
 [7,]  1.03943492  0.4880532 -2.7588922 -0.1336166 -1.28424057  1.2871333  0.7595750 -0.55615677 -1.67765572 -0.05440992
 [8,]  1.12394474  1.4890366 -1.6034648 -1.4315445 -0.23052386 -0.3536677 -0.8694188 -0.53689507 -1.11510406 -1.39108817
 [9,] -0.30393916  0.6216156  0.1559639  1.2297105 -0.29439390  1.8224512 -0.4457441 -0.32814665  0.55487894 -0.22602598
[10,]  1.18424722 -0.1816049 -2.2975095 -0.7537477  0.86647524 -0.8710603  0.3351710 -0.79632184 -0.56254688 -0.77449398
> x
             [,1]       [,2]       [,3]       [,4]       [,5]       [,6]       [,7]        [,8]        [,9]       [,10]
 [1,]  0.86882569 -0.2409922  0.3859031  0.5818927 -1.5031033  0.8752261 -0.5173105 -2.18244988 -0.28817656 -0.63941237
 [2,]  1.54114079  0.7227511  0.7856277  0.8512048 -1.3244295 -2.1668744  0.7017532 -0.40086348 -0.41251883  0.42924986
 [3,]  0.60062917 -0.5955623  0.6192783 -0.3836310  0.0687157  1.7804657  0.5965411 -1.62625036  1.27706937  0.72860273
 [4,] -0.07328279 -0.1738157  1.4965579 -1.1686115         NA -1.0171604 -0.3283916  0.63493884  0.72039689 -0.20889111
 [5,]  0.78747874 -0.8607320  0.4828322         NA -0.2206443  0.2001473  0.7725701  0.06155196  0.09011719 -1.01902968
 [6,]  0.17988720 -0.8520000 -0.5911523  1.8100573 -0.5610862  0.0151522 -0.2484345 -0.80695513 -0.18532984 -1.75115335
 [7,]  1.03943492  0.4880532 -2.7588922 -0.1336166 -1.2842406  1.2871333  0.7595750 -0.55615677 -1.67765572 -0.05440992
 [8,]  1.12394474  1.4890366 -1.6034648 -1.4315445 -0.2305239 -0.3536677 -0.8694188 -0.53689507 -1.11510406 -1.39108817
 [9,] -0.30393916  0.6216156  0.1559639  1.2297105 -0.2943939  1.8224512 -0.4457441 -0.32814665  0.55487894 -0.22602598
[10,]  1.18424722 -0.1816049 -2.2975095 -0.7537477  0.8664752 -0.8710603  0.3351710 -0.79632184 -0.56254688 -0.77449398

Here is an example of the second:

x = matrix(rnorm(1000),100,10)
x.missing = x > 1
x[x.missing] = NA

complete.cases(x)

  [1]  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE
 [22] FALSE FALSE  TRUE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [43]  TRUE FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [64] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
 [85] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE

There are at least k = 3 complete rows, so knnImputation is able to impute with k = 3.

> head(knnImputation(x,k=3))
            [,1]       [,2]       [,3]       [,4]       [,5]       [,6]       [,7]       [,8]        [,9]       [,10]
[1,]  0.01817557 -2.8141502  0.3929944  0.1495092 -1.7218396  0.4159133 -0.8438809  0.6599224 -0.02451113 -1.14541016
[2,]  0.51969964 -0.4976021 -0.1495392 -0.6448184 -0.6066386 -1.6210476 -0.3118440  0.2477855 -0.30986749  0.32424673
...
