Odds Ratios – Is It Normal to Get Very High Odds Ratios and Wide Confidence Intervals?

confidence interval, generalized linear model, logistic, odds-ratio, r

I am fitting a logistic regression with several predictors in R.

This is the code I used:

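library(epiDisplay)  # provides logistic.display()

# Logistic regression for COPD prevalence (a plain glm, despite the object name)
glmer.1 <- glm(PREVALENCIA_EPOC ~ Categorizacion + SEX + TABACO +
                 ESTUDIO + EXP3 + IMC2 + rdoPDB + MMRC + Diagnostico,
               data = BD_GLM,
               family = binomial(link = "logit"))

# Crude and adjusted odds ratios with 95% confidence intervals
TABLA <- logistic.display(glmer.1, crude.p.value = TRUE, decimal = 3)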

However, the results show very high odds ratios and wide 95% confidence intervals:

Variable            Category     Sample size  Crude OR (95% CI)         P-value    Adjusted OR (95% CI)       Adjusted P-value
Age                 40-49        328          REF
                    50-59        414          2.37 (1.138,4.938)        0.0212     4.134 (1.758,9.721)        0.0011
                    60-69        309          3.021 (1.422,6.417)       0.004      5.956 (2.314,15.334)       < 0.001
                    70-79        208          6.706 (3.244,13.862)      < 0.001    11.838 (4.443,31.541)      < 0.001
                    >=80         52           29.864 (4.517,197.46)     < 0.001    51.403 (2.43,1087.237)     0.0114
Sex                 Men          628          2.634 (1.677,4.139)       < 0.001    3.905 (1.992,7.655)        < 0.001
                    Women        687          REF
Tobacco habit       Smoker       340          3.381 (1.888,6.055)       < 0.001    9.299 (4.006,21.584)       < 0.001
                    Ex Smoker    449          2.457 (1.372,4.402)       0.0025     3.812 (1.716,8.469)        0.001
                    Non-smoker   526          REF
Studies             Less         49           2.643 (1.233,5.665)       0.0125     2.42 (0.807,7.26)          0.1149
                    Primary      241          1.175 (0.672,2.053)       0.5723     1.021 (0.474,2.199)        0.9573
                    Secondary    339          0.804 (0.446,1.448)       0.4673     0.873 (0.418,1.825)        0.7177
                    University   675          3.398 (0.831,13.896)      0.0887     3.231 (0.601,17.364)       0.1717
                    DK/NA        11           REF
Risk                Yes          542          0.825 (0.52,1.31)         0.4153     1.635 (0.83,3.221)         0.1554
                    No           228          REF
BMI                              1315         0.983 (0.938,1.031)       0.4827     0.903 (0.843,0.968)        0.0037
Positive test       Yes          1188         0.205 (0.127,0.333)       < 0.001    0.257 (0.138,0.48)         < 0.001
                    No           122          REF
Physical activity   Grade 1      878          REF
                    Grade 2      368          2.527 (1.493,4.277)       < 0.001    1.715 (0.827,3.556)        0.1475
                    Grade 3      55           6.541 (2.917,14.666)      < 0.001    2.277 (0.643,8.062)        0.2022
                    Grade 4      13           15.164 (2.48,92.704)      0.0032     8.32 (0.683,101.399)       0.0968
                    Grade 5      1
Previous diagnosis  Yes          1088         0.103 (0.063,0.167)       < 0.001    0.097 (0.049,0.193)        < 0.001
                    No           227          REF

As you can see, for the Age >=80 category the adjusted OR is 51.403 with a 95% CI of (2.43, 1087.237). I don't know if this is due to the sample size. I have a total sample size of 764 and a much smaller sample size per category. Is there a minimum sample size per category needed for the test to have adequate power?

Best Answer

In this specific case, note that the >=80 subgroup contains only 52 patients, so the standard error of its coefficient will be quite large. You should also check whether any cell in that group has zero (or very few) observations, for example almost no patients aged >=80 without the outcome, since empty or sparse cells make the CI even wider.
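To put a number on "quite large", the standard error can be backed out of the reported interval. This is only a sketch: it assumes the table shows Wald intervals (estimate ± 1.96 SE on the log-odds scale), which is consistent with the reported CIs being symmetric around the OR on the log scale. The helper names `se_from_ci` and `se_log_or` are made up for illustration, and the 2x2 counts in the second part are hypothetical, not taken from the question.

```r
# Back out the standard error of a log odds ratio from a reported CI,
# assuming a Wald interval: exp(beta +/- z * SE).
se_from_ci <- function(lower, upper, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  (log(upper) - log(lower)) / (2 * z)
}

# Adjusted 95% CIs reported in the question
se_age80 <- se_from_ci(2.43, 1087.237)   # Age >= 80 (52 patients)
se_age50 <- se_from_ci(1.758, 9.721)     # Age 50-59 (414 patients)

# Woolf standard error of a log odds ratio from the four cells of a 2x2 table
se_log_or <- function(a, b, c, d) sqrt(1 / a + 1 / b + 1 / c + 1 / d)

# Hypothetical counts (NOT from the question): same group sizes,
# but one version has a nearly empty cell
se_sparse   <- se_log_or(50, 2, 150, 178)
se_balanced <- se_log_or(26, 26, 150, 178)

print(round(c(se_age80 = se_age80, se_age50 = se_age50,
              se_sparse = se_sparse, se_balanced = se_balanced), 3))
```

Because the smallest cell contributes the largest 1/count term, a single sparse cell is enough to inflate the standard error, and the CI widens multiplicatively once it is exponentiated back to the odds-ratio scale.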

Also, have you tried fitting the regression directly in that subgroup and then using the exp and confint functions to check whether the results match the logistic.display output?
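A minimal sketch of that kind of check, using simulated data because the poster's `BD_GLM` is not available. The data frame `d`, its columns `age_grp` and `y`, and the effect sizes are all invented for illustration. `confint.default()` gives Wald intervals; plain `confint()` on a glm gives profile-likelihood intervals, which can differ noticeably when a category is small.

```r
set.seed(1)

# Simulated data (hypothetical): a three-level age factor with one rare level
n <- 600
d <- data.frame(
  age_grp = factor(sample(c("40-49", "50-59", ">=80"), n, replace = TRUE,
                          prob = c(0.48, 0.48, 0.04)),
                   levels = c("40-49", "50-59", ">=80"))
)
# Assumed true log-odds per level, used only to generate the outcome
lp <- -2 + c(0, 0.8, 2.5)[as.integer(d$age_grp)]
d$y <- rbinom(n, 1, plogis(lp))

# Cross-tabulate predictor by outcome to spot empty or sparse cells
print(table(d$age_grp, d$y))

fit <- glm(y ~ age_grp, data = d, family = binomial(link = "logit"))

# Odds ratios with Wald 95% CIs: exponentiate coefficients and their limits
or_wald <- exp(cbind(OR = coef(fit), confint.default(fit)))
print(round(or_wald, 3))
```

With the real model, the same two steps would be `table(BD_GLM$Categorizacion, BD_GLM$PREVALENCIA_EPOC)` and `exp(cbind(OR = coef(glmer.1), confint.default(glmer.1)))`, assuming `Categorizacion` is the age variable.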