Lower and upper bounds of the beta estimate in regression are negative and positive, respectively, for each predictor

confidence-interval, effect-size, logistic, p-value, regression

I have a logistic regression model as follows:

dependent ~ var1 + var2 + var3 + ... + var24

The model output shows some significant beta estimates, but for almost all of the variables the upper and lower bounds of the beta estimate have opposite signs.

For example, for var4 the estimate is -0.118, the upper bound (97.5 %) is 1.709, and the lower bound (2.5 %) is -1.999. The two bounds have opposite signs and imply completely different directions of effect for the variable.

What is going on here? The only pattern I have noticed is that when the p-value is < 0.05, the upper and lower bounds have the same sign, whereas this is not necessarily the case when the p-value is >= 0.05.

Variable Estimate Std. Error z value P value 2.5 % 97.5 %
var1 -10.5633455339251 0.759594371480191 -13.906561094365 5.77920204964665e-44 -12.0565995298244 -9.07846678375401
var2 0.834244972702608 0.0597828513393095 13.9545865413425 2.95007544494396e-44 0.717795996175918 0.95218596536246
var3 0.670686357092985 0.21666036968767 3.09556546063234 0.0019643800732445 0.247189216918815 1.09664152539369
var4 -0.11827182131671 0.944136983515969 -0.125269768456973 0.900309984899174 -1.99985360814008 1.70919144266964
var5 -0.838742349613382 1.44030127660015 -0.582338128306907 0.560338947926719 -3.99745084094769 1.7778193195088
var6 0.00351026911347644 0.00375769899895975 0.934153883652786 0.350224520205816 -0.0035456482198036 0.0113524222954051
var7 -0.000731414874584177 0.00487809942760052 -0.149938492529651 0.88081314233484 -0.00975234521520494 0.00976533653881936
var8 -1.19552692531913e-05 0.00673205576541774 -0.0017758719876631 0.998583059903854 -0.0116320385616292 0.0163799646989692
var9 0.000137595011787968 0.00472745479443718 0.0291055161330949 0.976780436425101 -0.008764713087581 0.00983780537950456
var10 -0.137314541163203 0.0393174449097 -3.49245841072767 0.00047859611846024 -0.213673700147479 -0.0594908796696738
var11 0.0279370642051698 0.0238522441522973 1.17125516688454 0.241496226532665 -0.0188823270944542 0.0746333490175093
var12 -0.00395247678315438 0.00214161113078487 -1.84556230883328 0.0649557834759472 -0.00817664917521618 0.000219701951803874
var13 1.21509909187451 0.0984290947580135 12.3449178808544 5.18871561184112e-35 1.02156762761588 1.40749581880397
var14 -0.00451544526175419 0.00394966580859281 -1.14324742410623 0.252935877243301 -0.0122986061668043 0.00318802869388361
var15 0.0135134262292429 0.0126865181256071 1.06518006717436 0.286794451982887 -0.0112553514129102 0.0384819312617267
var16 -0.0138442704104665 0.00732732910516601 -1.88940201972173 0.0588379804731442 -0.028234664566532 0.00049389410510635
var17 0.0111374372976773 0.0203408759664438 0.547539708518485 0.584007997795982 -0.0288601526385988 0.0508878526572194
var18 0.0149310101117362 0.00811568787987997 1.8397713579834 0.0658018121433485 -0.000971969252617372 0.0308463266255572
var19 0.00311471128042081 0.00948300727316004 0.328451849787824 0.74257004491251 -0.0155133744502076 0.0216618395316565
var20 0.00585217320183963 0.0143809113296191 0.406940357791264 0.684051793713323 -0.0222225404766296 0.0341636722917656
var21 0.0068148700357023 0.0120561599031068 0.565260422097266 0.571896644381092 -0.0164317427744267 0.0308539253978295
var22 0.835607905844263 0.104779189340344 7.97494150417639 1.52453009251868e-15 0.629139007916138 1.03997790608976
var23 -0.0073298137446894 0.00467548343417065 -1.56771248318829 0.116948247068106 -0.0165766110799228 0.00175730977838965
var24 -0.000266749353306101 0.00806360445460414 -0.0330806595000816 0.973610265791254 -0.0161678639143166 0.0154513944600894
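(Output in this shape matches what R's summary() and confint() report for a glm() fit. The actual code is not shown in the question, so the following is only a sketch of how such a table might be assembled; the data frame dat and the exact call are placeholder assumptions.)

# Hypothetical reconstruction; dat and the formula are assumptions
fit <- glm(dependent ~ var1 + var2 + var3,   # ... through var24
           data = dat, family = binomial)

coefs <- coef(summary(fit))   # Estimate, Std. Error, z value, Pr(>|z|)
ci    <- confint(fit)         # 2.5 % and 97.5 % bounds (profile likelihood)

cbind(coefs, ci)              # combined table, as posted above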

Best Answer

Your "upper/lower bounds" form 95% confidence intervals. Their interpretation is: if you were to repeat your experiment many times, calculating 95% CIs each time, then 95% of these CIs would contain the true parameter (assuming your model specification is correct). Yes, that is a very cumbersome definition, but it is the best that frequentist statistics can do.

There is indeed a direct connection between p values (here from two-sided Wald z tests, as the z value column indicates) and CIs: the 95% CI excludes zero exactly when $p<.05$ (exactly so for Wald intervals; very nearly so if the reported bounds are profile-likelihood intervals).
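As a worked check with the numbers from your own table (the exact CI method is not stated, so treat this as an approximation): the Wald interval for var4 is $-0.118 \pm 1.96 \times 0.944 \approx (-1.97, 1.73)$, close to the reported $(-2.00, 1.71)$; the small discrepancy is what you would expect if the reported bounds are profile-likelihood intervals, as R's confint() returns for a glm. Either way the interval straddles zero, in line with the p value of 0.90.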

In your case, most of your parameters have CIs that include zero. In classical parlance, those parameter estimates are not significantly different from zero; that is, you cannot reject the null hypothesis that the true value of each of these parameters is zero.
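As a quick check on the pattern you noticed, you can line the p values up against whether the interval contains zero; the snippet below uses a few rows copied from your table (rounded, and only a subset for brevity).

# A few rows copied from the posted table (rounded; subset for brevity)
tab <- data.frame(
  variable = c("var3", "var4", "var10", "var12"),
  p        = c(0.00196, 0.90031, 0.00048, 0.06496),
  lower    = c( 0.24719, -1.99985, -0.21367, -0.00818),
  upper    = c( 1.09664,  1.70919, -0.05949,  0.00022)
)

tab$contains_zero <- tab$lower < 0 & tab$upper > 0
tab$significant   <- tab$p < 0.05
tab
# Intervals that contain zero correspond to p >= 0.05; var12 is a borderline
# case, with p = 0.065 and an interval that only just includes zero.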

Also, you have a huge model. If you are estimating no fewer than 24 slope parameters (plus an intercept), I truly hope you have several thousand data points; with much less than that, the model will be badly overparameterized. A common rule of thumb for logistic regression is at least 10 to 20 events per estimated parameter, which for a model of this size means roughly 250 to 500 or more observations in the rarer outcome class.
