Solved – Confidence interval for the intercept in logistic regression

confidence intervalinterceptlogisticspss

Some major commercial statistical packages (e.g., SPSS) do not report a CI for the intercept term in logistic regression. [Based on answer below R does provide CI for intercept]

Why might confidence intervals for the intercept term not be included by default?

UPDATE:

Based on feedback, Confidence Intervals for odds associated with intercept term are reported in some stat. packages. And, obviously, they can be easily computed manually, knowing the standard error. Therefore, reformulated question:

What is the interpretation of intercept's CIs (for odds) in logistic regression?

Best Answer

Well, that's not correct. You can use confint funtion in both S-plus and R to obtain C.I. for the estimated parameters. I will give an example for logistic regression in R since I don't have the S-plus:

> mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")
> mylogit <- glm(admit ~ gre + gpa + rank, data = mydata, family = "binomial")
> confint(mylogit)
Waiting for profiling to be done...
                    2.5 %       97.5 %
(Intercept) -5.7109591680 -1.260314066
gre          0.0001715446  0.004461385
gpa          0.1415710585  1.428341503
rank        -0.8149612229 -0.315479733
>

Related Solutions

Solved – Understanding confidence intervals in Firth penalized logistic regression

The fact that firth=FALSE doesn't give similar results to glm is puzzling to me -- hopefully someone else can answer. As far as pl goes, though, you're almost always better off with profile confidence intervals. The Wald confidence intervals assume that the (implicit) log-likelihood surface is locally quadratic, which is often a bad approximation. Except that they're more computationally intensive, profile confidence intervals are always (? I would welcome counterexamples ?) more accurate. The "improved" p values you get from the Wald estimates are likely overoptimistic.

Generate data:

dd <- data.frame(X=rep(c("yes","no"),c(22,363)),
             Y=rep(c("no","yes","no"),c(22,7,356)))
with(dd,table(X,Y))

Replicate:

m_glm <-glm(Y~X,family=binomial,data=dd)
library("logistf")
m_fp <-logistf(Y~X,data=dd,pl=TRUE,firth=TRUE)
m_mp <- logistf(Y~X,data=dd,pl=TRUE,firth=FALSE)
m_fw <-logistf(Y~X,data=dd,pl=FALSE,firth=TRUE)
m_mw <-logistf(Y~X,data=dd,pl=FALSE,firth=FALSE)

Compare Wald (confint.default) with profile CIs for glm (in this case the profile intervals are actually narrower).

confint.default(m_glm)  ## {-2740, 2710}
confint(m_glm)          ## {NA, 118}

Comparing with the glm2 package (just to make sure that glm isn't doing something wonky).

library("glm2")
glm2(Y~X,family=binomial,data=dd)
## similar results to glm(...)

Solved – Computing Confidence Intervals for Odds Ratio Based on Fitted Values

As I've commented at Stack Overflow, you could bootstrap them:

library(boot)
set.seed(42)
bootres <- boot(Arthritis, function(DF, i) {
  fit <- glm(Y ~ Sex + Treatment, family=binomial, data=Arthritis[i,])
  df <- expand.grid(Sex=levels(Arthritis$Sex), 
                    Treatment=levels(Arthritis$Treatment)) %>%
    mutate(link=predict(fit, newdata=., type="link"),
           se=predict(fit, newdata=., type="link", se.fit=T)$se.fit,
           odds=exp(link))
  df$odds/filter(df, Sex=="Female" & Treatment=="Placebo")$odds

}, R = 1000, strata = interaction(Arthritis$Treatment, Arthritis$Sex))

apply(bootres$t, 2, quantile, probs = c(0.025, 0.5, 0.975))
#      [,1]       [,2]      [,3]      [,4]
#2.5%     1 0.06236299  2.397555 0.3620032
#50%      1 0.21591348  6.295761 1.4285296
#97.5%    1 0.65358623 20.641313 5.2671106

I show stratified bootstrap here, but you get very similar results for non-stratified bootstrap.

Best Answer

Related Solutions

Solved – Understanding confidence intervals in Firth penalized logistic regression

Solved – Computing Confidence Intervals for Odds Ratio Based on Fitted Values

Related Question