The fact that firth=FALSE
doesn't give similar results to glm
is puzzling to me -- hopefully someone else can answer. As far as pl
goes, though, you're almost always better off with profile confidence intervals. The Wald confidence intervals assume that the (implicit) log-likelihood surface is locally quadratic, which is often a bad approximation. Except that they're more computationally intensive, profile confidence intervals are always (? I would welcome counterexamples ?) more accurate. The "improved" p values you get from the Wald estimates are likely overoptimistic.
Generate data:
dd <- data.frame(X=rep(c("yes","no"),c(22,363)),
Y=rep(c("no","yes","no"),c(22,7,356)))
with(dd,table(X,Y))
Replicate:
m_glm <-glm(Y~X,family=binomial,data=dd)
library("logistf")
m_fp <-logistf(Y~X,data=dd,pl=TRUE,firth=TRUE)
m_mp <- logistf(Y~X,data=dd,pl=TRUE,firth=FALSE)
m_fw <-logistf(Y~X,data=dd,pl=FALSE,firth=TRUE)
m_mw <-logistf(Y~X,data=dd,pl=FALSE,firth=FALSE)
Compare Wald (confint.default
) with profile CIs for glm
(in this case the profile intervals are actually narrower).
confint.default(m_glm) ## {-2740, 2710}
confint(m_glm) ## {NA, 118}
Comparing with the glm2
package (just to make sure that glm
isn't doing something wonky).
library("glm2")
glm2(Y~X,family=binomial,data=dd)
## similar results to glm(...)
As I've commented at Stack Overflow, you could bootstrap them:
library(boot)
set.seed(42)
bootres <- boot(Arthritis, function(DF, i) {
fit <- glm(Y ~ Sex + Treatment, family=binomial, data=Arthritis[i,])
df <- expand.grid(Sex=levels(Arthritis$Sex),
Treatment=levels(Arthritis$Treatment)) %>%
mutate(link=predict(fit, newdata=., type="link"),
se=predict(fit, newdata=., type="link", se.fit=T)$se.fit,
odds=exp(link))
df$odds/filter(df, Sex=="Female" & Treatment=="Placebo")$odds
}, R = 1000, strata = interaction(Arthritis$Treatment, Arthritis$Sex))
apply(bootres$t, 2, quantile, probs = c(0.025, 0.5, 0.975))
# [,1] [,2] [,3] [,4]
#2.5% 1 0.06236299 2.397555 0.3620032
#50% 1 0.21591348 6.295761 1.4285296
#97.5% 1 0.65358623 20.641313 5.2671106
I show stratified bootstrap here, but you get very similar results for non-stratified bootstrap.
Best Answer
Well, that's not correct. You can use
confint
funtion in both S-plus and R to obtain C.I. for the estimated parameters. I will give an example for logistic regression in R since I don't have the S-plus: