Solved – Computing Confidence Intervals for Odds Ratio Based on Fitted Values

confidence intervallogisticr

I am trying to figure out how to get confidence intervals for odds ratios that were computed based on model fitted values from a logistic regression.

Here's an example with the basic idea:

#Load packages
library(vcd)
library(dplyr)
data(Arthritis)

#Create binary outcome for logistic regression
Arthritis$Y <- ifelse(Arthritis$Improved=="None", 0,1)

#Fit Model
fit <- glm(Y ~ Sex + Treatment, family=binomial, data=Arthritis)

#Create small set to predict on, get fitted values, SEs, and odds
df <- expand.grid(Sex=levels(Arthritis$Sex), 
                  Treatment=levels(Arthritis$Treatment)) %>%
  mutate(link=predict(fit, newdata=., type="link"),
         se=predict(fit, newdata=., type="link", se.fit=T)$se.fit,
         odds=exp(link))

#Odds ratio with 'Female, Placebo' as referent
df$OR <- df$odds/filter(df, Sex=="Female" & Treatment=="Placebo")$odds

Can I compute a confidence interval for the odds ratios? I've seen other responses showing how I can get a CI for the fitted estimates, but not for ratios based on those estimates. Thanks.

Best Answer

As I've commented at Stack Overflow, you could bootstrap them:

library(boot)
set.seed(42)
bootres <- boot(Arthritis, function(DF, i) {
  fit <- glm(Y ~ Sex + Treatment, family=binomial, data=Arthritis[i,])
  df <- expand.grid(Sex=levels(Arthritis$Sex), 
                    Treatment=levels(Arthritis$Treatment)) %>%
    mutate(link=predict(fit, newdata=., type="link"),
           se=predict(fit, newdata=., type="link", se.fit=T)$se.fit,
           odds=exp(link))
  df$odds/filter(df, Sex=="Female" & Treatment=="Placebo")$odds

}, R = 1000, strata = interaction(Arthritis$Treatment, Arthritis$Sex))

apply(bootres$t, 2, quantile, probs = c(0.025, 0.5, 0.975))
#      [,1]       [,2]      [,3]      [,4]
#2.5%     1 0.06236299  2.397555 0.3620032
#50%      1 0.21591348  6.295761 1.4285296
#97.5%    1 0.65358623 20.641313 5.2671106

I show stratified bootstrap here, but you get very similar results for non-stratified bootstrap.

Related Solutions

Solved – Computing confidence intervals for count data

As Hans Engler suggested, a bootstrap should work well for these data. You can use the boot or bootstrap packages directly, but it’s much easier to use the simpleboot package:

s = c(13,7,10,4,28,0,10,0,0,0,0,0,0,0,0,0,6,
      0,0,0,0,0,0,0,4,0,0,0,4,0,0,0,1,2,2,0,
      2,3,3,3,1,3,12,33,1,31,0,1,21,0,3,1,8,
      0,1,1,6,0,2,0)

library(simpleboot)
b = one.boot(s, mean, R=10^4)
boot.ci(b, type="perc")

The output is:

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 10000 bootstrap replicates

CALL : 
boot.ci(boot.out = b, type = "perc")

Intervals : 
Level     Percentile     
95%   ( 2.10,  5.85 )  
Calculations and Intervals on Original Scale

As you can see (and as one might have expected), the confidence interval is not very different from the confidence interval from the t-test (1.87–5.69).

Solved – Calculate odds ratio confidence intervals from plink output

For the calculation of confidence intervals you'll need standard errors for the effects, but those are not available in the output. However, the standard errors can be estimated from the Wald statistics and odds ratios.

The calculation goes as follows:

Take a natural logarithm from the odds ratio. This gives you the beta from the logistic model. For example for the first row of your table: beta=ln(4.23)=1.442
The standard error for the beta is calculated by dividing the beta by the square root of the Walds statistic (STAT). Then take the absolute value of the result. Again, for the first row of your table: se=1.442/sqrt(61.5)=0.183.
The 95% confidence interval for the beta is then beta+/-1.96*se. The constant 1.96 comes from the normal distribution. Again, for the first row of data: 1.442-1.96*0.183 ... 1.442+1.96*0.183 = 1.081...1.802.
Last, you need to change the confidence interval of the beta to the confidence interval of the odds ratio. This happens simply by exponentiating the confidence interval of the beta. For the first line of data: 2.71828^1.081 = 2.949 and 2.71828^1.802 = 6.065.

So, your odds ratio for the first row of the table is 4.23 and it's 95% confidence interval is 2.949-6.065. Because the confidence interval does not include one, the results is statistically significant. The results are subject to error due to rounding of the output from PLINK.

This calculation can be achieved in, e.g., Excel, but below is also an R function that does the same thing (just in case you also use R).

# The data
or<-structure(list(SNP1 = structure(c(1L, 1L, 1L, 1L), .Label = "rs1", class = "factor"), 
SNP2 = structure(c(1L, 1L, 1L, 1L), .Label = "rs2", class = "factor"), 
HAPLOTYPE = c(22L, 12L, 21L, 11L), F = c(0.00992, 0.038, 
0.00015, 0.952), OR = c(4.23, 1.02, 5.22e-10, 0.762), STAT = c(61.5, 
0.217, 453, 22.9), P = c(4.43e-15, 0.642, 1.77e-100, 1.73e-06
)), .Names = c("SNP1", "SNP2", "HAPLOTYPE", "F", "OR", "STAT", 
"P"), class = "data.frame", row.names = c(NA, -4L))

# The function
orci<-function(or) {
   or$beta<-log(or$OR)
   or$se<-abs(or$beta/sqrt(or$STAT))
   or$lower<-or$beta-1.96*or$se
   or$upper<-or$beta+1.96*or$se
   or$LOWER<-exp(or$lower)
   or$UPPER<-exp(or$upper)
   or$res<-paste(or$OR, " (", round(or$LOWER, 3), "-", round(or$UPPER, 3), ")", sep="")
   return(or)
}

# The calculation
orci(or)

# The result
#SNP1 SNP2 HAPLOTYPE       F       OR    STAT         P         beta         se        lower       upper        LOWER        UPPER                  res
#1  rs1  rs2        22 0.00992 4.23e+00  61.500  4.43e-15   1.44220199 0.18390288   1.08175235   1.8026516 2.949844e+00 6.065710e+00    4.23 (2.95-6.066)
#2  rs1  rs2        12 0.03800 1.02e+00   0.217  6.42e-01   0.01980263 0.04251018  -0.06351733   0.1031226 9.384579e-01 1.108627e+00   1.02 (0.938-1.109)
#3  rs1  rs2        21 0.00015 5.22e-10 453.000 1.77e-100 -21.37335353 1.00420775 -23.34160072 -19.4051063 7.292419e-11 3.736538e-09 0.000000000522 (0-0)
#4  rs1  rs2        11 0.95200 7.62e-01  22.900  1.73e-06  -0.27180872 0.05679965  -0.38313603  -0.1604814 6.817202e-01 8.517337e-01  0.762 (0.682-0.852)

Best Answer

Related Solutions

Solved – Computing confidence intervals for count data

Solved – Calculate odds ratio confidence intervals from plink output

Related Question