Relative Risk – How to Calculate Confidence Interval for Relative Risk

confidence-interval relative-risk

I am using the epitools package in R to calculate the confidence interval for a relative risk.

http://bm2.genes.nig.ac.jp/RGM2/R_current/library/epitools/man/riskratio.html

There are three methods inside for calculations: namely Wald, Small and Boot.

I would like to find an article describing the three methods, but I can't find any. Can anyone help? Thanks!

Best Answer

The three options offered by riskratio() correspond to an asymptotic (large-sample) approach, an approximation for small samples, and a resampling approach (asymptotic bootstrap, i.e. not based on percentile or bias-corrected intervals). The first is described in Rothman's book (as referenced in the online help), chap. 14, pp. 241-244. The last is relatively trivial, so I will skip it. The small-sample approach is just an adjustment to the calculation of the estimated relative risk.

If we consider the following table of counts for subjects cross-classified according to their exposure and disease status,

          Exposed  Non-exposed  Total
Cases          a1           a0     m1
Non-case       b1           b0     m0
Total          n1           n0      N

the MLE of the risk ratio (RR), $\text{RR}=R_1/R_0$, is $\text{RR}=\frac{a_1/n_1}{a_0/n_0}$. In the large sample approach, a score statistic (for testing $R_1=R_0$, or equivalently, $\text{RR}=1$) is used, $\chi_S=\frac{a_1-\tilde a_1}{V^{1/2}}$, where the numerator reflects the difference between the observed and expected counts for exposed cases, $\tilde a_1=m_1n_1/N$ is the expected count under the null hypothesis, and $V=(m_1n_1m_0n_0)/(N^2(N-1))$ is the variance of $a_1$. That is all we need for computing the $p$-value, because $\chi_S$ asymptotically follows a standard normal distribution under the null (equivalently, $\chi_S^2$ follows a chi-square distribution with one degree of freedom). In fact, the three $p$-values (mid-$p$, Fisher exact test, and $\chi^2$-test) that are returned by riskratio() are computed in the tab2by2.test() function. For more information on mid-$p$, you can refer to

Berry and Armitage (1995). Mid-P confidence intervals: a brief review. The Statistician, 44(4), 417-423.
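
As a rough illustration of the score statistic described above, here is a minimal base-R sketch. The function name `score_stat` and its argument names are hypothetical (they are not part of epitools), and the $p$-value it returns is not guaranteed to match the one printed by riskratio(), which comes from tab2by2.test().

```r
# Score statistic for H0: RR = 1 in a 2x2 table (base R only, illustrative).
# a1, a0: exposed / non-exposed cases; b1, b0: exposed / non-exposed non-cases.
score_stat <- function(a1, a0, b1, b0) {
  n1 <- a1 + b1; n0 <- a0 + b0          # column totals (exposed, non-exposed)
  m1 <- a1 + a0; m0 <- b1 + b0          # row totals (cases, non-cases)
  N  <- n1 + n0                         # grand total
  expected <- m1 * n1 / N               # expected exposed cases under H0
  V <- (m1 * n1 * m0 * n0) / (N^2 * (N - 1))  # variance of a1 under H0
  chi_s <- (a1 - expected) / sqrt(V)
  # Two-sided p-value: squared statistic referred to a chi-square with 1 df
  p <- pchisq(chi_s^2, df = 1, lower.tail = FALSE)
  c(expected = expected, V = V, chi_s = chi_s, p_two_sided = p)
}

# Counts from the Rothman example used later in this answer
res <- score_stat(a1 = 12, a0 = 7, b1 = 2, b0 = 9)
print(round(res, 3))
```

With the example counts, the statistic comes out at about 2.34, in line with the hand calculation further down.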

Now, for computing the $100(1-\alpha)\%$ CIs, this asymptotic approach yields an approximate SD estimate for $\ln(\text{RR})$ of $(\frac{1}{a_1}-\frac{1}{n_1}+\frac{1}{a_0}-\frac{1}{n_0})^{1/2}$, and the Wald limits are found to be $\exp\big(\ln(\text{RR})\pm Z_c \text{SD}(\ln(\text{RR}))\big)$, where $Z_c$ is the corresponding quantile of the standard normal distribution.
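
The Wald limits can be reproduced in a few lines of base R. This is only a sketch: `wald_rr_ci` and its arguments are names made up for illustration, not the internals of riskratio().

```r
# Wald confidence limits for the risk ratio, computed on the log scale.
# a1, a0: exposed / non-exposed cases; n1, n0: exposed / non-exposed totals.
wald_rr_ci <- function(a1, n1, a0, n0, conf.level = 0.95) {
  rr     <- (a1 / n1) / (a0 / n0)                      # MLE of the risk ratio
  sd_log <- sqrt(1 / a1 - 1 / n1 + 1 / a0 - 1 / n0)    # approximate SD of ln(RR)
  z      <- qnorm(1 - (1 - conf.level) / 2)            # standard normal quantile
  c(estimate = rr,
    lower    = exp(log(rr) - z * sd_log),
    upper    = exp(log(rr) + z * sd_log))
}

# Counts from the Rothman example used later in this answer
ci95 <- wald_rr_ci(a1 = 12, n1 = 14, a0 = 7, n0 = 16)
ci90 <- wald_rr_ci(a1 = 12, n1 = 14, a0 = 7, n0 = 16, conf.level = 0.90)
print(round(ci95, 3))
print(round(ci90, 3))
```

Changing `conf.level` only changes the quantile $Z_c$; the point estimate and the SD on the log scale stay the same.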

The small sample approach makes use of an adjusted RR estimator: we just replace the denominator $a_0/n_0$ by $(a_0+1)/(n_0+1)$.
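
A sketch of the adjusted point estimate, exactly as described in the sentence above. The helper name `rr_small` is hypothetical, and only the estimator is shown; the interval that riskratio() attaches to it is not reproduced here.

```r
# Small-sample adjusted risk ratio: the denominator a0/n0 is replaced
# by (a0 + 1)/(n0 + 1). Illustrative helper, not the epitools source.
rr_small <- function(a1, n1, a0, n0) {
  (a1 / n1) / ((a0 + 1) / (n0 + 1))
}

# Compare unadjusted and adjusted estimates on the Rothman example counts
rr_mle <- (12 / 14) / (7 / 16)
rr_adj <- rr_small(a1 = 12, n1 = 14, a0 = 7, n0 = 16)
print(c(mle = rr_mle, small_sample = rr_adj))
```

The adjustment shrinks the estimate slightly, and the effect fades as the non-exposed counts grow.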

As to how to decide whether we should rely on the large or small sample approach, it is mainly by checking expected cell frequencies; for the $\chi_S$ to be valid, $\tilde a_1$, $m_1-\tilde a_1$, $n_1-\tilde a_1$ and $m_0-n_1+\tilde a_1$ should be $> 5$.
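
The rule of thumb above is easy to check programmatically. The following base-R sketch uses a hypothetical helper, `expected_counts_ok`, and takes the threshold of 5 from the text as a default.

```r
# Check whether all four expected cell counts exceed a threshold (default 5).
# a1, a0: exposed / non-exposed cases; b1, b0: exposed / non-exposed non-cases.
expected_counts_ok <- function(a1, a0, b1, b0, threshold = 5) {
  n1 <- a1 + b1; n0 <- a0 + b0      # column totals
  m1 <- a1 + a0; m0 <- b1 + b0      # row totals
  N  <- n1 + n0
  e_a1 <- m1 * n1 / N               # expected exposed cases under H0
  expected <- c(a1 = e_a1,
                a0 = m1 - e_a1,
                b1 = n1 - e_a1,
                b0 = m0 - n1 + e_a1)
  list(expected = expected, ok = all(expected > threshold))
}

# Counts from the Rothman example used later in this answer
chk <- expected_counts_ok(a1 = 12, a0 = 7, b1 = 2, b0 = 9)
print(round(chk$expected, 2))
print(chk$ok)
```

If `ok` is FALSE, that would be a hint to prefer the small-sample adjustment or an exact method over the large-sample statistic.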

Working through the example of Rothman (p. 243),

library(epitools)  # provides riskratio()
sel <- matrix(c(2,9,12,7), 2, 2)
riskratio(sel, rev="row")

which yields

$data
          Outcome
Predictor  Disease1 Disease2 Total
  Exposed2        9        7    16
  Exposed1        2       12    14
  Total          11       19    30

$measure
          risk ratio with 95% C.I.
Predictor  estimate    lower    upper
  Exposed2 1.000000       NA       NA
  Exposed1 1.959184 1.080254 3.553240

$p.value
          two-sided
Predictor  midp.exact fisher.exact chi.square
  Exposed2         NA           NA         NA
  Exposed1 0.02332167   0.02588706 0.01733469

$correction
[1] FALSE

attr(,"method")
[1] "Unconditional MLE & normal approximation (Wald) CI"

By hand, we would get $\text{RR} = (12/14)/(7/16)=1.96$, $\tilde a_1 = 19\times 14 / 30= 8.87$, $V = (8.87\times 11\times 16)/ \big(30\times (30-1)\big)= 1.79$, $\chi_S = (12-8.87)/\sqrt{1.79}= 2.34$, $\text{SD}(\ln(\text{RR})) = \left( 1/12-1/14+1/7-1/16 \right)^{1/2}=0.304$, $90\% \text{ CI} = \exp\big(\ln(1.96)\pm 1.645\times0.304\big)=[1.2;3.2]\quad \text{(rounded)}$. Note that 1.645 is the quantile for 90% limits; using 1.96 instead gives the 95% limits $[1.08;3.55]$ reported by riskratio() above.

The following papers also address the construction of the test statistic for the RR or the OR:

  1. Miettinen and Nurminen (1985). Comparative analysis of two rates. Statistics in Medicine, 4: 213-226.
  2. Becker (1989). A comparison of maximum likelihood and Jewell's estimators of the odds ratio and relative risk in single 2 × 2 tables. Statistics in Medicine, 8(8): 987-996.
  3. Tian, Tang, Ng, and Chan (2008). Confidence intervals for the risk ratio under inverse sampling. Statistics in Medicine, 27(17), 3301-3324.
  4. Walter and Cook (1991). A comparison of several point estimators of the odds ratio in a single 2 x 2 contingency table. Biometrics, 47(3): 795-811.

Notes

  1. As far as I know, there's no reference to relative risk in Selvin's book (also referenced in the online help).
  2. Alan Agresti also has some code for relative risk.