Exact Confidence Interval for Relative Risk – Calculation Guide

confidence intervalepidemiologyrelative-risk

I am working on some MRSA data and need to calculate the relative risk of a group of hospitals compared with the remaining hospital.

My colleagues throws me an excel with a formula inside to calculate the "exact confidence interval of relative risk", I can do the calculation without difficulties, but I have no idea on how and why this formula is used for do such calculation.

I have attached the excel file here for your reference.

Can anyone show me a reference on the rationale of the calculation? Article from textbooks will be fine to me. Thanks!

Best Answer

Check out the R Epi and epitools packages, which include many functions for computing exact and approximate CIs/p-values for various measures of association found in epidemiological studies, including relative risk (RR). I know there is also PropCIs, but I never tried it. Bootstraping is also an option, but generally these are exact or approximated CIs that are provided in epidemiological papers, although most of the explanatory studies rely on GLM, and thus make use of odds-ratio (OR) instead of RR (although, wrongly it is often the RR that is interpreted because it is easier to understand, but this is another story).

You can also check your results with online calculator, like on statpages.org, or Relative Risk and Risk Difference Confidence Intervals. The latter explains how computations are done.

By "exact" tests, we generally mean tests/CIs not relying on an asymptotic distribution, like the chi-square or standard normal; e.g. in the case of an RR, an 95% CI may be approximated as $\exp\left[ \log(\text{rr}) - 1.96\sqrt{\text{Var}\big(\log(\text{rr})\big)} \right], \exp\left[ \log(\text{rr}) + 1.96\sqrt{\text{Var}\big(\log(\text{rr})\big)} \right]$, where $\text{Var}\big(\log(\text{rr})\big)=1/a - 1/(a+b) + 1/c - 1/(c+d)$ (assuming a 2-way cross-classification table, with $a$, $b$, $c$, and $d$ denoting cell frequencies). The explanations given by @Keith are, however, very insightful.

For more details on the calculation of CIs in epidemiology, I would suggest to look at Rothman and Greenland's textbook, Modern Epidemiology (now in it's 3rd edition), Statistical Methods for Rates and Proportions, from Fleiss et al., or Statistical analyses of the relative risk, from J.J. Gart (1979).

You will generally get similar results with fisher.test(), as pointed by @gd047, although in this case this function will provide you with a 95% CI for the odds-ratio (which in the case of a disease with low prevalence will be very close to the RR).

Notes:

  1. I didn't check your Excel file, for the reason advocated by @csgillespie.
  2. Michael E Dewey provides an interesting summary of confidence intervals for risk ratios, from a digest of posts on the R mailing-list.
Related Question