R – Calculating Risk Ratio with CI from Odds Ratios

logistic, r, regression, relative-risk

I performed a multiple logistic regression to assess the association between death and cardiovascular disease, adjusting for age, sex, and risk factors.
The results are reported as odds ratios with confidence intervals.

How do I get risk ratios with confidence intervals instead?

If I just take the ratio of risks between exposed and non-exposed, then I no longer adjust for age, sex, and risk factors, and I also don't get a confidence interval.

Please help with code and method.
I use R.

Best Answer

This is fairly easily done with a marginal effect in R.

However, we should warn you that the design of the study is incredibly important. If the design is case-control (where cases are purposefully oversampled), then the relative risk calculation is biased, precisely because the frequency of cases was fixed by design rather than reflecting the population.

Let's assume you ran a cohort study and hence relative risks are indeed allowable. The first thing to understand is that logistic regression can predict the risk conditional on age and sex. Let age be represented by $x$ and sex by $w$. Your model is

$$ p(x, w) = \dfrac{1}{1 + \exp(-(\beta_0 + \beta_1x + \beta_2w))} $$

Here, $\beta_1$ and $\beta_2$ are log odds ratios, and $\beta_0$ is the log odds for the reference patient. So for someone who is $\delta$ years older than some reference patient, the relative risk is

$$ \dfrac{p(x+\delta, w)}{p(x, w)} $$
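
To make the dependence on the starting risk explicit, here is a short derivation. It is a sketch under the model above; the symbols $p_0$ (the reference patient's risk) and $\mathrm{OR}$ (the odds ratio for a $\delta$-year increase) are shorthand introduced here, not notation from the question.

$$ p_0 = p(x, w), \qquad \mathrm{OR} = \exp(\beta_1 \delta), \qquad \dfrac{p(x+\delta, w)}{1 - p(x+\delta, w)} = \mathrm{OR} \cdot \dfrac{p_0}{1 - p_0} $$

$$ \dfrac{p(x+\delta, w)}{p(x, w)} = \dfrac{\mathrm{OR}}{1 - p_0 + p_0\,\mathrm{OR}} $$

The odds ratio is the same for every patient, but the relative risk moves toward 1 as $p_0$ grows and is close to the odds ratio only when $p_0$ is small.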

Note that the relative risk is then going to depend on the denominator, so there is no single relative risk: there is a distribution of them, one for each combination of age and sex. However, we can calculate an average marginal effect and report a confidence interval for that. Here is how to do that in R (you will need to install the {marginaleffects} package).

library(tidyverse)
library(marginaleffects)
#> Warning: package 'marginaleffects' was built under R version 4.2.2
set.seed(0)
N <- 250
# Imagine a rescaled age variable
age <- rnorm(N, 0, 1)
sex <- rbinom(N, 1, 0.49)

p <- plogis(-2 + 0.2*age + 0.1*sex)
y <- rbinom(N, 1, p)

# Your model
fit <- glm(y ~ age + sex, family = binomial())


avg_comparisons(
  model = fit,
  # This next part computes the relative risk;
  # without it, the risk difference is returned.
  # (Newer marginaleffects releases name these arguments
  # `comparison` and `transform`.)
  transform_pre = 'lnratioavg',
  transform_post = exp
)
#> 
#>  Term              Contrast Estimate Pr(>|z|) 2.5 % 97.5 %
#>   age mean(+1)                  1.22    0.166 0.920   1.62
#>   sex ln(mean(1) / mean(0))     1.24    0.461 0.701   2.19
#> 
#> Prediction type:  response 
#> Columns: type, term, contrast, estimate, p.value, conf.low, conf.high, predicted, predicted_hi, predicted_lo

Created on 2023-02-24 by the reprex package (v2.0.1)

It is worth breaking this down. The output says that when age increases by 1 unit (here 1 standard deviation, since I've used a rescaled age with mean 0 and standard deviation 1), the relative risk is 1.22 (a 22% increase in risk). The 95% CI is also provided. However, that is the AVERAGE. Remember, the relative risk depends on your age and your sex.
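
To see what an average relative risk means mechanically, here is a minimal base R sketch that needs no extra packages. It refits the same simulated model, then takes the ratio of average predicted risks under two scenarios. The names `dat`, `avg_rr`, `boot_rr`, and `ci_sex` are hypothetical helpers introduced for illustration. Two assumptions to flag: the age contrast is taken as a forward shift (age + 1 versus observed age), which may not match how every {marginaleffects} version defines `+1`, and the interval comes from a percentile bootstrap, which is a different technique from the package's default standard errors, so the numbers need not agree exactly with the output above.

```r
# Same simulated data as in the answer, stored in a data frame
set.seed(0)
N <- 250
age <- rnorm(N, 0, 1)            # rescaled age
sex <- rbinom(N, 1, 0.49)
y <- rbinom(N, 1, plogis(-2 + 0.2 * age + 0.1 * sex))
dat <- data.frame(y, age, sex)

fit <- glm(y ~ age + sex, family = binomial(), data = dat)

# Hypothetical helper: ratio of the average predicted risks
# under a "high" and a "low" counterfactual data set
avg_rr <- function(model, data_hi, data_lo) {
  mean(predict(model, newdata = data_hi, type = "response")) /
    mean(predict(model, newdata = data_lo, type = "response"))
}

# Everyone set to sex = 1 versus everyone set to sex = 0
rr_sex <- avg_rr(fit, transform(dat, sex = 1), transform(dat, sex = 0))

# Everyone one unit older versus their observed age (assumed forward contrast)
rr_age <- avg_rr(fit, transform(dat, age = age + 1), dat)

# Percentile bootstrap for the sex relative risk (an assumption of this
# sketch, not the method used by avg_comparisons)
boot_rr <- replicate(500, {
  d <- dat[sample(nrow(dat), replace = TRUE), ]
  f <- glm(y ~ age + sex, family = binomial(), data = d)
  avg_rr(f, transform(d, sex = 1), transform(d, sex = 0))
})
ci_sex <- quantile(boot_rr, c(0.025, 0.975))

print(c(rr_sex = rr_sex, rr_age = rr_age))
print(ci_sex)
```

Because the adjustment variables stay at their observed values in both scenarios, the ratio remains adjusted for them, which is the property the question was worried about losing.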

Here is a histogram of estimated relative risks for each patient in these data.

comparisons(
  fit,
  variables = 'age',
  transform_pre = 'lnratio',
  transform_post = exp
) %>% 
  as.data.frame() %>% 
  ggplot(aes(estimate)) + 
  geom_histogram()

[Figure: histogram of the estimated relative risk for a one-unit increase in age, one value per patient]

We can also estimate the relative risk conditional on age and sex at the same time, like this:

comparisons(
  fit,
  newdata = datagrid(age = seq(-3, 3, 0.1), sex = 0:1),
  variables = 'age',
  transform_pre = 'lnratio',
  transform_post = exp
) %>% 
  as.data.frame() %>% 
  ggplot(aes(age, estimate, color=factor(sex))) + 
  geom_line()

So given a patient's age and sex, you can report the relative risk associated with (in this case) a one-unit increase in their age, which is one standard deviation on the rescaled variable used here.

[Figure: estimated relative risk for a one-unit increase in age, plotted against age, with one line per sex]
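
For a single patient profile, the conditional relative risk can also be computed directly from the coefficients. The sketch below is illustrative only: `rr_at` is a hypothetical helper, and `beta` holds the true values used to simulate the data above (-2, 0.2, 0.1) rather than fitted estimates. In practice you would pass `coef(fit)` instead.

```r
# Hypothetical helper: relative risk for a `delta`-unit increase in age
# for one patient profile, given logistic regression coefficients
rr_at <- function(beta, age, sex, delta = 1) {
  lp <- beta[["(Intercept)"]] + beta[["age"]] * age + beta[["sex"]] * sex
  plogis(lp + beta[["age"]] * delta) / plogis(lp)
}

# Illustrative coefficients: the values used in the simulation
beta <- c("(Intercept)" = -2, age = 0.2, sex = 0.1)

rr_low  <- rr_at(beta, age = -2, sex = 0)  # low baseline risk
rr_ref  <- rr_at(beta, age =  0, sex = 0)  # reference patient
rr_high <- rr_at(beta, age =  2, sex = 1)  # higher baseline risk
or_age  <- exp(beta[["age"]])              # odds ratio, same for everyone

print(c(rr_low = rr_low, rr_ref = rr_ref, rr_high = rr_high, or_age = or_age))
```

The relative risk is largest for the low-risk profile and always stays between 1 and the odds ratio, which is why a single number can only summarise it as an average.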

It's worth repeating that the validity of these approaches depends almost entirely on the design of the study. All of this is useless if the study is case-control, since the baseline risk of the outcome is biased by the sampling.
