Solved – Incidence rate ratios (Stata vs R)

epidemiologyincidence-rate-ratiorstata

I asked this question on Stackoverflow:
https://stackoverflow.com/questions/8142118/incidence-rate-ratios-in-r-poisson-regression
and was advised to post here instead.
I have data that looks like this

   sex agecat  cases population

1 male    0-4  12     126526
2 male    5-9  12     128375
3 female  0-4  11     129280
4 male    10-14 4     127910
5 female  0-4  13     127158
6 male    0-4  8      125125

I want to duplicate the output I get in stata with this command

poisson cases i.agecat, exp(pop) irr

which gives output such as:

   cases |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------+----------------------------------------------------------------
  agecat |
      2  |   .5125755   .0530442    -6.18   0.000     .4578054    .6669639
      3  |    .323456   .0381304    -9.60   0.000     .2665044    .4172274
population | (exposure)

in R with a command such as

glm(cases~agecat, family = poisson(link = "log")

I know I need to exponentiate the coefficients and confidence intervals, but I think I also need some kind of offset so so that the intercept is zero; and adjust for per unit population vs baseline.

Can anyone help/advise ?

Thanks
EDIT: The question on SO has been answered, but I posted more detail here. In particular, I think the issue has to do with adjusting for population size in stata with exp(pop) – and how to replicate this in R.

Best Answer

The Stata option exp(pop) includes log(pop) as an offset term in the linear predictor, so the R equivalent should be offset=log(pop).