I have a cohort of patients with different lengths of follow-up. So far I'm disregarding the time aspect and just need to model a binary outcome: disease/no disease. I usually do logistic regression in these studies, but a colleague of mine asked if Poisson regression would be just as appropriate. I'm not that familiar with Poisson regression and was left uncertain as to what its benefits and disadvantages would be in this setting compared to logistic regression. I read "Poisson regression to estimate relative risk for binary outcomes" and I am still uncertain as to the merits of Poisson regression in this situation.

# Poisson Distribution – Poisson VS Logistic Regression


#### Related Solutions

An answer to all four of your questions, preceded by a note:

It's not actually all that common for *modern* epidemiology studies to report an odds ratio from a logistic regression for a cohort study. It remains the regression technique of choice for case-control studies, but more sophisticated techniques are now the de facto standard for analysis in major epidemiology journals like *Epidemiology*, *AJE* or *IJE*. Odds ratios from logistic regression will have a greater tendency to show up in clinical journals reporting the results of observational studies. There are also going to be some problems because Poisson regression can be used in two contexts: what you're referring to, wherein it's a substitute for a binomial regression model, and a time-to-event context, which is extremely common for cohort studies. More details in the particular question answers:

1. For a cohort study, not really, no. There are some *extremely* specific cases where, say, a piecewise logistic model may have been used, but these are outliers. The whole *point* of a cohort study is that you can directly measure the relative risk, or many related measures, and don't have to rely on an odds ratio. I will, however, make two notes:

   - A Poisson regression often estimates a *rate*, not a risk, and thus the effect estimate from it will often be reported as a rate ratio (mainly, in my mind, so you can still abbreviate it RR) or an incidence density ratio (IRR or IDR). So make sure in your search you're actually looking for the right terms.
   - There are many cohort studies using survival analysis methods. For these studies, Poisson regression makes some assumptions that are problematic, notably that the hazard is constant. As such, it is much more common to analyze a cohort study using Cox proportional hazards models, rather than Poisson models, and report the ensuing hazard ratio (HR). If pressed to name a "default" method with which to analyze a cohort, I'd say epidemiology is actually dominated by the Cox model. This has its own problems, and some very good epidemiologists would like to change it, but there it is.

2. There are two things I *might* attribute the infrequency to, an infrequency I don't necessarily think exists to the extent you suggest. One is that yes, "epidemiology" as a field isn't exactly closed, and you get huge numbers of papers from clinicians, social scientists, etc., as well as epidemiologists of varying statistical backgrounds. The logistic model is commonly taught, and in my experience many researchers will turn to the familiar tool over the better tool.

   The second is actually a question of what you mean by a "cohort" study. Something like the Cox model, or a Poisson model, needs an actual estimate of person-time. It's possible to get a cohort study that follows a somewhat closed population for a particular period, especially in early "Intro to Epi" examples, where survival methods like Poisson or Cox models aren't so useful. The logistic model *can* be used to estimate an odds ratio that, with sufficiently low disease prevalence, approximates a relative risk. Other regression techniques that directly estimate it, like binomial regression, have convergence issues that can easily derail a new student. Keep in mind the Zou papers you cite are both using a Poisson regression technique to get around the convergence issues of binomial regression. But binomial-appropriate cohort studies are actually a small slice of the "cohort study pie".

3. Yes. Frankly, survival analysis methods should come up earlier than they often do. My pet theory is that the reason this isn't so is that methods like logistic regression are easier to *code*. Techniques that are easier to code, but come with much larger caveats about the validity of their effect estimates, are taught as the "basic" standard, which is a problem.

4. You should be encouraging students and colleagues to use the appropriate tool. Generally for the field, I think you'd probably be better off suggesting a consideration of the Cox model over a Poisson regression, as most reviewers would (and should) swiftly bring up concerns about the assumption of a constant hazard. But yes, the sooner you can get them away from "How do I shoehorn my question into a logistic regression model?" the better off we'll all be. And if you're looking at a study without time, students should be introduced to both binomial regression and alternative approaches, like Poisson regression, which can be used in case of convergence problems.
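To make the point about odds ratios approximating relative risks concrete, here is a small arithmetic sketch. The 2x2 counts are made up for illustration (two hypothetical closed cohorts of 1000 exposed and 1000 unexposed people with no person-time involved), and the helper names `risk_ratio` and `odds_ratio` are not from any library.

```python
def risk_ratio(a, b, c, d):
    """Relative risk from a 2x2 table.

    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases.
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds ratio from the same 2x2 table (what logistic regression targets)."""
    return (a * d) / (b * c)


# Hypothetical counts, chosen so the true relative risk is 2 in both cohorts.
rare = (20, 980, 10, 990)      # risks of 2% vs 1%
common = (400, 600, 200, 800)  # risks of 40% vs 20%

for label, table in (("rare outcome", rare), ("common outcome", common)):
    rr = risk_ratio(*table)
    orr = odds_ratio(*table)
    print(f"{label}: RR = {rr:.2f}, OR = {orr:.2f}")
```

With the rare outcome the odds ratio (about 2.02) is close to the relative risk of 2; with the common outcome it drifts to about 2.67, which is why reporting an odds ratio as if it were a relative risk becomes misleading as the outcome gets more common.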

## Best Answer

One solution to this problem is to assume that the number of events (like flare-ups) is proportional to time. If you denote the individual level of exposure (length of follow-up in your case) by $t$, then $\frac{E[y \vert x]}{t}=\exp\{x'\beta\}.$ Here a follow-up that is twice as long would double the expected count, all else equal. This is algebraically equivalent to a model where $E[y \vert x]=\exp\{x'\beta+\log{t}\},$ which is just the Poisson model with the coefficient on $\log t$ constrained to $1$. You can also test the proportionality assumption by relaxing the constraint and testing the hypothesis that $\beta_{\log t}=1$.
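The offset model and the proportionality check can be sketched as follows. This is a minimal illustration, not a recommended implementation: the data are simulated, the coefficients are invented, and `fit_poisson` is a hypothetical hand-written Newton-Raphson fitter using only NumPy (in practice a GLM routine from a statistics package would normally be used). It assumes the first column of the design matrix is an intercept.

```python
import numpy as np


def fit_poisson(X, y, offset=None, n_iter=50, tol=1e-10):
    """Fit E[y|x] = exp(X @ beta + offset) by Newton-Raphson.

    Assumes column 0 of X is an intercept. Returns (beta, standard errors).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    offset = np.zeros(len(y)) if offset is None else np.asarray(offset, dtype=float)
    beta = np.zeros(X.shape[1])
    # Start the intercept at the log of the crude rate for numerical stability.
    beta[0] = np.log(y.sum() / np.exp(offset).sum())
    for _ in range(n_iter):
        mu = np.exp(X @ beta + offset)
        grad = X.T @ (y - mu)              # score vector
        info = X.T @ (X * mu[:, None])     # Fisher information
        step = np.linalg.solve(info, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    mu = np.exp(X @ beta + offset)
    cov = np.linalg.inv(X.T @ (X * mu[:, None]))
    return beta, np.sqrt(np.diag(cov))


# Simulated cohort (all values are assumptions for illustration).
rng = np.random.default_rng(0)
n = 5000
x = rng.integers(0, 2, size=n)           # binary exposure
t = rng.uniform(0.5, 5.0, size=n)        # follow-up time
true_beta = np.array([-1.5, 0.7])
y = rng.poisson(t * np.exp(true_beta[0] + true_beta[1] * x))

ones = np.ones(n)

# Constrained model: log(t) enters as an offset (coefficient fixed at 1).
beta_c, se_c = fit_poisson(np.column_stack([ones, x]), y, offset=np.log(t))

# Relaxed model: log(t) is an ordinary covariate with a free coefficient.
beta_r, se_r = fit_poisson(np.column_stack([ones, x, np.log(t)]), y)

# Wald z-statistic for the hypothesis that the coefficient on log(t) equals 1.
z = (beta_r[2] - 1.0) / se_r[2]

print("rate ratio (offset model):", np.exp(beta_c[1]))
print("coefficient on log(t):", beta_r[2], "z vs 1:", z)
```

In the constrained fit, `np.exp(beta_c[1])` is the rate ratio for the exposure. In the relaxed fit, a coefficient on $\log t$ far from $1$ relative to its standard error would cast doubt on the assumption that expected counts scale proportionally with follow-up time.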

However, it does not sound like you observe the number of events, since your outcome is binary (or maybe a count is not meaningful given your disease). This leads me to believe a logistic model with a logarithmic offset would be more appropriate here.