Solved – Poisson regression vs log-linear model

log-linear, poisson-regression

I'm confused about the difference between a log-linear model and Poisson regression, and I'm not sure which one to use to answer my research question.

In the experiment, participants were grouped by age (young/old), treatment (treatment/no treatment), and ethnicity (white/non-white). Researchers were to collect questionnaires every 2 weeks. It turned out that fewer questionnaires were collected than originally planned. I would like to know whether the missingness is related to ethnicity, treatment, changes in protocol, or interactions between these factors.

My instinct would be to use a Poisson regression with the number of questionnaires as the outcome, and ethnicity, treatment, and protocol change as the predictors. However, my mentor told me to use a log-linear model to examine the association between these factors. My understanding is that a log-linear model examines expected cell counts in an n-way contingency table. Can a log-linear model answer this question? If yes, does that mean I have to look at the interaction between the number of questionnaires and ethnicity, treatment, or protocol change in a 4-way table?

When searching online, I also found that some people use the terms log-linear model and Poisson regression interchangeably. Are they actually the same thing under certain circumstances?

Thanks!

Best Answer

A log-linear model is indeed used to analyze contingency tables: the mean cell count is modeled through a log link, i.e. as a product of factors. These factors can be your margin totals, but also your regressors, such as ethnicity.
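To make the "product of factors" concrete, here is the usual way of writing the model for a two-way table. The notation is an assumption introduced for illustration: $\mu_{ij}$ is the expected count in cell $(i, j)$ of factors $A$ and $B$, and the $\lambda$ terms are the model parameters.

$$
\log \mu_{ij} = \lambda + \lambda_i^{A} + \lambda_j^{B} + \lambda_{ij}^{AB}
\quad\Longleftrightarrow\quad
\mu_{ij} = e^{\lambda}\, e^{\lambda_i^{A}}\, e^{\lambda_j^{B}}\, e^{\lambda_{ij}^{AB}}
$$

Dropping the interaction term $\lambda_{ij}^{AB}$ gives the independence model, so a test of that term is a test of association between $A$ and $B$.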

However, in order to fit a log-linear model you need to specify an error distribution, usually Poisson. Log-linear analysis is thus simply Poisson regression applied to contingency tables.
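As a rough illustration of that equivalence, the sketch below fits a Poisson regression with main effects only to the cells of a 2x2 table and compares the fitted counts with the textbook expected counts under independence (row total times column total over N). The counts are made up for illustration and are not from the study, and `solve` and `fit_poisson` are hypothetical helper names written here in plain Python; in practice you would use a GLM routine from a statistics package.

```python
import math

# Hypothetical 2x2 table (made-up counts, not from the study):
# (treated, missing, count)
cells = [
    (0, 0, 40),
    (0, 1, 20),
    (1, 0, 30),
    (1, 1, 30),
]

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_poisson(X, y, iterations=25):
    """Fit a Poisson GLM with log link by Newton-Raphson."""
    p = len(X[0])
    # Start from an intercept-only guess: log of the mean count.
    beta = [math.log(sum(y) / len(y))] + [0.0] * (p - 1)
    for _ in range(iterations):
        mu = [math.exp(sum(b * x for b, x in zip(beta, row))) for row in X]
        # Score vector X'(y - mu) and information matrix X' diag(mu) X.
        score = [sum(row[j] * (yi - mi) for row, yi, mi in zip(X, y, mu))
                 for j in range(p)]
        info = [[sum(row[j] * row[k] * mi for row, mi in zip(X, mu))
                 for k in range(p)] for j in range(p)]
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

# Main-effects-only design: intercept, treated, missing (no interaction).
X = [[1.0, float(t), float(m)] for t, m, _ in cells]
y = [float(c) for _, _, c in cells]
beta = fit_poisson(X, y)
fitted = [math.exp(sum(b * x for b, x in zip(beta, row))) for row in X]

# Classical expected counts under independence: row total * column total / N.
n = sum(y)
row_tot = {t: sum(c for tt, _, c in cells if tt == t) for t in (0, 1)}
col_tot = {m: sum(c for _, mm, c in cells if mm == m) for m in (0, 1)}
expected = [row_tot[t] * col_tot[m] / n for t, m, _ in cells]

# Deviance (G^2) of the independence model; it tests the interaction term.
g2 = 2 * sum(yi * math.log(yi / fi) for yi, fi in zip(y, fitted))

for (t, m, c), f, e in zip(cells, fitted, expected):
    print(f"treated={t} missing={m} observed={c} poisson_fit={f:.3f} independence={e:.3f}")
print(f"G^2 = {g2:.3f}")
```

The two columns of fitted values agree, which is the sense in which a log-linear analysis of a table is a Poisson regression on its cell counts. Adding the treated-by-missing interaction would make the model saturated and reproduce the observed counts exactly.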

Regarding your analysis, I would prefer a logistic regression model that predicts, for every questionnaire, the probability of it being returned. You can use your regressors as intended (as fixed effects) and add a random effect for participant. The random effect accounts for heterogeneity in your population and models the dependency between questionnaires collected from the same individual. This approach lets you take time-varying variables, such as a change of protocol, into account. It zooms in on every single questionnaire instead of their sum. If this approach is impossible because you only have the totals, then you can go ahead with Poisson regression as well.
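The difference between the two data layouts can be sketched as follows. All records and column names here are hypothetical, chosen only to illustrate the idea. The long format (one row per scheduled questionnaire) is what the logistic model with a participant random effect would take as input; the aggregated format (one row per participant) is what is left if only totals are available.

```python
from collections import defaultdict

# Hypothetical long-format records: one row per scheduled questionnaire.
# (participant, visit, treated, white, protocol_changed, returned)
records = [
    ("p1", 1, 1, 1, 0, 1),
    ("p1", 2, 1, 1, 0, 1),
    ("p1", 3, 1, 1, 1, 0),
    ("p2", 1, 0, 0, 0, 1),
    ("p2", 2, 0, 0, 0, 0),
    ("p2", 3, 0, 0, 1, 0),
]

# Aggregate to one row per participant, as a Poisson model on totals would need.
totals = defaultdict(lambda: {"scheduled": 0, "returned": 0})
for pid, visit, treated, white, protocol_changed, returned in records:
    totals[pid]["scheduled"] += 1
    totals[pid]["returned"] += returned
totals = dict(totals)

for pid, row in totals.items():
    print(pid, row)
```

Note that `protocol_changed` varies within a participant, so it cannot be carried over to the aggregated rows as a single value; this is the information the per-questionnaire model keeps. If you do model the totals and participants had different numbers of scheduled questionnaires, one common choice (an assumption on my part, not something stated above) is to include the log of `scheduled` as an offset.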