I am looking at the relationship between housing characteristics and a health outcome. To keep the example simple, I have a continuous predictor E (exposure) measured in 1000 homes, and a binary health outcome S for the 2000 people (1000 couples) living in those homes, one couple per home. I would like to look at the relationship between S and E using binary logistic regression. Apart from sharing the same exposure, there is no mechanistic reason to believe that the status of partner 1 in a couple can affect the status of partner 2, e.g. it's not a transmissible disease.
Can I do an ordinary logistic regression? Or must I take into account the fact that people are clustered within homes? If so, why? What syntax would be appropriate in Stata: xtlogit with i(house), or some kind of xtmixed?
Many thanks
Best Answer
To me, this sounds like a (more or less typical) dyadic data set, and I would definitely account for the dyadic dependencies (i.e. at the household level) via multilevel or structural equation modeling.
David Kenny maintains a great website on dyadic analysis. He is also a co-author of a book on dyadic data analysis that is highly recommended.
Since you seem to use Stata, I would use the xtmelogit command (see here for more information).
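As a rough sketch of what the syntax could look like, assuming a long-format dataset with one row per person and the variable names S, E and house from the question (the layout is my assumption, not something stated in the question). Note that xtmixed fits linear mixed models for continuous outcomes, so it is probably not the natural choice for a binary S. Also, from Stata 13 onward xtmelogit was renamed meqrlogit, and melogit is the newer command for the same kind of model.

```stata
* Assumed layout: one row per person, with
*   S     = binary health outcome (0/1)
*   E     = continuous exposure, identical for both partners in a home
*   house = home identifier (two rows per home)

* Ordinary logistic regression, but with standard errors that allow
* the two partners within a home to be correlated
logit S E, vce(cluster house)

* Random-intercept logistic regression, one random intercept per home
xtlogit S E, i(house) re

* The same random-intercept model written in multilevel syntax
xtmelogit S E || house:

* Stata 13 and later: equivalent model with the newer command
* melogit S E || house:
```

The first model gives a population-averaged odds ratio with cluster-robust standard errors, while the random-intercept models give a home-specific (conditional) odds ratio, so the two estimates need not coincide. With only two people per home the random-intercept variance may be poorly estimated, so it is probably worth comparing the approaches rather than relying on one.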