Solved – What do “case-control” and “cross-sectional” mean in the context of logistic modeling?

case-control-study, epidemiology, logistic, observational-study, odds-ratio

While studying logistic modeling, I read the following statement:

The fact that only odds ratios, not individual risks, can be estimated from logistic
modeling in case-control or cross-sectional studies is not surprising.

I do not know what "case-control" and "cross-sectional" studies are in statistical analysis. Moreover, I do not quite understand what the above statement means from a statistical point of view. Any explanations would be appreciated.

Best Answer

First, the definitions, then a slight twist on the statement you posted, then hopefully an illuminating answer.

Cross-Sectional Study: A study where you take a "snapshot" of a population at a single point in time. You're not following anyone; you're simply asking "At this point, do you have the disease or not?", along with covariates of course. A cross-section, hence the name.

Case-Control Study: A study usually used when a cohort study or RCT is going to be difficult, if not impossible. You sample cases from some source, and then a number of controls, usually in some ratio to the number of cases (1:1, 2:1, etc.). Again, you're not following anyone forward; you're looking backward from the outcome. Rather than asking "what exposures lead to disease?" you're asking "what exposures are more common in the group that got disease?".

What the statement means is that in either design you're limited in what you can estimate. To calculate a risk (and thus a risk ratio) you need to know, for a population of n people who are disease-free at the start, how many develop the disease during your follow-up period (incidence). In a cross-sectional study, you technically only have prevalence, not incidence. This is the twist - the statement you posted is technically wrong. You can also - and often should - estimate a Prevalence Ratio from a cross-sectional study, as well as an Odds Ratio.
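As a rough illustration of the two measures available from a cross-section, here is a minimal Python sketch. The 2x2 counts and the function names are invented for the example and do not come from any real study.

```python
def prevalence_ratio(a, b, c, d):
    """Prevalence in the exposed divided by prevalence in the unexposed.

    a: exposed with disease,   b: exposed without disease
    c: unexposed with disease, d: unexposed without disease
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds of disease in the exposed divided by odds in the unexposed."""
    return (a * d) / (b * c)


# Hypothetical cross-sectional counts (assumed for illustration only).
a, b, c, d = 60, 140, 30, 170

pr = prevalence_ratio(a, b, c, d)   # prevalence 0.30 vs 0.15
oratio = odds_ratio(a, b, c, d)     # (60 * 170) / (140 * 30)

print(f"Prevalence ratio: {pr:.3f}")
print(f"Odds ratio:       {oratio:.3f}")
```

With a condition this common in the sample, the two measures differ noticeably, which is one reason the prevalence ratio is often the more interpretable choice.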

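In a case-control study, you don't have the population - you just have the cases and a basket of non-cases - so you have no idea what happened in population n. The proportion of cases in your sample is set by the sampling ratio you chose, not by how common the disease is. So while you can calculate odds (and therefore an odds ratio), it's literally impossible to calculate the risk; that requires information you do not have.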
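To make this concrete, here is a small Python sketch with an invented source population. It keeps every case and only a fraction of the non-cases (using expected counts rather than random sampling, to keep it deterministic), then compares what survives the sampling. All numbers and names are assumptions for illustration.

```python
def odds_ratio(a, b, c, d):
    """a, b: exposed cases / non-cases; c, d: unexposed cases / non-cases."""
    return (a * d) / (b * c)


# Hypothetical source population (invented numbers).
exposed_cases, exposed_noncases = 200, 9800       # true risk 0.02
unexposed_cases, unexposed_noncases = 100, 9900   # true risk 0.01

true_risk_exposed = exposed_cases / (exposed_cases + exposed_noncases)
or_population = odds_ratio(
    exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases
)

# Case-control design: keep all cases, keep 5% of non-cases regardless of
# exposure. Expected control counts are used instead of a random draw.
control_fraction = 0.05
exposed_controls = exposed_noncases * control_fraction
unexposed_controls = unexposed_noncases * control_fraction

or_case_control = odds_ratio(
    exposed_cases, exposed_controls, unexposed_cases, unexposed_controls
)

# The "risk" you would naively read off the case-control sample.
apparent_risk_exposed = exposed_cases / (exposed_cases + exposed_controls)

print(f"Odds ratio, full population:   {or_population:.4f}")
print(f"Odds ratio, case-control data: {or_case_control:.4f}")
print(f"True risk in exposed:          {true_risk_exposed:.4f}")
print(f"Apparent risk in exposed:      {apparent_risk_exposed:.4f}")
```

The sampling fraction cancels out of the odds ratio, so it is unchanged, while the apparent risk depends entirely on how many controls you chose to sample.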

However, when the disease is rare (roughly under 10% prevalence), the Odds Ratio should approximate the Risk Ratio from a similarly conducted cohort study.
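The rare-disease approximation can be checked with a few lines of Python. This sketch fixes a hypothetical risk ratio of 2 and shows how the implied odds ratio drifts away from it as the baseline risk grows; the function name and the chosen risks are assumptions made for the example.

```python
def odds_ratio_from_risks(baseline_risk, risk_ratio):
    """Odds ratio implied by a baseline (unexposed) risk and a risk ratio."""
    exposed_risk = baseline_risk * risk_ratio
    if not (0 < baseline_risk < 1 and 0 < exposed_risk < 1):
        raise ValueError("both risks must lie strictly between 0 and 1")
    exposed_odds = exposed_risk / (1 - exposed_risk)
    baseline_odds = baseline_risk / (1 - baseline_risk)
    return exposed_odds / baseline_odds


risk_ratio = 2.0  # hypothetical true risk ratio
for baseline_risk in (0.001, 0.01, 0.05, 0.10, 0.30):
    implied_or = odds_ratio_from_risks(baseline_risk, risk_ratio)
    print(f"baseline risk {baseline_risk:>5.3f}: RR = {risk_ratio:.2f}, OR = {implied_or:.3f}")
```

At low baseline risks the odds ratio sits very close to 2, and it moves further from the risk ratio as the disease becomes more common.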

What this all means statistically is that these relatively simplistic (and thus fairly flexible) study designs are somewhat restrictive in what you can do - you're largely confined to logistic regression and the calculation of an odds ratio.
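For the logistic model specifically, a short derivation sketches why the odds ratio survives case-control sampling while individual risks do not. The notation here is assumed for illustration and is not in the quoted text: $D$ is disease status, $S$ indicates being sampled into the study, $\pi_1$ and $\pi_0$ are the sampling probabilities for cases and non-cases (assumed not to depend on the covariate $x$ once $D$ is known), and $\alpha$, $\beta$ are the population intercept and slope.

```latex
\begin{align*}
\operatorname{logit} P(D=1 \mid x) &= \alpha + \beta x
  && \text{(model in the source population)} \\[4pt]
\frac{P(D=1 \mid x, S=1)}{P(D=0 \mid x, S=1)}
  &= \frac{\pi_1 \, P(D=1 \mid x)}{\pi_0 \, P(D=0 \mid x)}
  && \text{(Bayes' rule, sampling depends only on } D\text{)} \\[4pt]
\operatorname{logit} P(D=1 \mid x, S=1)
  &= \underbrace{\alpha + \log\frac{\pi_1}{\pi_0}}_{\text{shifted intercept}} + \beta x
\end{align*}
```

Under these assumptions the slope $\beta$ (the log odds ratio) is the same in the sample as in the population, but the intercept is shifted by $\log(\pi_1/\pi_0)$, which is usually unknown. Predicted probabilities from the fitted model therefore reflect the sampling ratio rather than the actual risk.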