Logistic Regression – Are Exact Logistic Regression and Conditional Logistic Regression the Same?

conditional, exact-test, logistic

I have seen these two terms in practice. Are they actually referring to the same method? If not, what is the main difference between the two methods?

Conditional logistic regression is commonly used in case-control studies, where matched cases and controls are compared while accounting for the matching covariates.
For example,
https://www.thelancet.com/journals/eclinm/article/PIIS2589-5370(20)30318-7/fulltext

I have seen exact logistic regression proposed in some clinical trial protocols for binary outcomes with small sample sizes. But I am not sure whether conditional logistic regression, as used in case-control studies, is meant for small samples. Therefore, I am not sure whether the two terms refer to the same thing.

Best Answer

Ok, context.

There are two statistical problems being solved. First, the maximum likelihood estimators for logistic regression are inconsistent if the number of parameters grows too fast with the sample size. At the extreme, with binary matched-pair data $Y_{ij}$, $j=0,1$, and a separate intercept $\alpha_i$ for each pair $i$ in the model $$\textrm{logit}\, P(Y_{ij}=1|X_{ij}=x)=\alpha_i+\beta x$$ the MLE of $\beta$ converges to $2\beta$.
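A sketch of where the factor of 2 comes from, using notation introduced here for illustration only: write $x_{i,\text{case}}$ and $x_{i,\text{ctrl}}$ for the covariate values of the member with $Y=1$ and the member with $Y=0$ in a pair with exactly one case, let $d_i = x_{i,\text{case}} - x_{i,\text{ctrl}}$, and let $\textrm{expit}(u)=e^u/(1+e^u)$. Pairs with $Y_{i.}=0$ or $2$ are fitted perfectly by sending $\alpha_i\to\mp\infty$ and contribute nothing. For the remaining pairs, the score equation for $\alpha_i$ forces the two fitted probabilities to sum to 1, so $\hat\alpha_i(\beta) = -\beta\,(x_{i,\text{case}}+x_{i,\text{ctrl}})/2$, and substituting back gives the profile log-likelihood

$$\ell_P(\beta) = 2\sum_{i:\,Y_{i.}=1} \log \textrm{expit}\!\left(\tfrac{1}{2}\beta d_i\right).$$

The function $b \mapsto \sum_i \log \textrm{expit}(b\, d_i)$ is the matched-pair conditional log-likelihood, whose maximiser is consistent for $\beta$. Since $\ell_P(\beta)$ is that function evaluated at $b=\beta/2$ (times 2), the unconditional MLE is exactly twice the conditional one in this model, and so it converges to $2\beta$.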

The solution to this problem is a conditional likelihood: condition on $ Y_{i.}=\sum_j Y_{ij}$. The maximum conditional likelihood estimator of $\beta$ is consistent and has all the usual nice properties. The conditional likelihood has the same form as the Cox partial likelihood in survival, and a popular way to implement conditional logistic regression involves calling Cox regression routines under the hood.
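For 1:1 matched pairs the conditional likelihood is simple enough to write out by hand. Here is a minimal sketch, assuming a single covariate and pairs with exactly one case each; the function name `conditional_mle` and the toy data are made up for illustration, and a real analysis would use an established routine rather than this hand-rolled Newton iteration. Conditioning on $Y_{i.}=1$, each pair contributes $\textrm{expit}(\beta d_i)$, where $d_i$ is the case's $x$ minus the control's $x$, and the $\alpha_i$ cancel.

```python
import math


def conditional_mle(diffs, iters=50, tol=1e-12):
    """Maximise sum_i log expit(beta * d_i) by Newton's method.

    diffs: d_i = x_case - x_control for each pair with exactly one case.
    (Hypothetical helper for illustration, not a library function.)
    """
    beta = 0.0
    for _ in range(iters):
        score = 0.0  # first derivative of the conditional log-likelihood
        info = 0.0   # minus the second derivative
        for d in diffs:
            p = 1.0 / (1.0 + math.exp(-beta * d))
            score += d * (1.0 - p)
            info += d * d * p * (1.0 - p)
        if info == 0.0:  # every difference is zero: beta is not identified
            break
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta


# Toy data with binary x: 15 pairs where only the case is exposed,
# 5 where only the control is exposed, 10 where the two members agree.
diffs = [1] * 15 + [-1] * 5 + [0] * 10
beta_hat = conditional_mle(diffs)
print(beta_hat, math.log(15 / 5))
```

With a binary covariate the maximiser has the closed form $\log(n_{10}/n_{01})$, the log ratio of the two kinds of discordant pairs, so the two printed numbers should agree. Pairs whose members share the same $x$ drop out of the estimate entirely.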

The usual inference in conditional logistic regression uses the asymptotic $\chi^2$ distribution of the log likelihood ratio or the asymptotic Normal distribution of $\hat\beta$. These approximations can be a bit rough in small samples (though the likelihood-based tests and intervals are pretty good). However, when the $x$s are discrete, one can write down the exact distribution of $Y_{ij} \mid X, Y_{i.}$. It's possible to do exact tests (like more complicated versions of Fisher's exact test). With more effort, it's possible to do point and interval estimation with the exact conditional likelihood.
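To make the exact test concrete, here is a small sketch for one integer-valued covariate. It assumes the test statistic is $T=\sum_{ij} x_{ij}Y_{ij}$ and uses the Fisher-style two-sided convention of summing the probabilities of all outcomes no more likely than the observed one; both the helper names and the toy data are hypothetical. Under $\beta=0$ and given $Y_{i.}$, every choice of which members of a stratum are the cases is equally likely, so the null distribution of $T$ can be built by enumeration within strata and convolution across them.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations


def stratum_null(x, n_cases):
    """Null distribution of the sum of x over the cases in one stratum,
    given the number of cases (all assignments equally likely)."""
    subsets = list(combinations(range(len(x)), n_cases))
    weight = Fraction(1, len(subsets))
    dist = defaultdict(Fraction)
    for subset in subsets:
        dist[sum(x[j] for j in subset)] += weight
    return dist


def exact_conditional_test(strata):
    """strata: list of (x, y) pairs of equal-length lists, y binary.
    Returns the observed T and an exact two-sided p-value for beta = 0."""
    null = {0: Fraction(1)}
    t_obs = 0
    for x, y in strata:
        t_obs += sum(xj * yj for xj, yj in zip(x, y))
        within = stratum_null(x, sum(y))
        combined = defaultdict(Fraction)
        for t, p in null.items():          # convolve with this stratum
            for s, q in within.items():
                combined[t + s] += p * q
        null = combined
    p_obs = null[t_obs]
    p_value = sum(p for p in null.values() if p <= p_obs)
    return t_obs, p_value


# Toy matched pairs, case listed first: 6 with only the case exposed,
# 1 with only the control exposed, 1 with neither exposed.
pairs = [([1, 0], [1, 0])] * 6 + [([0, 1], [1, 0])] + [([0, 0], [1, 0])]
t_obs, p_value = exact_conditional_test(pairs)
print(t_obs, p_value)  # 6 1/8
```

For matched pairs with binary $x$ this reduces to the exact McNemar (binomial) test on the 7 discordant pairs, which is why the p-value is $16/128 = 1/8$. Brute-force enumeration like this only scales to small problems, which is where the specialised algorithms in dedicated software come in.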

In most of statistics, it's not all that common that you have small samples, large enough effect sizes to see associations anyway, and a pressing need to know precisely whether your $p$-value is just above or just below 0.05. But some people do want this sort of thing; in particular, there is genuine demand for it in pharmaceutical clinical trials.

Cytel Software make a program, LogXact, that does exact conditional inference for logistic regression, as well as some Monte Carlo methods and some 'almost-exact' methods involving saddlepoint approximations. Some saddlepoint methods were also published and implemented by Bellio and Brazzale, first for S-PLUS and then for R; Brazzale, Davison and Reid have a very nice book on the topic.
