Solved – Is conditional logit a specific form of GLM? And what are its specificities

clogitgeneralized linear modellogisticrregression

Background: For a project, I am fitting a conditional logit model where I have 5 control cases for every realized case. To do that I use the clogit() function in the package survival. I wanted to graph interactions with the effects package by John Fox et al. It turns out that this package can't handle clogit objects (output of clogit()).

As I believed I remembered that conditional logit were a special case of GLM, I thought the clever/lazy way to get my interaction plots would be to refit the model using a fixed effects glm and then use effect().
The documentation of clogit seemed to confirm my intuition:

It turns out that the logliklihood for a conditional logistic regresson model = loglik from a Cox model with a particular data structure. […]
When a well tested Cox model routine is available many packages use this ‘trick’ rather than writing a new software routine from scratch, and this is what the clogit routine does.

In detail, a stratified Cox model with each case/control group assigned to its own stratum, time set to a constant, status of 1=case 0=control, and using the exact partial likelihood has the same likelihood formula as a conditional logistic regression. The clogit routine creates the necessary dummy variable of times (all 1) and the strata, then calls coxph.

Based on this description, it seems that I should be able to reproduce the stratification achieved through strata() by using a random intercept for each case/control group with 1|group in lmer(). However, when I try, the results of clogit and lmer differ. One thing is that I probably have the wrong likelihood function. I don't really know how to specify this in lmer but more important, I am wondering what else I am missing.

I wonder whether I am completely wrong or somewhat on the right track but missing some pieces? What I would like is to understand what are the difference in terms of how the model is fitted between a conditional logit and a regular one (I understand that might be quite a long answer, so a book reference would be a great start). The my usual references for regression (Gelman and Hill, 2007; Mills 2011) are somewhat silent on the subject.

Best Answer

Your reference says that clogit is a special form of Cox regression, not the GLMM. So you are probably mixing things up.

The conditional logit log-likelihood is (reverse engineering the LaTeX code from the Stata manual): conditional on $\sum_{j=1}^{n_i} y_{ij} = k_{1i}$, $$ {\rm Pr}\Bigl[(y_{i1},\ldots,y_{i{n_i}})|\sum_{j=1}^{n_i} y_{ij} = k_{1i}\Bigr] = \frac{\exp(\sum_{j=1}^{n_i} y_{ij} x_{ij}'\beta)}{\sum_{{\bf d}_i\in S_i}\exp(\sum_{j=1}^{n_i} y_{ij} x_{ij}'\beta)} $$ where $S_i$ is a set of all possible combinations of $n_i$ binary outcomes, with $k_{1i}$ ones and remaining zeroes, so the summation index-vector has components $d_{ij}$ that are 0/1 with $\sum_{i=1}^{n_i} d_{ij} = k_{1i}$. That's a pretty weird likelihood to me. Denoting the denominator as $f_i(n_i,k_{1i})$, the conditional log-likelihood is $$ \ln L = \sum_{i=1}^n \biggl[ \sum_{j=1}^{n_i} y_{ij} x_{ij}'\beta - \ln f_i(n_i, k_{1i}) \biggr] $$ This likelihood can be computed exactly, although the computational time goes up steeply as $p^2 \sum_{i=1}^n n_i \min(k_{1i}, n_i - k_{1i})$ where $p={\rm dim}\, \beta = {\rm dim}\, x_{ij}$. This is the likelihood that should be identical to the stratified Cox regression, which I won't try to entertain here.

The mixed model likelihood (again, adopting from Stata manuals) is based on integrating out the random effects: $$ {\rm Pr}(y_{i1}, \ldots, y_{1{n_i}} |x_{i1}, \ldots, x_{i{n_i}})=\int_{-\infty}^{+\infty} \frac{\exp(-\nu_i^2/2\sigma_\nu^2)}{\sigma_\nu \sqrt{2\pi}} \prod_{i=1}^{n_i}F(y_{ij}, x_{ij}'\beta + \nu_i) $$ where $ F(y,z) = \Bigl\{ 1+\exp\bigl[ (-1)^y z \bigr] \Bigr\}^{-1} $ is a witty way to write down the logistic contribution for the outcome $y=0,1$. This likelihood cannot be computed exactly, and in practice is approximated numerically using a set of Gaussian quadrature points with abscissas $a_m$ and weights $w_m$ resembling the density of the standard normal density on a grid, producing (in the simplest version) $$ \ln L \approx \sum_{i=1}^n \ln\biggl[ \sqrt{2} \sum_{m=1}^M w_m \frac{1}{\sigma_\nu \sqrt{2\pi}} \prod_{i=1}^{n_i}F(y_{ij}, x_{ij}'\beta + \sqrt{2} \sigma_\nu a_m) \biggr] $$ (The $\exp(\nu_i^2)$-like terms disappear due to the full quadrature formula, but since it is designed for the physicist' erf() function rather than statisticians' $\Phi()$ function, it works with $\exp(-z^2)$ rather than $\exp(-z^2/2)$; hence the weird $\sqrt{2}$ in a couple of places.) Computational time for $\ln L$ itself is proportional to $nM$, but since you need to take the second order derivatives for Newton-Raphson, feel free to multiply by $p^2$. Smarter computational schemes aka adaptive Gaussian quadratures try to find a better location and scale parameters for the quadrature to make the approximation more accurate.

In fact, that latter Stata manual describes the differences between the GLMM (aka random effect xtlogit, in econometric slang) and conditional logit (aka fixed effect xtlogit), and might be worth a more serious reading.

Related Solutions

Solved – How to decide between a logistic regression or conditional logistic regression

I don't agree that you sampled on the outcome, since you sampled on company and enrollment is your outcome. You may want to deal with the company as a random effect and the other features as fixed effects. So I am suggesting yet a third alternative: generalized mixed models.

After clarification: If the outcome is company enrollment rather than employee enrollment, then it is an ordinary case-control study for which unconditional logistic regression should be the standard approach. Conditional logistic regression is not necessary unless there were further conditions on the sampling regarding other company features.

Further clarification: If you were using R, then the package to identify and install would be not surprisingly: "sampling" by Thomas Lumley. It provides for the appropriate incorporation of the two-way sampling strategy you have outlined in the design phase prior to estimation with the svyglm() function. Stata also has a set of survey functions and I imagine they can also be used with the general linear modeling functions it provides. SAS didn't have such facilities in the past so the SUDAAN program was needed as an added (expensive) purchase, but I have a vague memory that this may have changed with its latest releases. (I don't know about SPSS with regard to sampling support for GLM models.)

Logistic Regression – Relationship Between McNemar’s Test and Conditional Logistic Regression

Sorry, it's an old issue, I came across this by chance.

There is a mistake in your code for the mcnemar test. Try with:

n <- 100
do.one <- function(n) {
  id <- rep(1:n, each=2)
  case <- rep(0:1, times=n)
  rs <- rbinom(n*2, 1, 0.5)
  c(
    'pclogit' = coef(summary(clogit(case ~ rs + strata(id))))[5],
    'pmctest' = mcnemar.test(table(rs[case == 0], rs[case == 1]))$p.value
  )
}

out <- replicate(1000, do.one(n))

enter image description here

Best Answer

Related Solutions

Solved – How to decide between a logistic regression or conditional logistic regression

Logistic Regression – Relationship Between McNemar’s Test and Conditional Logistic Regression

Related Question