Logit – Binary Models (Probit and Logit) with a Logarithmic Offset Analysis

logarithm, logit, offset, probit

Does anyone have a derivation of how an offset works in binary models like probit and logit?

In my problem, the follow-up window can vary in length. Suppose patients get a prophylactic shot as treatment. The shot happens at different times, so if the outcome is a binary indicator of whether any flare-ups happened, you need to adjust for the fact that some people have more time to exhibit symptoms. It seems that the probability of a flare-up is proportional to the length of the follow-up period. It's not clear to me mathematically how a binary model with an offset captures this intuition (unlike in the Poisson case).

The offset is a standard option in both Stata (p.1666) and R, and I can easily see how it works for a Poisson model, but the binary case is a bit opaque.

For example, if we have
\begin{equation}
\frac{E[y \vert x]}{Z}=\exp\{x'\beta\},
\end{equation}
this is algebraically equivalent to a model where
\begin{equation}
E[y \vert x]=\exp\{x'\beta+\log{Z}\},
\end{equation}
which is the standard model with the coefficient on $\log Z$ constrained to $1$. This is called a logarithmic offset. I am having trouble figuring out how this works if we replace $\exp\{\cdot\}$ with $\Phi(\cdot)$ or $\Lambda(\cdot)$.

Update #1:

The logit case was explained below.

Update #2:

Here's an explanation of what seems to be the main use of offsets for non-Poisson models like probit. The offset can be used to conduct likelihood ratio tests on index function coefficients. First you estimate the unconstrained model and store the estimates. Say you want to test the hypothesis that $\beta_x=2$. Then you create the variable $z=2 \cdot x$ and fit the model dropping $x$ and using $z$ as a non-logarithmic offset. This is the constrained model. The LR test compares the two, and is an alternative to the usual Wald test.
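The procedure above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it uses a logit rather than a probit (the offset trick is the same, but the logit likelihood is easier to write with the standard library alone), the data are simulated, and every name (`fit_logit`, `X_full`, `lr_stat`, and so on) is hypothetical rather than taken from Stata or R.

```python
import math
import random

def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def softplus(t):
    # Stable log(1 + exp(t)).
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))

def solve(A, b):
    # Gauss-Jordan elimination with partial pivoting for a small dense system.
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * p for a, p in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def loglik(y, X, offset, beta):
    # Bernoulli log-likelihood with index eta = offset + X @ beta.
    ll = 0.0
    for yi, xi, oi in zip(y, X, offset):
        eta = oi + sum(b * v for b, v in zip(beta, xi))
        ll += yi * eta - softplus(eta)
    return ll

def fit_logit(y, X, offset, iters=25):
    # Newton-Raphson; the offset enters the index with its coefficient fixed at 1.
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for yi, xi, oi in zip(y, X, offset):
            p = sigmoid(oi + sum(b * v for b, v in zip(beta, xi)))
            w = p * (1.0 - p)
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for c in range(k):
                    hess[a][c] += w * xi[a] * xi[c]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta, loglik(y, X, offset, beta)

# Simulated data (assumption): intercept -0.5, slope 2 on x.
rng = random.Random(0)
n = 500
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [1 if rng.random() < sigmoid(-0.5 + 2.0 * xi) else 0 for xi in x]

# Unconstrained model: intercept and x, no offset.
X_full = [[1.0, xi] for xi in x]
no_offset = [0.0] * n
beta_u, ll_u = fit_logit(y, X_full, no_offset)

# Constrained model under H0: beta_x = 2. Drop x and use z = 2*x as the offset.
X_const = [[1.0] for _ in x]
z = [2.0 * xi for xi in x]
beta_c, ll_c = fit_logit(y, X_const, z)

# LR statistic, compared against a chi-square with 1 degree of freedom.
lr_stat = 2.0 * (ll_u - ll_c)
p_value = math.erfc(math.sqrt(max(lr_stat, 0.0) / 2.0))
print(beta_u, beta_c, lr_stat, p_value)
```

The constrained fit is the same likelihood as the unconstrained one with the slope pinned at 2, which is why the two log-likelihoods are directly comparable.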

Best Answer

You can always include an offset in any GLM: it's just a predictor variable whose coefficient is fixed at 1. Poisson regression just happens to be a very common use case.

Note that in a binomial model, the analogue to log-exposure as an offset is just the binomial denominator, so there's usually no need to specify it explicitly. Just as you can model a Poisson RV as a count with log-exposure as an offset, or as a ratio with exposure as a weight, you can similarly model a binomial RV as counts of successes and failures, or as a frequency with trials as a weight.

In a logistic regression, you would interpret a $\log Z$ offset in terms of the odds: a proportional change in $Z$ results in the same proportional change in the odds $p/(1-p)$.

$$\begin{equation}\begin{split} \log (p/(1-p)) &= \beta' \mathrm{X} + \log Z \\ p/(1-p) &= Z \exp(\beta' \mathrm{X}) \end{split}\end{equation}$$

But this doesn't have any particular significance like log-exposure does in a Poisson regression. That said, if your binomial probability is small enough, a logistic model will approach a Poisson model with log link (since the denominator on the LHS approaches 1) and the offset can be treated as a log-exposure term.
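A quick numerical check may make this concrete. The sketch below assumes the logit model with a $\log Z$ offset, so that the odds equal $Z \exp(\beta' \mathrm{X})$; the function names and the two values of the linear predictor are made up for illustration. Doubling $Z$ doubles the odds in both cases, but it only (approximately) doubles the probability when the outcome is rare.

```python
import math

def logit_prob_with_offset(eta, z):
    # P(y = 1) when log(p / (1 - p)) = eta + log(z), i.e. odds = z * exp(eta).
    odds = z * math.exp(eta)
    return odds / (1.0 + odds)

def odds(p):
    return p / (1.0 - p)

eta_rare = -7.0    # hypothetical linear predictor for a rare outcome
eta_common = 0.0   # hypothetical linear predictor for a common outcome

# Rare outcome: follow-up Z = 1 versus Z = 2.
p1 = logit_prob_with_offset(eta_rare, 1.0)
p2 = logit_prob_with_offset(eta_rare, 2.0)
odds_ratio_rare = odds(p2) / odds(p1)   # exactly 2 up to rounding
prob_ratio_rare = p2 / p1               # close to 2, as in a Poisson model

# Common outcome: the odds still double, but the probability does not.
q1 = logit_prob_with_offset(eta_common, 1.0)
q2 = logit_prob_with_offset(eta_common, 2.0)
odds_ratio_common = odds(q2) / odds(q1)
prob_ratio_common = q2 / q1

print(odds_ratio_rare, prob_ratio_rare)
print(odds_ratio_common, prob_ratio_common)
```

So the intuition that the probability of a flare-up is proportional to follow-up time is only recovered by the logit offset in the rare-event regime.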

(The problem described in your linked R question was rather idiosyncratic.)
