Time-Series Logistic Regression – Using Predictor Data Effectively

logistic, predictive-models, time series

I would like to know whether we can model a binary outcome with time-series predictors. For example, let's say $Y$ is binary, and $X_1, X_2, X_3, \dots, X_n$ is the same predictor variable recorded as historical snapshots over time periods $1, \dots, n$. I am interested in predicting $Y$ while accounting for the autocorrelation among $X_1, \dots, X_n$, and for seasonality among them, if any.

Best Answer

Your question sounds very much like you are interested in discrete-time event history analysis (aka discrete-time survival analysis, aka a logit hazard model), which answers the question of whether and when an event will occur.

For example, equation 1 gives the logit hazard, where the discrete time periods (up to period $T$) are indicated by $d_{1}, \dots, d_{T}$, and you may condition your model on $p$ predictors $X_{1}, \dots, X_{p}$. This gives you a hazard estimate as in equation 2. These equations specify a conditional hazard function with a fully discrete parameterization of time. You could instead specify a conditional hazard function that is constant over time, that is a linear or polynomial function of time period, or even a hybrid of polynomial functions of period plus some discrete time indicators. Your predictors can be constant over time or time-varying, so I see no reason why you could not also include lagged or differenced functions of the predictors to model autocorrelation.

  1. $\mathrm{logit}\left(h\left(t,{X_{1t},\dots,X_{pt}}\right)\right) = \alpha_{1}d_{1} + \dots + \alpha_{T}d_{T} + \beta_{1}X_{1t} + \dots + \beta_{p}X_{pt}$

  2. $\hat{h}\left(t,{X_{1t},\dots,X_{pt}}\right) = \frac{e^{\hat{\alpha}_{1}d_{1} + \dots + \hat{\alpha}_{T}d_{T} + \hat{\beta}_{1}X_{1t} + \dots + \hat{\beta}_{p}X_{pt}}}{1 + e^{\hat{\alpha}_{1}d_{1} + \dots + \hat{\alpha}_{T}d_{T} + \hat{\beta}_{1}X_{1t} + \dots + \hat{\beta}_{p}X_{pt}}}$
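To make the two equations concrete, here is a minimal Python sketch under stated assumptions. It expands one subject's predictor history into person-period rows (one row per period at risk), adds a lag-1 and a first-difference version of the predictor, and then evaluates equation 2. The helper names `person_period_rows` and `hazard`, the data, and all coefficient values are hypothetical and chosen only for illustration. In practice the $\hat{\alpha}$ and $\hat{\beta}$ values would come from fitting a logistic regression to the stacked person-period data.

```python
import math

def person_period_rows(x_history, event_period=None):
    """Expand one subject's predictor history into person-period rows.

    x_history: predictor values for periods 1..T (x_history[0] is period 1).
    event_period: 1-based period in which the event occurred, or None if
    the event was never observed (censored at period T).
    """
    T = len(x_history)
    last = event_period if event_period is not None else T
    rows = []
    for t in range(1, last + 1):
        x_t = x_history[t - 1]
        # The lag is undefined in period 1. As a simplification this sketch
        # reuses the period-1 value; dropping period 1 or using a
        # pre-baseline measurement are reasonable alternatives.
        x_lag = x_history[t - 2] if t > 1 else x_t
        rows.append({
            "period": t,
            "d": [1 if j == t else 0 for j in range(1, T + 1)],  # d_1..d_T
            "x": x_t,
            "x_lag1": x_lag,
            "x_diff1": x_t - x_lag,
            "event": 1 if t == event_period else 0,
        })
    return rows

def hazard(row, alphas, betas):
    """Equation 2: inverse logit of the linear predictor in equation 1."""
    eta = sum(a * d for a, d in zip(alphas, row["d"]))
    eta += betas["x"] * row["x"] + betas["x_lag1"] * row["x_lag1"]
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical subject: four periods of data, event observed in period 3.
rows = person_period_rows([0.5, 1.0, 1.5, 0.8], event_period=3)

# Hypothetical coefficients standing in for fitted estimates.
alphas = [-2.0, -1.5, -1.0, -0.5]
betas = {"x": 0.4, "x_lag1": 0.2}

hazards = [hazard(r, alphas, betas) for r in rows]

# Survival through period t is the product of (1 - hazard) up to t.
survival = []
s = 1.0
for h in hazards:
    s *= 1.0 - h
    survival.append(s)

for r, h, s in zip(rows, hazards, survival):
    print(r["period"], round(h, 4), round(s, 4))
```

The subject contributes rows only up to the period in which the event occurs, which is what makes the ordinary logistic likelihood on the stacked rows equivalent to the discrete-time hazard likelihood. Seasonality could be handled the same way, for example by adding season indicators as extra columns.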

One need not use a logit hazard model; indeed, one could use other binomial link functions such as probit, complementary log-log, or robit.
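As a rough illustration of how the choice of link changes the hazard, the sketch below maps the same linear predictor through three inverse link functions using only the Python standard library. The function names are my own, and the robit link (a Student $t$ CDF) is omitted because the standard library has no $t$ distribution.

```python
import math

def inv_logit(eta):
    """Logistic CDF: inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))

def inv_probit(eta):
    """Standard normal CDF: inverse of the probit link."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

def inv_cloglog(eta):
    """Inverse of the complementary log-log link log(-log(1 - h))."""
    return 1.0 - math.exp(-math.exp(eta))

# Compare the implied hazards on a few arbitrary linear-predictor values.
for eta in (-2.0, 0.0, 1.0):
    print(eta,
          round(inv_logit(eta), 4),
          round(inv_probit(eta), 4),
          round(inv_cloglog(eta), 4))
```

Unlike logit and probit, the complementary log-log link is asymmetric around a hazard of 0.5. With this link the discrete-time model corresponds to a continuous-time proportional hazards model whose event times have been grouped into intervals.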

If you are using Stata, see also the dthaz package, which you can inspect by typing net describe dthaz, from(https://alexisdinno.com/stata).

References

Singer, J. and Willett, J. (1993). It’s about time: Using discrete-time survival analysis to study duration and the timing of events. Journal of Educational and Behavioral Statistics, 18(2):155–195.

Singer, J. D. and Willett, J. B. (2003). Applied longitudinal data analysis: Modeling change and event occurrence. Oxford University Press, New York, NY.