Solved – Logistic Regression on Time Series Data

Tags: linear, logistic, regression, time series, trend

I would like to forecast the probability of a binary outcome using logistic regression at t+1, using all previous data points. I am new to forecasting so any help would be appreciated.

The raw data is in the form:

Time | Correct
1      0
1      1
2      1
3      0
3      0
3      1
4      1
5      0

I have averaged the data over each day to produce:

Time | AVG. Correct
1      0.5
2      1
3      0.33
4      1
5      0

I know that there is a linear trend in the data, such that as time progresses the average correct value increases.

Using this information, would it be possible to use logistic regression to forecast the next time step (t = 6)? How would you account for the linear trend in the data in a logistic regression model?

Best Answer

You could fit a simple logistic regression model on the raw 0/1 observations and include time as a covariate. This implies a linear time trend on the log-odds scale.
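Written out, the model looks roughly like the following. The symbols are not from the original post: p_t stands for the probability of a correct outcome at time t, and beta_0 and beta_1 for the intercept and the time slope.

```latex
\operatorname{logit}(p_t) = \log\frac{p_t}{1 - p_t} = \beta_0 + \beta_1 t,
\qquad
p_t = \frac{1}{1 + e^{-(\beta_0 + \beta_1 t)}}
```

The forecast for t = 6 is then the second expression evaluated at t = 6 with the estimated coefficients plugged in.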

Note that in this regression the estimated time trend is negative and not statistically significant. With only eight observations, you simply have too few data points to make any reliable statement about the coefficient of a linear time trend.

See this R code:

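# data input: one row per trial
dat <- data.frame(time    = c(1, 1, 2, 3, 3, 3, 4, 5),
                  correct = c(0, 1, 1, 0, 0, 1, 1, 0))

# logistic regression with a linear time trend (the intercept is included by default)
res <- glm(correct ~ time, data = dat, family = binomial)

summary(res)
# the estimated time trend is negative and not statistically significant!

# predicted probability of 'correct' at time t = 6
predict(res, newdata = data.frame(time = 6), type = "response")

# 0.2710599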
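Two follow-ups may be useful here; both are a sketch under stated assumptions rather than part of the original answer. First, if you only have per-day summaries, the same model can be fitted on daily counts of correct and incorrect trials (not on the averages alone, which lose the number of trials per day). Second, an approximate 95% Wald interval for the t = 6 forecast, built on the log-odds scale and back-transformed, gives a sense of how uncertain the forecast is. The names `agg`, `n_correct`, `n_wrong`, `res_agg`, `p_hat` and `ci` are illustrative.

```r
# Raw data from the question: one row per trial
dat <- data.frame(time    = c(1, 1, 2, 3, 3, 3, 4, 5),
                  correct = c(0, 1, 1, 0, 0, 1, 1, 0))
res <- glm(correct ~ time, data = dat, family = binomial)

# Daily counts (not averages): correct and incorrect trials per day
n_correct <- tapply(dat$correct, dat$time, sum)
n_total   <- tapply(dat$correct, dat$time, length)
agg <- data.frame(time      = as.numeric(names(n_total)),
                  n_correct = as.vector(n_correct),
                  n_wrong   = as.vector(n_total - n_correct))

# Binomial fit on (successes, failures) per day; the likelihood matches
# the trial-level fit up to a constant, so the coefficients should agree
res_agg  <- glm(cbind(n_correct, n_wrong) ~ time, data = agg, family = binomial)
same_fit <- isTRUE(all.equal(coef(res), coef(res_agg), tolerance = 1e-4))

# Forecast for t = 6 on the log-odds scale, with its standard error
pr <- predict(res, newdata = data.frame(time = 6), type = "link", se.fit = TRUE)

# Back-transform the point forecast and an approximate 95% Wald interval
p_hat <- unname(plogis(pr$fit))
ci    <- unname(plogis(pr$fit + c(-1, 1) * qnorm(0.975) * pr$se.fit))

print(same_fit)
print(round(c(estimate = p_hat, lower = ci[1], upper = ci[2]), 3))
```

With so few observations the interval should come out wide, which is the practical meaning of the non-significant trend: the point forecast alone says little.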