Solved – Simple interrupted time series analysis

intervention-analysisrtime series

I have a weekly time series representing costs for a cohort. I want to tell whether an intervention on the cohort (we can assume it happened in a single week) has decreased costs for the cohort. I happen to know that the trend over this period for the population from which this cohort was taken was -120 per week per week.

My initial thought was simply to do a linear regression lm(Costs~Weeks,offset=-120*Weeks) but (obviously) the significance is not only a function of the effect of the intervention but also how far back I look (if I look back to $-\infty$ it will of course appear non-significant).

I looked at this website: http://www.r-bloggers.com/time-series-intervention-analysis-wih-r-and-sas/ and tried to replicate the R code with my data, but when I enter the arimax() command, I got the error message

Error in stats:::arima(x=x,order=order,seasonal=seasonal,fixed=par[1:narma], : wrong length for 'fixed'

Now, I'm not sure what to do. Can anyone give me some guidance?

Best Answer

here's the arima function in R.

http://svn.r-project.org/R/trunk/src/library/stats/R/arima.R

snippet you might be interested in:

if (is.null(fixed)) fixed <- rep(NA_real_, narma + ncxreg)
else if(length(fixed) != narma + ncxreg) stop("wrong length for 'fixed'")
mask <- is.na(fixed)
no.optim <- !any(mask)
if(no.optim) transform.pars <- FALSE
if(transform.pars) {
    ind <- arma[1L] + arma[2L] + seq_len(arma[3L])
    if (any(!mask[seq_len(arma[1L])]) || any(!mask[ind])) {
        warning("some AR parameters were fixed: setting transform.pars = FALSE")
        transform.pars <- FALSE
    }
}

Related Solutions

Time-Series – Quantifying Intervention Effect in Time Series Analysis

In general, evaluation of pre-post effects in time-series analysis is called interrupted time series. This is a very general modeling approach that tests the strong hypothesis:

$\mathcal{H}_0: \mu_{ijt} = f_i(t)$ versus $\mathcal{H}_1 : \mu_{ijt} = f_i(t) + \beta(t)X_{ijt} $

Where $X_{ijt}$ is the the treatment assignment for individual $i$ at time $t$. The easiest example is treating $\beta$ as a constant function and $X_{ijt}$ as a 0,1 indicator for 0: pre-intervention 1: peri-or post-intervention. Even if the actual "effect" of the intervention is different than this, this test is powered to detect differences in many types of scenarios, for instance, if $\beta(t)$ is any non-zero function, then a working constant parameter $\beta$ will estimate a time-averaged positive response to intervention and is non-zero.

A challenge in time-series analysis of pre-post interventions is using a parametric modeling approach for the auto-correlation. With many replicates of time and function, one can decompose the trend into lagged effects, seasonal effects etc. This would obviate the need for autocorrelation in the error term. Therefore it is not necessary to use forecast, but the model itself directly predicts what would have been observed in the post-intervention time period.

Consider the famous Air Passengers data in the datasets package in R.

## construct an analytic dataset to predict time trend using auto-regressive and seasonal components
AirPassengers <- data.frame('flights'=as.numeric(AirPassengers))
AirPassengers$month <- factor(month.name, levels=month.name)
AirPassengers$year <- rep(1949:1960, each=12)
AirPassengers$lag <- c(NA, AirPassengers$flights[-nrow(AirPassengers)])

plot(AirPassengers$flights, type='l')

AirPassengers$fitted <- exp(predict(lm(log(flights) ~ month + year, data=AirPassengers)))
lines(AirPassengers$fitted, col='red')

It's obvious this provides an excellent prediction of the time based trends. If, though, you were interested in a test of hypothesis as to whether "flying increased" post, say, 1955, you can update the dataset to include a 0/1 indicator for whether or not the time period is post that point and test its significance in a linear model.

For example:

library(lmtest)
library(sandwich)
AirPassengers$post <- AirPassengers$year >= 1955
fit <- lm(log(flights) ~ month + year + post, data=AirPassengers)
coeftest(fit, vcov. = vcovHC)['postTRUE', ]

Gives me:

> coeftest(fit, vcov. = vcovHC)['postTRUE', ]
  Estimate Std. Error    t value   Pr(>|t|) 
0.03720327 0.01783242 2.08627126 0.03890842

Which is a nice example of a spurious finding, and a statistically significant effect that isn't practically significant. A more general test could be had by allowing heterogeneity between the month specific effects.

nullmodel <- lm(log(flights) ~ month + year, data=AirPassengers)
fullmodel <- lm(log(flights) ~ post*month + year, data=AirPassengers)
waldtest(nullmodel, fullmodel, vcov=vcovHC, test='Chisq')

Both of these are examples of the general approach to "interrupted time series" for segmented regression. It is a loosely defined term and I'm a little disappointed with how little detail the authors use in describing their exact approach in most cases.

Solved – Longitudinal analysis with intervention

You would think that a simple question like this would have received more attention in the literature but .... To determine an anomaly one needs to have a model which characterizes typical behavior. I took your 336 values and simply graphed them and obtained very little visual support for any activity on or around period 169 BUT simple visual checking is equivalent to a simple mean model. . I then used AUTOBOX (my tool of choice which I had helped develop) to simultaneously to identify an appropriate memory model and any exceptional activity. Following is a graph of the actual,fit and forecast using that model. and model suggesting three level shifts and some 14 anomalies while incorporating a very significant AR(1) component, Any thorough analysis of time series data includes a plot of the residuals presented here and the acf of the residuals reflecting/suggesting apparent model sufficiency. In conclusion it appears to me (and AUTOBOX ) that period 169 is not suggestive of any exceptional activity. Hope this helps you and other interested readers.

Best Answer

Related Solutions

Time-Series – Quantifying Intervention Effect in Time Series Analysis

Solved – Longitudinal analysis with intervention

Related Question