Solved – Must the intervention variable (pulse or step) be differenced when it is added to an ARIMA model?

I have read opinions on this forum and elsewhere that when the dependent variable in any form of ARIMA model (regression with ARIMA errors, ARIMAX, or transfer function) is differenced, all covariates should be differenced as well, including dummy variables. At the same time, judging from the ARIMA analyses and discussions I have read, the usual practice seems to be not to difference intervention variables. Unfortunately, I have not found good material that tackles this issue directly and thoroughly. So my question is: do you necessarily have to difference dummy variables in an ARIMA model, and how does this affect the interpretation of the estimated coefficients?

Thanks for the comment. To be more specific about the interpretation of the coefficients, let's assume that we have an ordinary regression model:

enter image description here

My point is that when we have Dt as specified above (i.e. a linear trend starting at some point) in the original model, then after taking first differences we should get the usual intervention dummy, but now in the differenced model, just as the trend variable becomes an intercept in the differenced model.
So, from this I conclude that when I difference all the variables in the regression model and then add the usual intervention dummy without differencing it (i.e. 0 before the intervention and 1 after it), I should interpret the estimated coefficient of that dummy as the coefficient of Dt, as specified above, in the original model. For instance, if the dummy coefficient in the differenced model is estimated as 0.5, then the intervention has initiated a positive linear trend, i.e. the value of the dependent variable increases by 0.5 units each period after the intervention.

On the other hand, if I add a differenced intervention dummy (which in practice takes the value 1 only in the intervention period and 0 otherwise), I interpret it in the usual way. Specifically, if the coefficient is again estimated as 0.5, the mean level of the dependent variable has increased by 0.5 units, i.e. there has been a level shift.
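The relationship between the three dummy shapes discussed above can be checked numerically. This is only a minimal sketch: the series length `n`, the intervention index `t0`, and the helper `first_diff` are hypothetical choices for illustration, and the coefficient 0.5 is the example value from the text.

```python
import numpy as np

n, t0 = 12, 5                        # hypothetical length and 0-based intervention index
t = np.arange(n)

step = (t >= t0).astype(float)       # 0 before the intervention, 1 from t0 onwards
pulse = (t == t0).astype(float)      # 1 only in the intervention period
ramp = np.cumsum(step)               # 0 before t0, then 1, 2, 3, ...

def first_diff(x):
    """First difference (1 - B)x, treating the pre-sample value as 0."""
    return np.diff(x, prepend=0.0)

# Differencing a ramp gives a step; differencing a step gives a pulse.
ramp_to_step = np.array_equal(first_diff(ramp), step)
step_to_pulse = np.array_equal(first_diff(step), pulse)

# Effect on the level of y implied by a coefficient of 0.5 in the differenced model:
beta = 0.5
level_effect_step = np.cumsum(beta * step)    # undifferenced step -> ramp of slope 0.5
level_effect_pulse = np.cumsum(beta * pulse)  # differenced step (pulse) -> level shift of 0.5

print(ramp_to_step, step_to_pulse)
print(level_effect_step)
print(level_effect_pulse)
```

Integrating the differenced-model effect back to levels is what turns the undifferenced step into a trend and the pulse into a level shift.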

Is my logic correct? If it is, it should apply to regression with ARIMA errors as well.

EDIT: The simulations below are interesting experiments and helped me a little, but I am still confused about interpreting the coefficients under different dummy specifications. Couldn't the logic be as follows? Let's take the famous and often-cited Box and Tiao (1975) model for the LA oxidant data. They specify the following model:
enter image description here

I am not completely sure, but as I understand it, they apply seasonal differences to the dependent variable and to the first intervention variable $\xi_{t1}$. They do not difference the second and third intervention variables $\xi_{t2}$ and $\xi_{t3}$ because their model could be equivalently represented as follows:

enter image description here

In their article they also classify alternative transfer functions, and one of them, called the “ramp” response, seems to be applied to the $\xi_{t2}$ and $\xi_{t3}$ variables. This “ramp” response is, to my understanding, exactly the case in which the step dummy (without differencing) is applied to differenced data. I copy its visual appearance here from Box and Tiao (1975):

[Figure: the “ramp” response, reproduced from Box and Tiao (1975)]

It actually looks like what I described above, i.e. there is a linear trend starting from the intervention period (or from another period, if necessary, as specified by the backshift operator in the numerator).
So I am wondering whether my logic is correct if I state that when the step dummy is applied to differenced data, it is the ramp response model, and when a differenced step dummy is applied to differenced data, it should be interpreted as the usual immediate step change. Or am I mixing something up?
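The two cases in this question can be written out with the backshift operator. This is a sketch under simplifying assumptions that are not taken from Box and Tiao: ordinary (non-seasonal) first differencing, a single step variable $S_t^{(T)}$ equal to 0 before period $T$ and 1 from $T$ onwards, a coefficient $\omega$, and white noise $a_t$.

$$
(1-B)\,y_t = \omega S_t^{(T)} + a_t
\;\Longrightarrow\;
y_t = \frac{\omega}{1-B}\,S_t^{(T)} + \frac{a_t}{1-B},
\qquad
\frac{S_t^{(T)}}{1-B} = \sum_{j \ge 0} S_{t-j}^{(T)} = \max(0,\; t-T+1)
$$

$$
(1-B)\,y_t = \omega\,(1-B)\,S_t^{(T)} + a_t
\;\Longrightarrow\;
y_t = \omega S_t^{(T)} + \frac{a_t}{1-B}
$$

In the first case the undifferenced step is integrated into a ramp of slope $\omega$ in the level of $y_t$; in the second, the differenced step (a pulse) integrates back to a level shift of size $\omega$.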

Best Answer

If you assume that the primary source of variation is the ARIMA component, identify an ARIMA model (which assumes no interventions), and then try to identify interventions (which the first step had ruled out), then if you do identify an intervention, I suggest that you difference it by whatever differencing was identified in the first step. The point here is that sometimes (if not often) the primary source of variability is not the ARIMA model but intervention variables or causal variables, and then one needs to be more diligent.

EDIT: Now that I see that you are referring to trend series (the counting numbers), I think I have a better read on your question. One good way to answer your question is to actually simulate alternative cases via Monte Carlo and either manually try to form the model or use advanced search procedures. I simulated three cases and, for simplicity, will present them with pseudo values/data. I used AUTOBOX, a piece of commercially available software that I have helped develop, both to simulate and to analyze the data. Of course, the analysis side knew nothing about how the data was simulated.

1. A simple shift in the mean (intercept) in a single-slope trended series: 1,2,3,4,5,..28,29,30,40,41,42,... AUTOBOX automatically identified the model as y(t) = constant + LS at period 31 + a(t)/[1-B] when the series shifted upwards (+10) but retained the same slope (constant = 1).
2. A shift in the mean and a shift in trend: 1,2,3,4,5,..28,29,30,34,38,42,46.... AUTOBOX automatically detected a model with two trends (x1 and x2), x1 starting at period 1 and x2 starting at period 34. The model was y(t) = constant + w0*x1 + w1*x2 + a(t). The two slopes were w0 = 1 and w1 = 4.
3. A locally constant mean followed by an upward-trending series: 1,1,1,1,1,1,1,1,6,11,16,21,... AUTOBOX automatically detected the model y(t) = constant + w0*x1 + w1*x2 + a(t)/[1-B], where x1 is a pulse at period 9 (w0 = 2) and x2 is a step/level shift at period 9 (w1 = 5) that becomes a trend as it gets integrated via the [1-B].
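To see what first differencing does to the three pseudo series above, they can be rebuilt and differenced with NumPy. This is only an illustrative sketch and does not reproduce AUTOBOX or its model search; the series are noise-free, and where the listings trail off with "..." the end points used here (49 for the first series) are assumptions.

```python
import numpy as np

# Pseudo series as listed in the three cases; the tail lengths are assumed.
case1 = np.concatenate([np.arange(1, 31), np.arange(40, 50)])          # level shift of +10, same slope
case2 = np.concatenate([np.arange(1, 31), 34 + 4 * np.arange(4)])      # slope changes from 1 to 4
case3 = np.concatenate([np.ones(8, dtype=int), 6 + 5 * np.arange(4)])  # flat, then slope 5

d1, d2, d3 = np.diff(case1), np.diff(case2), np.diff(case3)

# d[i] is the change from period i+1 to period i+2 (1-based periods).
print(d1)  # constant 1 with a single spike: the level shift appears as a pulse
print(d2)  # 1 up to period 30, then 4: the trend change appears as a step
print(d3)  # 0 up to period 8, then 5: the new trend appears as a step
```

After differencing, a level shift shows up as a one-period pulse, while a change in slope shows up as a step, which is the same pattern as in case 3, where a step in the differenced model corresponds to a trend in the original series.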