Solved – ARIMA, adjustments and intervention analysis

adjustment, arima, intervention-analysis, time series

I have very little knowledge of time-series analysis (despite my master's in statistics, I only took an introductory course), but I am now facing a statistical problem that calls for exactly this kind of analysis, so I would really appreciate a helping hand.

In a nutshell, I have monthly sales per account. In only a few accounts (1 to 10 or so), a marketing project has been conducted (I call them the Affected accounts); the other accounts remain unaffected (the Control accounts). The question I have to answer is basically "did the project have an impact on the Affected sales (and here's the part that annoys me) compared to the Control sales?"
So, as I have time-series data (with seasonality), I guessed that I had to use PROC ARIMA (I'm using SAS, and its fantastic Time Series Forecasting System, which I have just discovered and which fits and displays a model automatically). Here are my questions:

  • How is an adjustment variable accounted for in an ARIMA model? Making an educated guess from the linear regression model, I assume I just have to enter the variable in the INPUT, as an independent / explanatory variable?
    Also, if I enter my variable "Control_Sales" like that, is my adjustment correct? Shouldn't I first transform my dependent variable, for example redefining it as (Affected_Sales / Control_Sales) or (Affected_Sales - Control_Sales)?

  • I also have a series of questions about the intervention analysis, because the projects are never a straightforward pulse intervention and I cannot really pin them down to a precise type. Here's the best explanation I can provide: the projects, as I said, are carried out in only a very small portion of the accounts. They can be regarded as marketing campaigns that last several months, with a start date and an implementation (= end) date. But even among my supervisors there is disagreement about when we should start assessing an effect on sales: sometimes sales start to increase during the campaign, but most of the time the effect is seen after the end of the project, and not straight after, perhaps 2 to 9 months later… So I'm getting very confused about how to model the type of intervention (a ramp? several pulses?), and I am wondering whether it wouldn't be better to perform an Intervention Detection, to see where the impact actually took place, and then compare it in the discussion part of my analysis with the reported, "theoretical" implementation date. I believe this idea is the most appealing, since nobody is able to be clear about the intervention start point (statistically speaking). What do you think? How would you proceed?
    But here comes the most challenging part: I have no clue whatsoever how to perform an Intervention Detection…
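
To make the candidate intervention shapes concrete, here is a minimal sketch in Python (not SAS) of how a pulse, a step and a ramp could be coded as regressor columns, with an optional delay for an effect that only shows up some months after the campaign ends. The function names, the 12-month window, the campaign end at month index 4 and the 2-month delay are all hypothetical choices for illustration, not values taken from the data.

```python
def pulse(n, start):
    """1 in the intervention month only, 0 elsewhere (one-off shock)."""
    return [1 if t == start else 0 for t in range(n)]

def step(n, start):
    """0 before the intervention month, 1 from it onwards (permanent level shift)."""
    return [1 if t >= start else 0 for t in range(n)]

def ramp(n, start):
    """0 before the intervention month, then 1, 2, 3, ... (gradual build-up)."""
    return [max(0, t - start + 1) for t in range(n)]

def delayed(regressor, lag):
    """Shift a regressor `lag` months later, padding the front with zeros."""
    return [0] * lag + list(regressor[:len(regressor) - lag])

# Hypothetical setting: 12 months of data, campaign ends at month index 4,
# and the effect is assumed to start 2 months after the end.
n_months, campaign_end, delay = 12, 4, 2
for name, column in [
    ("pulse", pulse(n_months, campaign_end)),
    ("step", step(n_months, campaign_end)),
    ("ramp", ramp(n_months, campaign_end)),
    ("delayed step", delayed(step(n_months, campaign_end), delay)),
]:
    print(f"{name:>12}: {column}")
```

Each column would then be one candidate input series; trying several delays and comparing the fits is one rough way to explore where the effect starts when the start point is not known in advance.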

I could really do with some advice!

PS. I also meant to say that I am happy to provide the data if it can help with the discussion.


Update

So here is the data

Here is my data, if that can help.
Ideally I would stick to ARIMA (or another, better-fitting time-series model) because I am not at all comfortable with this field and this is something I am currently capable of doing (I don't know anything about VAR or Granger causality).

Moreover, I have only collected the sales from Jun-13 onwards, as I had set myself a 6-month limit for the "sales before" period, but I can get figures back to 2011… I don't know, however, how relevant it is to include them in the analysis, or whether that could "dilute" the impact of the intervention?


Update 2

Data from 2011 onwards

    Date    Sales_nat   Sales_affected
    Jan-11  13535.04614 10564.2
    Feb-11  12255.18701 6338.52
    Mar-11  15504.88513 16902.72
    Apr-11  13259.76914 14085.6
    May-11  15967.85091 13381.32
    Jun-11  15351.15898 9859.92
    Jul-11  16001.81365 16902.72
    Aug-11  20151.51071 23692.09
    Sep-11  21533.29437 30507.47
    Oct-11  21122.32893 19242.99
    Nov-11  22350.66487 25579.51
    Dec-11  21707.95193 15019.31
    Jan-12  23225.30391 28394.63
    Feb-12  22782.53005 23466.67
    Mar-12  24346.6397  30030.61
    Apr-12  23093.62005 21361.83
    May-12  26336.53924 22530.96
    Jun-12  22695.90797 18770.13
    Jul-12  26825.00843 21824.68
    Aug-12  26202.68137 23225.24
    Sep-12  24917.01741 23929.52
    Oct-12  30170.13777 20649.55
    Nov-12  28223.8397  27215.49
    Dec-12  28165.34954 19713.84
    Jan-13  30716.72604 20182.69
    Feb-13  26684.0364  28867.48
    Mar-13  28277.98102 15019.31
    Apr-13  30858.22655 28159.2
    May-13  31845.99066 24871.22
    Jun-13  29444.00066 31425.17
    Jul-13  34896.64914 41989.37
    Aug-13  30925.31929 23945.52
    Sep-13  32861.02081 25577.51
    Oct-13  34976.22452 10795.63
    Nov-13  32547.14685 31668.6
    Dec-13  38381.84523 35435.43
    Jan-14  40211.59741 4221.68
    Feb-14  31925.53772 29569.76
    Mar-14  37865.2967  49251.59
    Apr-14  39391.11061 28865.48
    May-14  35614.36797 7042.8
    Jun-14  41398.94482 37316.84

Best Answer

There are a few ways I would go about answering this question. In my opinion, I would hold off on using an ARIMA model; it is used more for forecasting than for determining how one variable affects another.

From here I would start off simply by computing some correlations to see how the series relate to one another. You can use dummy variables for the controlled sales; e.g., if you wanted to see how stock market returns perform during a specific month, you would assign a 1 to that month and 0s to the remaining months.
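
As a rough illustration of that first step, the sketch below computes a Pearson correlation between the two series and between the affected sales and a 0/1 dummy. It is written in plain Python rather than SAS. The six sales values are the Jan-14 to Jun-14 rows copied from the question's table; the dummy column is a purely hypothetical flag (the question does not say which months were campaign months), and `pearson` is just a helper name chosen here.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Jan-14 .. Jun-14, copied from the table in the question
sales_nat = [40211.59741, 31925.53772, 37865.2967,
             39391.11061, 35614.36797, 41398.94482]
sales_affected = [4221.68, 29569.76, 49251.59,
                  28865.48, 7042.8, 37316.84]

# Hypothetical dummy: 1 for months flagged as "after the campaign", 0 otherwise.
# The flagged months are an assumption made only to show the mechanics.
after_campaign = [0, 0, 0, 1, 1, 1]

print("corr(national, affected):", pearson(sales_nat, sales_affected))
print("corr(dummy, affected):   ", pearson(after_campaign, sales_affected))
```

With only a handful of months, such correlations are very noisy, and both series trend upwards over 2011 to 2014, so a raw correlation on the full data would partly reflect the shared trend rather than any campaign effect.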

More specifically for your problem, I would use a Vector Autoregression (VAR) and test for Granger causality. This allows you to see whether variable "x" affects variable "y" in the predictive sense, i.e., whether past values of "x" help predict "y" beyond what past values of "y" already do. I am not familiar with SAS so I cannot provide an exact code example.
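
Not SAS, but the idea behind a Granger test can be sketched in dependency-free Python. This is a simplified single-equation version, not a full VAR: it regresses y on its own lag(s), then on its own lag(s) plus the lag(s) of x, and compares the residual sums of squares with an F statistic. The helper names `ols_rss` and `granger_f` are made up for this sketch, the demo series are synthetic (seeded random numbers, not the sales data), and no p-value is computed. In practice one would more likely rely on a library routine (statsmodels, for instance, offers `grangercausalitytests`).

```python
import random

def ols_rss(X, y):
    """Residual sum of squares of the least-squares fit of y on the columns of X."""
    k = len(X[0])
    # Augmented normal equations [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

def granger_f(x, y, lag=1):
    """F statistic for H0: lags of x add nothing to predicting y beyond lags of y."""
    times = range(lag, len(y))
    target = [y[t] for t in times]
    restricted = [[1.0] + [y[t - j] for j in range(1, lag + 1)] for t in times]
    unrestricted = [row + [x[t - j] for j in range(1, lag + 1)]
                    for row, t in zip(restricted, times)]
    rss_r = ols_rss(restricted, target)
    rss_u = ols_rss(unrestricted, target)
    n, k = len(target), 1 + 2 * lag
    return ((rss_r - rss_u) / lag) / (rss_u / (n - k))

# Synthetic demo: y is driven by last period's x, so x should "Granger-cause" y.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(60)]
y = [0.0]
for t in range(1, 60):
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + random.gauss(0, 0.1))

print("F (x -> y):", granger_f(x, y))
print("F (y -> x):", granger_f(y, x))
```

A large F in one direction and a small one in the other is the pattern one would look for. With trending series such as the sales data, the series would normally be differenced or detrended first, since the test assumes stationarity.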

I know this does not completely answer your question, but hopefully it will get you moving in the right direction.