When attempting to detect cross-correlation between two time series, the first thing you should do is make sure the time series are stationary (i.e. have a constant mean, variance, and autocorrelation).
This matters because a correlation measures a linear relationship between two variables. When both series contain a trend, the trend alone can produce a high correlation even if the series are otherwise unrelated (a spurious correlation), which makes it hard to tell a genuine relationship from one that arises by chance.
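To illustrate the point, here is a minimal sketch with synthetic data. The slopes, the seed and the series length are arbitrary assumptions; the two series are built to share nothing except an upward trend.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, for reproducibility only
t = np.arange(200)

# Two series whose noise is independent; only the upward trend is shared.
x = 0.5 * t + rng.standard_normal(200)
y = 0.3 * t + rng.standard_normal(200)

# Correlation of the raw (trending) levels.
r_levels = np.corrcoef(x, y)[0, 1]

# Correlation after first-differencing, which removes the linear trend.
r_diffs = np.corrcoef(np.diff(x), np.diff(y))[0, 1]

print(f"levels: {r_levels:.3f}, first differences: {r_diffs:.3f}")
```

The correlation of the levels should come out close to 1, while the correlation of the differenced series should be close to 0, which is the more honest description of how the two are related.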
First, use the augmented Dickey-Fuller test to screen for stationarity (it would help if you specified the software package you are using; I am using Python with statsmodels in this instance). Suppose you have two time series x and y:
import statsmodels.tsa.stattools as ts
xdf = ts.adfuller(x, 1)  # second argument is maxlag
ydf = ts.adfuller(y, 1)
Here's some sample output:
xdf
(-3.0704779047168596, 0.028816508715839483, 0, 106, {'1%': -3.4936021509366793, '5%': -2.8892174239808703, '10%': -2.58153320754717}, -723.247574137278)
ydf
(-2.949959856756157, 0.03983919029636401, 1, 105, {'1%': -3.4942202045135513, '5%': -2.889485291005291, '10%': -2.5816762131519275}, -815.3639322514784)
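Each result is a tuple, and it is easier to read once unpacked. The sketch below copies the sample output for x into a literal (so it runs without statsmodels); the variable names are my own choice, not part of the library.

```python
# Sample output for x, copied from above.
xdf = (-3.0704779047168596, 0.028816508715839483, 0, 106,
       {'1%': -3.4936021509366793, '5%': -2.8892174239808703,
        '10%': -2.58153320754717}, -723.247574137278)

# adfuller returns: test statistic, p-value, number of lags used,
# number of observations used, critical values, best information criterion.
adf_stat, p_value, used_lag, n_obs, crit_values, icbest = xdf

# The null hypothesis is a unit root (non-stationarity).
reject_at_5pct = p_value < 0.05
reject_at_1pct = adf_stat < crit_values['1%']

print(f"ADF statistic = {adf_stat:.3f}, p-value = {p_value:.3f}")
print("reject unit root at 5%:", reject_at_5pct)
print("reject unit root at 1%:", reject_at_1pct)
```

For this sample the test statistic lies between the 1% and 5% critical values, so the unit-root null is rejected at the 5% level but not at the 1% level.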
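In this instance, both p-values (the second element of each tuple, roughly 0.029 and 0.040) are below 0.05, so we reject the null hypothesis of a unit root at the 5% level and the series do not need to be differenced. Had either p-value been above 0.05, it would have been necessary to difference that series before going further. The following tutorial might help you.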
Now, it is a matter of calculating the cross-correlation between x and y, and generating the lags:
import numpy as np

# Calculate correlations
cc1 = np.correlate(x - x.mean(), y - y.mean())[0]  # Remove means; zero-lag sum of products
cc1 /= (len(x) * x.std() * y.std())  # Normalise by number of points and product of standard deviations
cc2 = np.corrcoef(x, y)[0, 1]  # Pearson correlation, should match cc1
print(cc1, cc2)
Upon obtaining the cross-correlation coefficient, the lags can be generated and the autocorrelations calculated:
# Generating lags
lg = 108
x = np.random.randn(lg)
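For reference, here is a minimal sketch of one way to compute the cross-correlation over a range of lags with NumPy. It is not necessarily what the snippet above goes on to do: the helper name cross_correlation, the max_lag parameter and the synthetic data (x built by shifting y three steps) are all assumptions made for illustration.

```python
import numpy as np

def cross_correlation(x, y, max_lag):
    """Return (lags, ccf), where ccf at lag k estimates corr(x[t + k], y[t])."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Standardise both series so that the lag-0 value is the Pearson correlation.
    xs = (x - x.mean()) / x.std()
    ys = (y - y.mean()) / y.std()
    # mode="full" returns every lag from -(n - 1) to n - 1.
    full = np.correlate(xs, ys, mode="full") / n
    lags = np.arange(-(n - 1), n)
    keep = np.abs(lags) <= max_lag
    return lags[keep], full[keep]

# Synthetic example: x is y delayed by 3 steps, plus a little noise.
rng = np.random.default_rng(0)
y = rng.standard_normal(200)
x = np.roll(y, 3) + 0.1 * rng.standard_normal(200)

lags, ccf = cross_correlation(x, y, max_lag=10)
best_lag = lags[np.argmax(ccf)]
print("lag with the largest cross-correlation:", best_lag)
```

With this sign convention, a peak at a positive lag k means that x at time t + k lines up with y at time t, i.e. y leads x by k steps.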
Best Answer
The actual date/time/channel is an observation (a transaction). A time series is a bucketing of transactions. For each type of advert, I would take the time of each advert and bucket the adverts into hours to create a number of possibly "causal" time series. The impact of an advert may depend on the hour of the day, the day of the week, or whether it falls on, or even "nearly" on, a holiday. Since you only have 10 days, it is not feasible to try to estimate daily or holiday effects.
The cross-correlation between each of these discrete advert time series and your subscription series can be computed (as a descriptive statistic) but shouldn't be, as your predictor variables are discrete counts and your subscription data may be autocorrelated. I would create 23 predictor series reflecting the hour of the day and include these, along with the advert time series computed above, in an ARMAX model.
Care should be taken to identify and deal with any subscription readings that reflect pulses, level shifts, time trends or seasonal pulses, as these would be attributable to omitted variables that you had not controlled for. Hope this helps.
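As a rough sketch of the bucketing and the hour-of-day predictors described above, assuming advert times are expressed in hours since the start of the 10-day window (the advert_times values and all variable names are made up for illustration):

```python
import numpy as np

n_hours = 10 * 24  # ten days of hourly buckets

# Hypothetical advert airing times for one advert type, in hours since the start.
advert_times = np.array([0.5, 0.9, 13.2, 37.75, 37.8, 120.0, 239.99])

# Bucket the individual transactions into an hourly count series.
advert_counts = np.bincount(advert_times.astype(int), minlength=n_hours)

# 23 hour-of-day indicator series; hour 0 is the baseline and gets no column.
hour_of_day = np.arange(n_hours) % 24
hour_dummies = (hour_of_day[:, None] == np.arange(1, 24)).astype(int)

# Exogenous regressors for an ARMAX-type model: advert counts plus hour dummies.
exog = np.column_stack([advert_counts, hour_dummies])
print(exog.shape)
```

A matrix like exog is the kind of input an ARMAX routine expects alongside the hourly subscription series (for example the exog argument of statsmodels' SARIMAX); with several advert types there would be one count column per type.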