I'm new to R, statistics, and time series, so I would really appreciate any help, explanations, or suggestions for good reading on the topic.
I have two time series representing the number of times a word, say "election", was used on Twitter and the number of times it was used on television, each day over a month.
I want to find the correlation between the two, but honestly I really don't know what to do. I was going to use ccf() in R directly, but after reading some other questions here, I don't think this would be correct.
Are there any requirements or necessary conditions on the series for finding the correlation with ccf()? Is it the right type of correlation coefficient? What steps do I need to perform to make it work?
Best Answer
Your very straightforward question unfortunately has both a simple and a complex answer. I will avoid the simple one. In summary, the whole idea is that one needs to account for (condition on) intra-correlation, that is, the autocorrelation within each series, while identifying the inter-correlation between the two series. Following are some references that you might consider before attempting to proceed. The first is an easy overview:
ARIMAX model's exogenous components?
This reference explains why you should beware of simple solutions that may be routinely available:
http://empslocal.ex.ac.uk/people/staff/dbs202/cat/stats/corr.html
This outlines a general procedure, which is far from general as it doesn't deal with violations of the Gaussian assumptions:
https://onlinecourses.science.psu.edu/stat510/node/75
This provides a gentle overview of regression vs simple ARIMA time series methods.
http://www.autobox.com/cms/index.php/afs-university/intro-to-forecasting?start=5
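To make the idea of conditioning on intra-correlation before looking at inter-correlation concrete, here is a minimal prewhitening sketch in base R. Everything in it is an assumption for illustration: the data are simulated stand-ins for the Twitter and television counts, the AR(1) model for `x` is simply assumed rather than identified, and names such as `x_white` and `y_filtered` are made up. It also uses 200 points so the pattern is visible; with only about 30 daily observations the estimates would be much noisier.

```r
# Simulated stand-ins for the two daily series (assumption: not real data).
set.seed(1)
n <- 200
x <- as.numeric(arima.sim(model = list(ar = 0.7), n = n))      # "Twitter" series
noise <- as.numeric(arima.sim(model = list(ar = 0.5), n = n))  # autocorrelated noise
# By construction, y follows x with a delay of 2 time steps.
y <- c(rep(0, 2), 0.8 * x[1:(n - 2)]) + noise                  # "TV" series

# Step 1: model the intra-correlation of x (AR(1) assumed here) and
# keep the residuals, which should be roughly white noise.
fit_x <- arima(x, order = c(1, 0, 0))
x_white <- residuals(fit_x)

# Step 2: apply the same AR(1) filter to y: y[t] - phi * y[t - 1].
phi <- unname(coef(fit_x)["ar1"])
y_filtered <- stats::filter(y - mean(y), filter = c(1, -phi), sides = 1)

# Step 3: cross-correlate the prewhitened x with the filtered y.
keep <- !is.na(y_filtered)   # the first filtered value is NA
cc <- ccf(x_white[keep], y_filtered[keep], lag.max = 10, plot = FALSE)

# Lag with the largest absolute cross-correlation.
best_lag <- cc$lag[which.max(abs(cc$acf))]
print(best_lag)
```

In R, the lag `k` value of `ccf(x, y)` estimates the correlation between `x[t + k]` and `y[t]`, so a spike at a negative lag suggests that `x` leads `y`. Running `ccf(x, y)` on the raw series instead tends to smear that spike across many lags, which is the problem the references above warn about.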
Lastly, arm yourself with data and try different approaches.
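As one example of a different approach, here is a hedged sketch of a regression with autocorrelated errors, fitted with `arima()` and its `xreg` argument. Again the data are simulated, and the AR(1) error structure, the purely same-day effect of `x` on `y`, and the column name `twitter` are assumptions chosen for illustration, not a recommendation for your data.

```r
# Simulated stand-ins (assumption: not real data).
set.seed(2)
n <- 200
x <- as.numeric(arima.sim(model = list(ar = 0.7), n = n))
# By construction, y depends on same-day x with slope 0.5, plus AR(1) errors.
y <- 0.5 * x + as.numeric(arima.sim(model = list(ar = 0.5), n = n))

# Regress y on x while modelling the errors as AR(1).
fit <- arima(y, order = c(1, 0, 0), xreg = cbind(twitter = x))

beta <- unname(coef(fit)["twitter"])                       # estimated slope
beta_se <- unname(sqrt(diag(fit$var.coef))["twitter"])     # its standard error
print(c(estimate = beta, std_error = beta_se))

# Check whether the residuals still show autocorrelation.
lb <- Box.test(residuals(fit), lag = 10, type = "Ljung-Box", fitdf = 1)
print(lb$p.value)
```

If the Ljung-Box test still rejects, the assumed error structure is probably too simple and a different ARMA order is worth trying.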
Hope this helps.