Solved – Could somebody explain to me what this ARIMA model output says

arima, predictive-models, spss

I've asked a few questions here before regarding my thesis. I do my best to follow up on your suggestions, but my statistical knowledge is limited. Adding a predictive model to the thesis is not required (since it isn't taught during the studies), but my thesis coach insists. So I've just let SPSS pick the best-fitting ARIMA model for my thesis.

Basically, I have taken some internet data (hbVol0LN is the number of tweets, hbBullQuality0 is the ratio of positive to negative tweets, etc.) for 100 companies over 103 days. Here, the dependent variable is the daily stock return of each of those 100 companies. I already performed an OLS regression (although it has been pointed out that this is not the ideal model for my research, it is accepted by my coach), but now I believe this ARIMA model should capture the predictive value of the data.

It is very hard to find annotated ARIMA output online, or a paper which describes the output in a way I can understand. Could you perhaps give me some insight into what this output is telling me? Any help at all is greatly appreciated.

If you're having a hard time reading the graph, here's the full-size one: http://i.stack.imgur.com/H42YP.png

Again, I cannot express how frustrating it is for somebody who has hardly had to do any statistics during his studies, having to produce a predictive model. Therefore, really, any help is appreciated.


Best Answer

My response to your other post, "How to perform pooled cross-sectional time series analysis?", detailed how to deal with panel data. You have been put in the difficult position of having to explain the output of SPSS's Expert Modeler, which in my opinion is inadequate for your analytical needs.

Using all of the data (10,300 observations, http://www.autobox.com/stack/pooled/dataset-irishstat.xls) to identify an appropriate XARMAX model leads to model over-specification due to the "false sample size", since the daily observations are not statistically independent of each other. I don't believe that the developers of Expert Modeler had your data set in mind. Additionally, since no outliers (pulses, level shifts, seasonal pulses, local time trends) are detected or incorporated, i.e. the deterministic structure is left unspecified, there are additional uncertainties (major questions!) about the final model.

I should mention that I am a developer/writer of AUTOBOX, which competes with SPSS, so my comments, while expert, may be biased, though I hope not.

SPSS attempts to draw conclusions about:

1) what differencing is appropriate for each series
2) what the delay is between the output and each of the input series
3) what the PDL/ADL lag structure is in terms of both fixed and dynamic effects
4) what the appropriate ARIMA structure is

ALL without any concern for unspecified deterministic structure. AUTOBOX actually details how the forecast can be decomposed to illustrate the impact of the predictor series.

After a review of the model form and coefficients, I conclude that their results are questionable, BUT I must reiterate that model identification is best done locally, i.e. for each company separately, and then tested for consistency across companies before concluding that a global estimate (your coefficients) has any meaning.

One more thing: the MA(7) structure probably reflects a day-of-the-week effect that has been omitted from the model and is only inadequately proxied by that term.
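To make the day-of-the-week point concrete, here is a minimal sketch on synthetic data (not the poster's data set). It assumes a hypothetical series of 103 daily values with a fixed bump on two days of each 7-day cycle, and shows that such an omitted deterministic effect shows up as strong lag-7 autocorrelation, which largely disappears once the per-weekday means are removed. The function names, the size of the bump, and the noise level are all assumptions made for illustration; this is not how SPSS or AUTOBOX handles the problem.

```python
import random
from statistics import fmean


def autocorr(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    mean = fmean(series)
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    numer = sum(dev[t] * dev[t - lag] for t in range(lag, len(dev)))
    return numer / denom


def remove_weekday_means(series):
    """Subtract from each value the mean of its position in the 7-day cycle."""
    means = [fmean(series[d::7]) for d in range(7)]
    return [x - means[t % 7] for t, x in enumerate(series)]


random.seed(0)
n_days = 103  # same length as the series in the question

# Hypothetical deterministic effect: two days per week are shifted upwards.
weekday_effect = [0.0, 0.0, 0.0, 0.0, 0.0, 3.0, 3.0]

# Synthetic series = weekday effect + independent noise (no ARIMA structure).
series = [weekday_effect[t % 7] + random.gauss(0, 1) for t in range(n_days)]
adjusted = remove_weekday_means(series)

raw_lag7 = autocorr(series, 7)
adj_lag7 = autocorr(adjusted, 7)

print(f"lag-7 autocorrelation, raw series:            {raw_lag7:.2f}")
print(f"lag-7 autocorrelation, weekday means removed: {adj_lag7:.2f}")
```

In this setup the noise is independent by construction, so any lag-7 correlation in the raw series comes purely from the weekday pattern. An automatic ARIMA search that is not told about the weekday pattern can only absorb it through a lag-7 term, which is one way a term like MA(7) can end up in the final model.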