First, is your subsetting statement mistyped? It appears you mean something like:
data.s<-data[1:528,]
data.s.g<-data.s[,1]
You might even want to show us a sample of your data (using `dput`), which would let us process it to get an answer more like what you're expecting -- though not using an ARIMA(1,1,1) model.
Second, it looks like you might be training your VAR on the entire data and then predicting the last part, while training your ARIMA and SS on only the first part of the data? (In addition to which, VAR has two time series to work with.)
Third, you're expecting too much of your ARIMA. (If you look into the internals of the `Arima` object returned by `auto.arima`, you can find the state space model that R uses under the hood: `arima.m$model`.) An AR(1) uses only the current data point to make its next prediction, which is not much information.
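As a rough illustration of both points, here is a minimal sketch using base R's `arima()` rather than `auto.arima`, so that no extra package is needed. The simulated series `y` and the names `fit`, `manual` and `pred` are made up for the example and stand in for your data and your `arima.m` object.

```r
set.seed(1)

# Simulated AR(1) series; a stand-in for the real data, which is not available here
y <- arima.sim(model = list(ar = 0.7), n = 200)

# Fit an AR(1) with a mean term
fit <- arima(y, order = c(1, 0, 0))

# The state space representation R uses internally is stored in fit$model
print(names(fit$model))

# One-step-ahead forecast rebuilt by hand from the last observation only
phi <- coef(fit)[["ar1"]]
mu  <- coef(fit)[["intercept"]]
manual <- mu + phi * (y[length(y)] - mu)

# The same forecast from predict()
pred <- predict(fit, n.ahead = 1)$pred[1]
print(c(manual = manual, predict = pred))
```

The two numbers should agree, which shows that nothing before the last data point enters an AR(1) forecast.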
`auto.arima` isn't magic. It knows nothing about your data and looks through a limited window of options. If you know more, like perhaps the data has a natural 100-period cycle, you can add that and get much better results.
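One possible way to add that kind of knowledge, sketched here under assumptions, is to pass a sine/cosine pair at the known period as external regressors. The data are simulated, the 100-period cycle is assumed known, and base `arima()` is used instead of `auto.arima`; all variable names are hypothetical.

```r
set.seed(2)
n <- 400
period <- 100                        # assumed known cycle length
tt <- seq_len(n)

# Simulated stand-in data: a 100-period cycle plus AR(1) noise
y <- 5 * sin(2 * pi * tt / period) + arima.sim(model = list(ar = 0.5), n = n)

# Encode the known cycle as a sine/cosine pair of regressors
X <- cbind(sin1 = sin(2 * pi * tt / period),
           cos1 = cos(2 * pi * tt / period))

# Same AR(1) error structure, without and with the cycle information
fit_plain <- arima(y, order = c(1, 0, 0), method = "ML")
fit_cycle <- arima(y, order = c(1, 0, 0), xreg = X, method = "ML")

# Lower AIC indicates the better trade-off between fit and complexity
aic <- c(plain = AIC(fit_plain), cycle = AIC(fit_cycle))
print(aic)
```

To forecast from `fit_cycle`, the same two regressors have to be computed for the future time points and supplied through the `newxreg` argument of `predict()`.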
Fourth, be careful that you've got your `dlm` model wired together correctly. It seems like there may be one more state than you think there is.
EDIT: Now that you've posted your data, it looks a lot like stock prices, which you're not going to predict with any canned methods.
Here is a preliminary list of disadvantages I was able to extract from your comments. Criticism and additions are very welcome!
Overall: compared to ARIMA, state-space models let you model more complex processes, have an interpretable structure, and easily handle data irregularities; but you pay for this with a more complex model, harder calibration, and less community knowledge.
- ARIMA is a universal approximator: you don't care what the true model behind your data is, and you use the standard ARIMA diagnostic and fitting tools to approximate it. It is like polynomial curve fitting: you don't care what the true function is, because you can always approximate it with a polynomial of some degree.
- State-space models naturally require you to write down some reasonable model for your process (which is good: you use your prior knowledge of the process to improve the estimates). Of course, if you have no idea what your process looks like, you can always use some universal state-space model as well, e.g. represent an ARIMA in state-space form. But then ARIMA in its original form is the more parsimonious formulation, since it does not introduce unnecessary hidden states.
- Because there is such a great variety of state-space model formulations (a much richer class than the ARIMA models), the behavior of all these potential models is not well studied, and if the model you formulate is complicated, it is hard to say how it will behave under different circumstances. Of course, if your state-space model is simple or composed of interpretable components, there is no such problem. But ARIMA is always the same well-studied ARIMA, so it should be easier to anticipate its behavior even if you use it to approximate some complex process.
- Because the state-space approach lets you model complex/nonlinear processes directly and exactly, for such models you may have problems with the stability of filtering/prediction (EKF/UKF divergence, particle filter degeneracy). You may also have problems calibrating the parameters of a complicated model: it is a computationally hard optimization problem. ARIMA is simple and has fewer parameters (1 noise source instead of 2, no hidden variables), so its calibration is simpler.
- For state-space models there is less community knowledge and software in the statistical community than for ARIMA.
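To make the points about hidden states and noise sources more concrete, here is a sketch of one common way (not the only one) to write an ARMA(1,1) process in state-space form, followed by the reverse direction for a local level model. The symbols $\phi$, $\theta$, $\varepsilon_t$, $\eta_t$, $\mu_t$ and the state vector $\alpha_t$ are notation introduced for this illustration.

```latex
% ARMA(1,1) in its original form: two coefficients, one noise source, no hidden state
y_t = \phi\, y_{t-1} + \varepsilon_t + \theta\, \varepsilon_{t-1}

% An equivalent state-space form with a two-dimensional state
\alpha_t = \begin{pmatrix} y_t \\ \theta\, \varepsilon_t \end{pmatrix},
\qquad
\alpha_{t+1} = \begin{pmatrix} \phi & 1 \\ 0 & 0 \end{pmatrix} \alpha_t
             + \begin{pmatrix} 1 \\ \theta \end{pmatrix} \varepsilon_{t+1},
\qquad
y_t = \begin{pmatrix} 1 & 0 \end{pmatrix} \alpha_t

% Local level model: two noise sources, one hidden state
y_t = \mu_t + \varepsilon_t, \qquad \mu_{t+1} = \mu_t + \eta_t

% Differencing removes the hidden level and leaves an MA(1) structure, i.e. ARIMA(0,1,1)
\Delta y_t = \eta_{t-1} + \varepsilon_t - \varepsilon_{t-1}
```

The first row of the transition equation reproduces the ARMA recursion, $y_{t+1} = \phi\, y_t + \theta\, \varepsilon_t + \varepsilon_{t+1}$, so the second state component exists only to carry the moving-average term forward. In the local level case, $\Delta y_t$ is autocorrelated only at lag 1, so the two noise variances collapse into a single MA coefficient and a single innovation variance.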
Best Answer
I did not see your question before.
Yes, dynamic factor analysis can be seen as a particular case of a state-space model. It makes the observations depend on a low-dimensional state vector (low relative to the dimension of the observation vector). So it is the same idea as in ordinary factor analysis, plus time dependence.
The "factors" may have any time dynamics. Several R packages, if you use R, will let you specify a general dynamic factor analysis model, including for instance `dlm` or `KFAS`.
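As a sketch of what such a model can look like, here is a dynamic factor model written in state-space form. The first-order vector autoregression for the factors is only one possible choice of dynamics, assumed here for illustration, and the symbols $\Lambda$, $A$, $R$, $Q$ are notation introduced for this example.

```latex
% Observation equation: n observed series driven by k factors, with k much smaller than n
y_t = \Lambda f_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathcal{N}(0, R),
\qquad y_t \in \mathbb{R}^n,\; f_t \in \mathbb{R}^k,\; k \ll n

% State equation: factor dynamics (a VAR(1) is assumed here)
f_t = A f_{t-1} + \eta_t, \qquad \eta_t \sim \mathcal{N}(0, Q)
```

Setting $A = 0$ removes the time dependence and gives back the ordinary factor analysis model, with $\Lambda$ playing the role of the loading matrix.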