Solved – Using Regression Trees for Univariate Time Series Data

cart, regression, time series

I have a monthly time series (105 observations) including trend and seasonality and want to forecast the numeric values.

I initially tested with the Box-Jenkins approach and other univariate models like Facebook Prophet.

Now I want to extend to multivariate models, so I implemented a Regression Tree (scikit-learn, Decision Tree Regression).
I split my dataset into train and test data (89:16 observations). The most recent part of the time series is the test set.
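
A chronological split of this kind can be sketched as follows. The placeholder values and the names `series`, `train`, and `test` are illustrative only, not my actual data or code.

```python
# Placeholder values standing in for the 105 monthly observations (assumption).
series = [float(i) for i in range(105)]

n_test = 16  # hold out the 16 most recent months as the test part

# No shuffling: the training part is strictly older than the test part.
train, test = series[:-n_test], series[-n_test:]

print(len(train), len(test))  # 89 16
```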

To my surprise this Regression Tree worked very well on my test data, without extending it with other features. My dataframe consisted only of the time series and the index number.

My questions: Why does this Regression Tree work so well with only the time series as the data input? Is there an autoregressive component included, like in the Box-Jenkins models? I thought this model class requires other features as input. Or is the index alone a valuable input for the Regression Tree?

Best Answer

I solved my issue: if you train a CART tree on nothing but the time series itself (univariate) and validate the model on the test part of the same series (also univariate), you will get a pretty low error, because the target value is itself part of the input.

The problem is that for actual forecasting you need independent variables as input, not the target value of the time series itself. Otherwise you are trying to forecast from data you don't have yet. That doesn't work.
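
To illustrate the point, here is a minimal sketch on synthetic data, not my real series. The generated series, the seed, the helper `holdout_mae`, and the calendar-month feature are all assumptions made for the example. It compares a tree that sees the target as an input column with trees that only see inputs you would actually know for future months.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the monthly series (assumption): mild trend,
# yearly seasonality, and noise. 105 observations, the last 16 held out.
rng = np.random.default_rng(0)
n_obs, n_test = 105, 16
t = np.arange(n_obs)
y = 0.1 * t + 10.0 * np.sin(2.0 * np.pi * t / 12.0) + rng.normal(0.0, 1.0, n_obs)
y_train, y_test = y[:-n_test], y[-n_test:]


def holdout_mae(features):
    """Fit a tree on the first 89 rows; return (MAE, predictions) on the last 16."""
    tree = DecisionTreeRegressor(random_state=0)
    tree.fit(features[:-n_test], y_train)
    predictions = tree.predict(features[-n_test:])
    return float(np.mean(np.abs(y_test - predictions))), predictions


# Leaky setup: the whole dataframe (index AND the series itself) is the input.
mae_leaky, _ = holdout_mae(np.column_stack([t, y]))

# Index only: every test index lies beyond the training range, so all test
# rows end up in the same leaf and receive one constant prediction.
mae_index, index_pred = holdout_mae(t.reshape(-1, 1))

# Index plus calendar month: an input that is known in advance for future dates.
mae_calendar, _ = holdout_mae(np.column_stack([t, t % 12]))

print(f"index + target (leaky): MAE = {mae_leaky:.2f}")
print(f"index only:             MAE = {mae_index:.2f}")
print(f"index + calendar month: MAE = {mae_calendar:.2f}")
print("distinct index-only predictions:", np.unique(index_pred).size)
```

The leaky variant looks good only because the test rows already contain the values to be predicted, which is not the case for months that have not happened yet. The index-only tree cannot extrapolate past the last training index, so it repeats a single value. How much a calendar feature helps depends on the series, so treat that number as indicative at best.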
