Solved – Classification (regression) with rolling window for time series-type data

classification, cross-validation, machine learning, python, time series

This is a conceptual question rather than a technical one. I am interested in performing a rolling (sliding) window analysis, where I aim to predict the label ('0' or '1') of the next value of my time series. For example, consider the time-series data and the array of labels:
(I work with Python and sklearn)

ts  = array([11, 15,  3, 18,  6, 10,  9, 25,  7, 15])
lab = array([ 0,  0,  0,  1,  1,  0,  0,  1,  0,  1])

What I am doing is trying to learn a function 'F', which maps the input features (extracted from a window of the 'ts' array) to the binary labels 'lab': F(feat(ts)) -> lab. Conceptually, the problem is equivalent to labeling each size-k window with the label of the next timestamp:

F(ts[i:i+3]) -> lab[i+3]

Practically: Consider a window of size 3, then we get:

F([11,15, 3]) -> y = 1
F([15, 3,18]) -> y = 1
F([ 3,18, 6]) -> y = 0
F([18, 6,10]) -> y = 0
F([ 6,10, 9]) -> y = 1
F([10, 9,25]) -> y = 0
F([ 9,25, 7]) -> y = 1

(the final window is unused, since there is no subsequent label for it). The real time series is much longer.
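The window construction above can be sketched as follows (a minimal example using numpy; variable names are my own):

```python
import numpy as np

ts  = np.array([11, 15,  3, 18,  6, 10,  9, 25,  7, 15])
lab = np.array([ 0,  0,  0,  1,  1,  0,  0,  1,  0,  1])

k = 3  # window size
# Each row of X is a length-k window of ts; y is the label of the
# timestamp immediately after that window: F(ts[i:i+k]) -> lab[i+k].
X = np.array([ts[i:i + k] for i in range(len(ts) - k)])
y = lab[k:]

print(X)  # 7 windows of length 3
print(y)  # [1 1 0 0 1 0 1]
```

This reproduces the seven (window, label) pairs listed above; the final window `[25, 7, 15]` is dropped because there is no label beyond it.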

QUESTION: For training, testing and cross-validation, may I (pretend and) use my instances as i.i.d? What I mean is: can I randomly divide the instances to training, validation and test sets?

Of course they are not i.i.d., but when I naively tried to process my data and learn a classifier (simple logistic regression), it surprisingly worked very well, and I got quite reasonable results on the classification metrics.
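For concreteness, the "naive" i.i.d. treatment I describe looks roughly like this (a sketch on synthetic data; the label rule `next value > last value of the window` is purely hypothetical, standing in for my real labels):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real, much longer series.
rng = np.random.default_rng(0)
ts = rng.integers(1, 30, size=200)

k = 3
X = np.array([ts[i:i + k] for i in range(len(ts) - k)])
# Hypothetical labels: 1 if the next value exceeds the window's last value.
y = (ts[k:] > ts[k - 1:-1]).astype(int)

# The naive step: shuffle the windows into train/test as if they were i.i.d.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
```

Overlapping windows can appear in both the training and test sets under this split, which is exactly why the i.i.d. assumption is questionable here.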

EDIT

I'm reading the paper A Note on the Validity of Cross-Validation for Evaluating Time Series Prediction, and the authors clearly state that:
"…CV can and should be used without modification, as in the independent case."

Best Answer

How you divide your data set into training/test depends on the data you have available and how your model will be used. Ideally you wouldn't randomly separate the time-points, since as you say, they are not independent if there is any temporal signal at all.

If you have multiple time-series then I'd divide the time-series themselves into training and test in whatever fashion you want.

If your training data is a single time series and you intend to predict future values of that series, then I'd segment it accordingly. That is, use the first 60% of the samples as your training data and the remaining 40% as your test set. Of course, these sets aren't independent, but given the nature of your data this is unavoidable.
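A minimal sketch of such a chronological split (synthetic data; the label rule is hypothetical, as before):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic single time series.
rng = np.random.default_rng(1)
ts = rng.integers(1, 30, size=200)

k = 3
X = np.array([ts[i:i + k] for i in range(len(ts) - k)])
y = (ts[k:] > ts[k - 1:-1]).astype(int)  # hypothetical labels

# Chronological split: first 60% for training, last 40% for testing,
# with no shuffling, so the test set lies strictly in the "future".
split = int(0.6 * len(X))
X_tr, y_tr = X[:split], y[:split]
X_te, y_te = X[split:], y[split:]

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
```

If you also want cross-validation that respects temporal order, scikit-learn provides `model_selection.TimeSeriesSplit`, which always trains on earlier folds and tests on later ones.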

If you have a single time-series for training but the actual time-series that you want to predict future values for is entirely separate then I'd still follow the procedure from the above paragraph, but bear in mind that any estimates of model fit you derive are very likely to be inaccurate.

As an aside, I would be tempted to use a Recurrent Neural Network approach for a problem like this. It would allow you to model the temporal aspect elegantly: something like an LSTM can maintain a memory of previous values without your having to explicitly specify a window size. Of course, if you wish to use a window approach, then you could in theory use any classification algorithm you want.

EDIT

That technical paper looks to have covered the issue in far more depth than my answer, so I'd follow their recommendations instead. The main difference from your use case is that they deal with standard forecasting of the next sample, rather than classification.