Solved – Prediction using CRFs for time series data

prediction, time series

I am unsure about the validity of some predictions I am making with a CRF model I have trained.

The CRF model is trained on some input time series, and when making predictions I pass the entire sequence to be labelled to the model. I am concerned that this is incorrect for my problem: what I want is a prediction for each observation using only the data up to that point (as if it were arriving in real time), but I suspect the predictions are being made using the entire sequence (mainly because of references I see to a forward-backward algorithm for inference).

Do I need to pass only the sequence up to "what has currently been observed" in order to get a causally predicted label for that point in time, or is there such a thing as "forward only" prediction for such scenarios?

Best Answer

You want to do time-series prediction, but CRFs are used for sequential supervised learning.

There are two key differences between time-series prediction and sequential supervised learning.

First, in sequential supervised learning, the entire sequence x1, ..., xT is available before we make any predictions of the y values, whereas in time-series prediction, we have only a prefix of the sequence up to the current time t + 1.

Second, in time-series analysis, we have the true observed y values up to time t, whereas in sequential supervised learning, we are not given any y values and we must predict them all.

(Taken from "Machine Learning for Sequential Data: A Review" by Thomas G. Dietterich.)
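To make the "forward only" idea from the question concrete, here is a minimal sketch rather than a reference implementation. It assumes a linear-chain CRF whose per-step label scores depend only on the current observation. The arrays `unary` and `trans` are hypothetical stand-ins for scores a trained model would produce, and the random values are purely illustrative. Under that assumption, the normalized forward message at step t is the last-position marginal you would get by running the model on the prefix x1, ..., xt, while forward-backward gives marginals conditioned on the whole sequence.

```python
import numpy as np

def logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def forward_messages(unary, trans):
    # alpha[t, j]: log-sum of scores of all label paths over steps 0..t ending in label j.
    T, K = unary.shape
    alpha = np.empty((T, K))
    alpha[0] = unary[0]
    for t in range(1, T):
        alpha[t] = unary[t] + logsumexp(alpha[t - 1][:, None] + trans, axis=0)
    return alpha

def backward_messages(unary, trans):
    # beta[t, i]: log-sum of scores of all label paths over steps t+1..T-1, given label i at t.
    T, K = unary.shape
    beta = np.zeros((T, K))
    for t in range(T - 2, -1, -1):
        beta[t] = logsumexp(trans + (unary[t + 1] + beta[t + 1])[None, :], axis=1)
    return beta

def normalize(log_scores):
    # Turn each row of log-scores into a probability distribution.
    return np.exp(log_scores - logsumexp(log_scores, axis=1)[:, None])

rng = np.random.default_rng(0)
T, K = 6, 3
unary = rng.normal(size=(T, K))  # hypothetical per-step label scores
trans = rng.normal(size=(K, K))  # hypothetical transition scores, trans[i, j] for i -> j

alpha = forward_messages(unary, trans)
beta = backward_messages(unary, trans)

filtered = normalize(alpha)         # uses x_1..x_t only (forward pass)
smoothed = normalize(alpha + beta)  # uses x_1..x_T (forward-backward)

causal_labels = filtered.argmax(axis=1)
offline_labels = smoothed.argmax(axis=1)
```

The two sets of marginals agree at the final step, where there is no future left to look at, and can differ at every earlier step. If your feature functions look at a window of future observations, the forward pass alone is no longer causal and you would have to restrict the features as well.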

Related Solutions

Solved – Hourly predictions using time series

Well, the difference is... that they are different methods. ("Can any one explain the difference between apples and oranges?")

ARIMA models are explained in any introductory time series book. (I'll never tire of recommending this free open source online forecasting textbook.) If you want to include weather info, you'd need ARIMA models with eXplanatory or eXternal information, or ARIMAX models. These are also standard.
Trees/CARTs/Random Forests are explained in any Data Science textbook, or even the Wikipedia pages. These will, of course, model explanatory variables "as-is". Your idea of using days, hours and months as features does make sense in this context. However, simply feeding independent dummies for "9-10am", "10-11am" and so forth into your model may or may not account for the fact that your observations in the 9-10am and the 10-11am time buckets will be more highly correlated than the ones in the 9-10am and the 1-2pm buckets.
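The point about hour-of-day dummies can be illustrated with a small sketch. Note that the sine/cosine encoding shown here is not something proposed above. It is just one common alternative, swapped in to show the contrast. With independent dummies every pair of hours is equally far apart in feature space, so the model gets no hint that 9-10am is "next to" 10-11am.

```python
import numpy as np

hours = np.arange(24)

# One dummy column per hourly bucket.
one_hot = np.eye(24)[hours]

# Cyclical encoding (an assumption for illustration): place each hour on a circle.
angle = 2 * np.pi * hours / 24
cyclic = np.column_stack([np.sin(angle), np.cos(angle)])

def dist(encoding, hour_a, hour_b):
    # Euclidean distance between the feature vectors of two hours.
    return float(np.linalg.norm(encoding[hour_a] - encoding[hour_b]))

print("dummies 9 vs 10:", dist(one_hot, 9, 10))
print("dummies 9 vs 13:", dist(one_hot, 9, 13))
print("cyclic  9 vs 10:", dist(cyclic, 9, 10))
print("cyclic  9 vs 13:", dist(cyclic, 9, 13))
```

With dummies the two distances are identical. With the cyclical encoding the adjacent buckets are closer than the distant ones, and 11pm ends up next to midnight. Whether this matters in practice depends on the model: tree-based methods split on individual features and may cope with either representation.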

A couple of random thoughts:

ARIMA(X) will have a hard time dealing with the multiple seasonalities involved (year-over-year, intra-week with people commuting to work Mon-Fri but not Sat/Sun, intra-day with more people biking during the day). You could in principle model these seasonalities using dummies in your ML models. Alternatively, there are a couple of approaches to multiple seasonalities in the context of Exponential Smoothing/State Space models.
Weather is of course highly correlated with time-of-year and time-of-day: it's hotter in summer and during the day than in winter and during the night. If you already model seasonality as above, you may find that adding weather information does not improve the forecasts very much beyond what seasonality already does.
If you want to forecast something using the weather, remember that you will need weather forecasts, too! Don't assess your out-of-sample forecasts based on how they work with actual weather - you won't know tomorrow's actual weather when you do "production" forecasting. The uncertainty in weather forecasts adds an additional source of uncertainty in your bicycling forecasts. In particular, weather forecasts are not very reliable for more than 15 days out, so they won't be very helpful for forecasting bike rides that far out. (Incidentally, getting historical weather data is far easier and cheaper than getting historical weather forecasts.)
You may want to look at the electricity price or load forecasting literature - that use case deals with many of your challenges (high frequency data, multiple seasonalities, weather influence). I haven't read this review yet, but it may be helpful.

Solved – Simple Neural Network for time series prediction

I'm going to take a stab at this and say it could be a problem with normalization boundaries.

I'm not familiar with the AForge.net NN library, but at some point your data should be normalized to fit between 0 and 1.

At some point, the normalization process detected 1 as the minimum value and 20 as the max value, and from those bounds, every value is converted to fit between 0 and 1. For example,

1  -> 1/20 = 0.05
...
19 -> 19/20 = 0.95
20 -> 20/20 = 1

When you exceed these bounds later, your normalization no longer produces values between 0 and 1, and this really wreaks havoc on the network.

25 -> 25/20 = 1.25

What you could do is make sure your normalization uses the true minimum and maximum bounds of your data, not just the bounds seen so far.
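Here is a rough sketch of the problem in Python (not AForge.NET code, which I can't vouch for). It uses standard min-max scaling, (x - min) / (max - min), which differs slightly from the divide-by-max arithmetic above but fails in the same way. The helper `make_scaler` and the "true" bounds of 0 and 50 are made up for illustration.

```python
def make_scaler(lo, hi):
    # Return a function that maps [lo, hi] linearly onto [0, 1].
    span = hi - lo

    def scale(x):
        return (x - lo) / span

    return scale

# Bounds detected from the data seen during training: 1..20.
train = list(range(1, 21))
scale_seen = make_scaler(min(train), max(train))

# Bounds chosen from domain knowledge (hypothetical values).
scale_true = make_scaler(0, 50)

for value in (1, 20, 25):
    print(value, scale_seen(value), scale_true(value))
```

With the bounds taken from the training data, 25 maps to a value above 1, outside the range the network was trained on. With the wider bounds it stays inside [0, 1].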
