Solved – Forecasting Sharp Peaks in a Time Series using Convolutional Neural Networks

Tags: arima, conv-neural-network, neural-networks, time-series

I have time series data for a variable called Differential Pressure from a section of a natural gas refinery unit. Sharp peaks in this variable can indicate the onset of an undesired event in the unit. Plant operators can deal with these events, but only if they are warned about them some time in advance.

We used a deep convolutional neural network to predict the values of this variable ahead in time, following the methodology described in the paper linked here, and achieved decent results. In the first figure below, the actual differential pressure from the plant is shown in blue and the CNN predictions in orange; the second figure shows a zoomed-in section around a peak.

[Figure: actual vs. predicted differential pressure]

[Figure: zoomed-in view around a peak]
As you can see, apart from the sharp peaks, we are able to predict the values reasonably well. However, as mentioned earlier, capturing these peaks is our main interest.

I am sure this problem has been faced by many who have worked on time series prediction using machine learning. I would be keen to know whether someone has worked on a similar problem before and was able to approximate the sharp peaks. Any suggestion or reference to a source for reading on how to get around this problem would also be greatly appreciated. Thanks.

Best Answer

Here are a couple ideas to try:

(1) LSTMs (and recurrent neural nets in general) are often useful for making predictions from sequential data, since these models can accumulate historical data and use it to make predictions. Here's an example of LSTMs being used for time series prediction.
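For concreteness, here is a minimal sketch of what an LSTM forecaster could look like. It assumes a TensorFlow/Keras setup and a univariate series; the window length, horizon, layer sizes, and the random stand-in series are illustrative placeholders, not values from the question or the linked paper.

```python
# Minimal LSTM forecasting sketch (assumes TensorFlow/Keras; all sizes are placeholders).
import numpy as np
import tensorflow as tf

def make_windows(series, window=64, horizon=10):
    """Slice a 1-D series into (input window, future values) pairs."""
    X, y = [], []
    for i in range(len(series) - window - horizon + 1):
        X.append(series[i:i + window])
        y.append(series[i + window:i + window + horizon])
    # Add a trailing channel axis so the input shape is (samples, window, 1).
    return np.array(X)[..., None], np.array(y)

series = np.random.randn(5000).astype("float32")  # stand-in for the differential-pressure series
X, y = make_windows(series)

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, input_shape=X.shape[1:]),
    tf.keras.layers.Dense(y.shape[1]),  # predict the whole horizon at once
])
model.compile(optimizer="adam", loss="mae")
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2)
```

The same windowing and training loop carries over if you later swap the LSTM layer for your existing CNN, so the two architectures can be compared on identical data.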

(2) Choose loss functions that don't penalize large errors heavily. If large errors are penalized heavily (for instance, with L2 distance as the loss), your model will make conservative predictions. The paper you linked uses MAE loss, which is the average L1 distance and works better for this; if you aren't already using that loss, switching to it might help.
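To make the loss choice concrete, here is a small sketch of the options in Keras. The model is a trivial stand-in (the point is only the `loss` argument to `compile`), and the Huber `delta` is an illustrative value rather than a recommendation.

```python
import tensorflow as tf

# Trivial stand-in model; only the loss argument matters for this comparison.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(64,))])

# L2 (MSE): large errors dominate the loss, so the model tends toward smoothed,
# conservative predictions that miss sharp peaks.
model.compile(optimizer="adam", loss="mse")

# L1 (MAE): the loss used in the linked paper; large errors grow only linearly,
# so the model is less discouraged from occasionally predicting extreme values.
model.compile(optimizer="adam", loss="mae")

# Huber: quadratic near zero, linear beyond `delta` -- a middle ground between the two.
model.compile(optimizer="adam", loss=tf.keras.losses.Huber(delta=1.0))
```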

(3) Make sure you're not overfitting. This doesn't strictly answer your question, but in the plots above your model tracks the seemingly random noise very well, although it's possible your CNN is picking up on patterns too fine for the human eye to pick out. If the plots you showed are from test data, your model is fine; if they are from training data, check for overfitting.
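A basic check is to hold out a chronological validation split and compare training and validation error. The sketch below reuses the same kind of windowed data and model as the LSTM sketch above; again, the series and all sizes are placeholders.

```python
# Overfitting check sketch: train on the first 80% of windows, validate on the last 20%.
import numpy as np
import tensorflow as tf

series = np.random.randn(5000).astype("float32")  # stand-in for the real series
window, horizon = 64, 10
n = len(series) - window - horizon + 1
X = np.array([series[i:i + window] for i in range(n)])[..., None]
y = np.array([series[i + window:i + window + horizon] for i in range(n)])

split = int(0.8 * len(X))  # chronological split, no shuffling, so validation data is "future" data
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, input_shape=X.shape[1:]),
    tf.keras.layers.Dense(horizon),
])
model.compile(optimizer="adam", loss="mae")
history = model.fit(X[:split], y[:split],
                    validation_data=(X[split:], y[split:]),
                    epochs=20, batch_size=32, verbose=0)

# If validation loss climbs while training loss keeps dropping, the model is overfitting.
print("final train MAE:", history.history["loss"][-1])
print("final val   MAE:", history.history["val_loss"][-1])
```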

(4) If you provide more detail (how many points you are trying to predict at a time, and what window of input data you use), we might be able to give more useful advice.

(Some of the points in this answer would be in a comment but I don't have the rep to post one.)
