Solved – Neural network for time series forecasting: single input, single output, theoretical proof needed

neural networks, time series

I am doing time series forecasting using neural networks. I have 2 approaches:

  1. Forecasting in an autoregressive manner, i.e. based on lags of the time series, as shown below:

    y(t) = f(y(t-1), y(t-2), ..., y(t-d)) 
    
  2. Forecasting in a linear regression manner, i.e. with one independent variable and one dependent variable, as shown below:

    y(t) = f(x(t))
    

In the first case, the neural network is multiple input, single output, while in the second case it is single input, single output. The data I am trying to forecast is wind energy production. So, in the first case, the only values used are past power output. In the second case, the independent variable 'x' is wind and the dependent variable 'y' is power output. In both cases, I am forecasting using a sliding-window method with a short-term horizon. Both models have a hidden layer in the neural network.
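To make the two setups concrete, here is a minimal sketch of how the training arrays for each approach could be built. The function names `make_lagged` and `make_single_input` are hypothetical and only for illustration; the actual windowing used for the forecasts above may differ.

```python
import numpy as np


def make_lagged(y, d):
    """Approach 1: inputs are the d previous values, target is y(t)."""
    y = np.asarray(y, dtype=float)
    if d < 1 or d >= len(y):
        raise ValueError("d must satisfy 1 <= d < len(y)")
    # Row i corresponds to t = d + i; column k holds y(t - 1 - k).
    X = np.column_stack([y[d - 1 - k: len(y) - 1 - k] for k in range(d)])
    target = y[d:]
    return X, target


def make_single_input(x, y):
    """Approach 2: one input x(t) (e.g. wind), one target y(t) (power)."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float)
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return x, y


# Toy data, purely illustrative.
power = [1.0, 2.0, 3.0, 4.0, 5.0]
X_ar, y_ar = make_lagged(power, d=2)               # X_ar has shape (3, 2)
X_reg, y_reg = make_single_input([7, 8, 9, 9, 8], power)  # X_reg has shape (5, 1)
```

Either pair of arrays can then be passed to whatever network is being trained; the only difference is the number of input columns.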

I am getting low forecasting errors with the second method, but I did not find any proof that neural networks can be used in this manner. So I was hoping for some clarification from the experts here: is forecasting with neural networks in this way correct?

Best Answer

Let me help here.

Key points:

  • Wind is autoregressive in space and time. A storm that is 100 miles away today can be here tomorrow. Today's wind here tells me less about tomorrow's storm than today's wind closer to the storm (upwind).
  • The tracks that storms take vary somewhat from year to year, so you need enough years of data. Upwind is not perfectly constant year to year.
  • Seasons differ from each other. Summer wind is different from winter wind. This varies by location, and there is some year-to-year variation.
  • The conventional notion of season is not supported by the data. For heating the US has 3 seasons, and for cooling it has 5. May is its own happy little season.
  • Weather is complex: Navier-Stokes meets the fusion of the sun and the terrain of the earth, on the scale of a planet. If a simple NN, or even an inhuman but functional NN, could make decent sense of it, then it would. Weathermen are wrong because it is a hard problem.
  • There are measurement problems. Sensors are placed in bad locations and can be questionably calibrated. The fluid moves at different speeds, and you only get an approximation of the mean, not a measure of variation, which is important to the eddy dissipation. You can't measure even 1% of the actual wind, so local generalization is tough.

To reduce error your model must take the "physics" into account.

If I were digging into this,

  • I would pull out all the NOAA weather data for every one of the major sites (~1200) for at least the last five years.
  • I would split by season, and I would let the data tell me how many there are. A good variability plot of hourly mean wind speed split by week of year for the last 5 years should tell you what the wind seasons look like.
  • I would use methods that look at effects of space and time.
  • I would split by geography, specifically by climate zone. There are ~43 data-driven unique climate zones in the US. Don't look at ASHRAE: they treat summer and winter as the same beastie and end up with ~7 major zones.
  • I would also want to account for solar irradiance. The primary energy source for the earth is the sun. It might not be as much of a leading indicator, but I would want to take a look.
  • I might also look at it by solar time of day. Dawn/Dusk winds don't happen at noon.
  • If you deal in the mean only, you are asking for trouble. Account for variation. I would use an RF (random forest) to find variable importance across many variables, then feed the important ones into the NN. Moments, moments of truncated (internal and external) distributions, and percentiles are all candidates.
  • I always want to scale and then center both my inputs and outputs. If I am not having a "crazy" day then I detrend too. Why waste CPU in the NN trying to determine what a simple GLM could do? Why not make it deal with the really hard stuff?
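As a rough sketch of the "let the data tell me how many seasons there are" step from the list above, the mean and spread of hourly wind speed can be summarized per week of year. The function name `weekly_profile` and the input arrays are assumptions for illustration, and nothing here is specific to NOAA's file formats.

```python
import numpy as np


def weekly_profile(week_of_year, wind_speed):
    """Mean and standard deviation of wind speed for each week of the year.

    week_of_year: integer week label for each hourly observation
                  (pooled over all years in the sample).
    wind_speed:   hourly mean wind speed, same length.
    """
    week = np.asarray(week_of_year, dtype=int)
    speed = np.asarray(wind_speed, dtype=float)
    if week.shape != speed.shape:
        raise ValueError("inputs must have the same length")
    weeks = np.unique(week)  # sorted distinct week labels
    mean = np.array([speed[week == w].mean() for w in weeks])
    std = np.array([speed[week == w].std() for w in weeks])
    return weeks, mean, std


# Toy example: two weeks with two observations each.
weeks, mean, std = weekly_profile([1, 1, 2, 2], [2.0, 4.0, 10.0, 10.0])
```

Plotting `mean` with a band of `std` against `weeks` is one way to see where the wind regime changes; stretches of weeks with similar level and spread are candidate seasons.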

Once you have done that, in order, your MLP-NN, RBF-NN, or SVM should be able to handle prediction with substantially better results.
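The scaling, centering, and detrending step mentioned in the list could look roughly like this before the data goes into the network. This is a sketch under assumptions: it removes a straight-line trend with ordinary least squares (standing in for the "simple GLM") and then standardizes the residual. The names `detrend_and_standardize` and `restore` are hypothetical.

```python
import numpy as np


def detrend_and_standardize(t, y):
    """Remove a linear trend from y, then center and scale the residual."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    # Least-squares straight line: the part a simple linear model can explain.
    slope, intercept = np.polyfit(t, y, 1)
    residual = y - (slope * t + intercept)
    mu = residual.mean()
    sigma = residual.std()
    if sigma < 1e-12:  # constant residual: avoid dividing by ~zero
        sigma = 1.0
    z = (residual - mu) / sigma
    params = {"slope": slope, "intercept": intercept, "mu": mu, "sigma": sigma}
    return z, params


def restore(t, z, params):
    """Map standardized values (e.g. NN predictions) back to original units."""
    t = np.asarray(t, dtype=float)
    z = np.asarray(z, dtype=float)
    trend = params["slope"] * t + params["intercept"]
    return z * params["sigma"] + params["mu"] + trend


# Toy series: linear trend plus an alternating wiggle.
t = np.arange(10.0)
y = 2.0 * t + 1.0 + np.where(t % 2 == 0, 1.0, -1.0)
z, params = detrend_and_standardize(t, y)
y_back = restore(t, z, params)  # recovers y up to floating-point error
```

The parameters should be fitted on the training window only and reused on the forecast window, so that no information from the future leaks into the inputs.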

I don't know that you have properly preprocessed your data for this particular problem. If you feed dirty data in, don't expect clean model predictions out.

There is a particular test used for evaluating the algorithmic performance of things like the ensemble Kalman filter or 4DVAR. I forget the name, but it assumes something like a 1.5- or 2.5-dimensional chaotic attractor. I will try to dig it up. NNs work on it too, so this test gives a clean bridge for mapping NNs to weather forecasting.

Here is a reference about using MLP in forecasting (like weather). And another. You might read this reference to help you think through the number of interior nodes.
