Solved – Neural network training on unlimited theoretical data

finance, machine learning, neural networks

I'm considering using a neural network on financial time series, but rather than train the network on actual data, I am going to train it on a model of the data perturbed by random noise. This means I can potentially generate unlimited training data. However, I don't want to generate a huge data set up front and then train on it: training might take a very long time, and I have no idea how much data to actually generate.

What I am thinking of doing is training on a small amount of model data (say 5,000 examples), recording the values of the hidden nodes, and then repeating, thereby building up a distribution of values for each node. These distributions could then be bootstrapped to estimate the mean value per node, and the whole process would stop once the change in the bootstrapped mean values falls below a given threshold.
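The stopping rule described above can be sketched as follows. This is a minimal illustration, not working trading code: `train_round` is a hypothetical stand-in that would, in a real version, train the network on a fresh batch of synthetic data and return the hidden-node values; here it just draws noisy values around a fixed target so the loop is runnable.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_mean(samples, n_boot=200):
    """Bootstrap estimate of the mean of a 1-D sample."""
    idx = rng.integers(0, len(samples), size=(n_boot, len(samples)))
    return samples[idx].mean(axis=1).mean()

def train_round(n_examples=5000):
    """Hypothetical stand-in for one training round on fresh model data.

    A real version would train the network and return hidden-node values;
    here we simply draw noisy values around a fixed target of 0.7.
    """
    return rng.normal(loc=0.7, scale=0.1, size=n_examples)

node_values = np.array([])
prev_mean, threshold, m = None, 1e-3, None
for round_ in range(100):
    node_values = np.concatenate([node_values, train_round()])
    m = bootstrap_mean(node_values)
    # Stop once the bootstrapped mean settles down.
    if prev_mean is not None and abs(m - prev_mean) < threshold:
        break
    prev_mean = m
```

In practice you would track one distribution per hidden node (a 2-D array) rather than the single stream shown here.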

Edit – the purpose of the network will be to classify/label the time series over the recent past as being in one of a finite number of states, e.g. trending up/down, moving sideways, in congestion, etc. These states can be modelled using synthetic data with known labels, and then on real data the network's job will be to identify which state the real data most closely resembles. This will be used as input to a separate decision-making process.
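Generating labeled synthetic regimes of this kind might look like the sketch below. The state names, drift values, and noise level are illustrative assumptions, not part of the question:

```python
import numpy as np

rng = np.random.default_rng(42)

def make_series(state, length=50, noise=0.5):
    """Generate one synthetic price path with a known regime label.

    The drift parameters here are arbitrary choices for illustration.
    """
    t = np.arange(length)
    if state == "trend_up":
        base = 0.1 * t          # upward drift
    elif state == "trend_down":
        base = -0.1 * t         # downward drift
    else:                       # "sideways"
        base = np.zeros(length)
    return base + rng.normal(scale=noise, size=length)

# Build a labelled training set of (series, state) pairs.
states = ["trend_up", "trend_down", "sideways"]
data = [(make_series(s), s) for s in states for _ in range(100)]
```

The network would then be trained on `data` and, at inference time, asked which of the known regimes a real-data window most resembles.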

My question is – is there any reason why this would not be a valid approach to take?

Best Answer

You are not the first person to do something like this: researchers have used it in image recognition for 15 to 20 years. An example is the MNIST data set of handwritten digits. The data set usually consists of 60,000 training examples, but that number alone is not sufficient to reach >99.5% accuracy with multilayer perceptrons. So people generate additional training examples with distortions in each iteration of the optimization algorithm. The algorithm usually used to train the neural networks is stochastic gradient descent (also called online learning, in contrast to batch learning). Variants exist, such as "stochastic diagonal Levenberg-Marquardt", which requires an approximation of the Hessian. Be careful, though: averaging weights across separately trained networks could produce a classifier that is really bad.
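The fresh-distortion-per-iteration idea can be shown with a toy example. This is a sketch under assumed data, not the MNIST pipeline: a single logistic unit is trained by SGD, and each step sees a newly perturbed copy of a class template rather than a fixed training set.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two class "templates"; each SGD step sees a freshly perturbed copy,
# analogous to generating a new distorted digit per iteration.
templates = {0: np.array([1.0, -1.0]), 1: np.array([-1.0, 1.0])}

w, b, lr = np.zeros(2), 0.0, 0.1
for step in range(5000):
    y = step % 2                                       # alternate classes
    x = templates[y] + rng.normal(scale=0.3, size=2)   # fresh distortion
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))             # logistic unit
    g = p - y                                          # gradient of log-loss
    w -= lr * g * x
    b -= lr * g

def predict(x):
    return int(w @ x + b > 0)
```

Because every example is drawn anew, the effective training set is unbounded, which is exactly what makes this regime a natural fit for online SGD rather than batch methods.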
