In general,you cannot learn in batch sequences (search neural forgetting). You must have representatives of the complete characterization of all data at all times. The results will be good only if all the batches are statistically equivalent. However, if they are statistically equivalent, you only need one!
However, you could design multiple nets on separate random subsets and then average the results. This configuration is called an ensemble if you wish to search for details.
Hope this helps.
Thank you for formally accepting my answer
Greg
Best Answer