Solved – Data augmentation

conv-neural-network, data-augmentation

In many papers on CNNs, I have read that data augmentation is carried out on a per-epoch basis. My understanding was that augmentation is done once, before training starts: split the data into train and test sets, then augment the train set. Is per-epoch augmentation the recommended method, and are there any flaws or drawbacks in my understanding of the augmentation procedure?

Best Answer

Deep networks are usually trained on mini-batches, and data augmentation is applied to each mini-batch as it is drawn. Because the transformations are random, the network sees a freshly transformed version of each image in every epoch rather than the identical image again. In effect this gives you a much larger, almost unlimited training set, which helps the network generalise well.
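To make the per-mini-batch idea concrete, here is a minimal sketch in plain Python. The function names (`random_flip`, `random_crop`, `augmented_batches`), the toy 4x4 "images", and the choice of flip plus crop as transformations are all assumptions made for illustration; a real pipeline would normally use the augmentation utilities of its deep learning framework.

```python
import random

def random_flip(image, rng):
    """Mirror the image left-to-right with probability 0.5."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return image

def random_crop(image, size, rng):
    """Cut a random size x size window out of the image."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - size)    # randint is inclusive on both ends
    left = rng.randint(0, width - size)
    return [row[left:left + size] for row in image[top:top + size]]

def augmented_batches(images, labels, batch_size, crop_size, rng):
    """Yield mini-batches, drawing fresh random transforms for every batch."""
    order = list(range(len(images)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        batch = [random_crop(random_flip(images[i], rng), crop_size, rng)
                 for i in idx]
        yield batch, [labels[i] for i in idx]

# Toy data (hypothetical): four 4x4 "images" with distinct pixel values.
rng = random.Random(0)
images = [[[100 * n + 4 * r + c for c in range(4)] for r in range(4)]
          for n in range(4)]
labels = [0, 1, 0, 1]

for epoch in range(3):
    for batch, batch_labels in augmented_batches(images, labels, 2, 3, rng):
        pass  # a real training step on (batch, batch_labels) would go here
```

Since the random transforms are drawn inside the batch loop, the same underlying image can come out differently in each epoch, and nothing augmented ever has to be stored on disk.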

If you augment the training data only once, before training starts, it can still be useful for increasing the size of the dataset (for example, if you combine the augmented and non-augmented sets). But in that case the network sees exactly the same training examples in every epoch, so it might not generalise as well as it would in the first scenario.
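The difference between the two approaches can be illustrated by counting how many distinct examples the network would be shown over several epochs. This is only a toy sketch: the 1-D "images", the `augment` helper, and the epoch count are assumptions chosen to keep the example small, not anything taken from the paper.

```python
import random

def augment(image, rng):
    """Random horizontal flip followed by a random 3-pixel-wide crop."""
    if rng.random() < 0.5:
        image = image[::-1]
    left = rng.randint(0, len(image) - 3)
    return tuple(image[left:left + 3])

rng = random.Random(0)
train = [(1, 2, 3, 4, 5), (6, 7, 8, 9, 10)]   # toy 1-D "images"
epochs = 20

# Offline: augment once, combine with the plain centre crops,
# then reuse that fixed set in every epoch.
offline_set = [tuple(x[1:4]) for x in train] + [augment(x, rng) for x in train]
seen_offline = set()
for _ in range(epochs):
    seen_offline.update(offline_set)

# On the fly: draw a fresh random variant each time an image is visited.
seen_online = set()
for _ in range(epochs):
    seen_online.update(augment(x, rng) for x in train)

print("distinct examples, offline:", len(seen_offline))
print("distinct examples, on the fly:", len(seen_online))
```

The offline set can never contain more than the four examples it was built with, no matter how many epochs are run, while the on-the-fly version keeps producing new variants until the transformations are exhausted (here at most 2 flips x 3 crop positions per image). With continuous transformations such as small rotations or colour jitter, that pool is practically unlimited.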

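For an example, refer to the paper "ImageNet Classification with Deep Convolutional Neural Networks" (Krizhevsky, Sutskever and Hinton, NIPS 2012), which generates the transformed images on the fly during training: http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf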
