Solved – Data Augmentation in Keras: How many training observations do I end up with

conv-neural-network, deep learning, keras, machine learning

I'm reading through Francois Chollet's "Deep Learning with Python" and was recently introduced to a concept I had never encountered in my statistics studies: data augmentation. I have a question about what the following code does (it appears on p. 141 of the book):

train_datagen = ImageDataGenerator(rescale=1./255,
                                   rotation_range=40,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True,)

test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(train_dir,
                                                    target_size=(150, 150),
                                                    batch_size=32,
                                                    class_mode='binary')

history = model.fit_generator(
     train_generator,
     steps_per_epoch=100,
     epochs=100,
     validation_data=validation_generator,
     validation_steps=50)

What I want to know is how ImageDataGenerator() works. For example, if I have a training directory with 2000 images, will the data augmentation create more than 2000 observations to train with? How do I know or control how many observations are generated?

Thank you in advance.

Best Answer

Data augmentation is used to artificially increase the number of samples in the training set (because small datasets are more vulnerable to over-fitting).

Keras uses an online data-augmentation process: instead of writing extra images to disk, it applies a fresh random transformation to each image as it is loaded for training (the images are processed in batches, but the point is that each image is re-augmented once per epoch).
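To connect this to the numbers in the question, here is a small sketch of the arithmetic. It assumes only that each step draws one batch of at most batch_size images; the function names are made up for illustration, and the result is an upper bound because the last batch of a pass over the directory can be smaller.

```python
import math

def max_images_per_epoch(batch_size, steps_per_epoch):
    # Each step draws one batch, so this is the most images one epoch can see.
    return batch_size * steps_per_epoch

def steps_for_one_pass(n_images, batch_size):
    # Number of batches needed to show every image in the directory once.
    return math.ceil(n_images / batch_size)

# Values taken from the question: 2000 images, batch_size=32,
# steps_per_epoch=100, epochs=100.
per_epoch = max_images_per_epoch(32, 100)
total = per_epoch * 100
one_pass = steps_for_one_pass(2000, 32)

print(per_epoch, total, one_pass)
```

So the directory still holds 2000 files; what you control is how many randomly transformed copies are drawn, through batch_size, steps_per_epoch and epochs.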

With the exception of the horizontal flip (which doubles the number of samples), all the remaining augmentation techniques consist of a range of possible operations. If you treat this range as continuous, you end up with an infinite number of samples. In practice each individual operation is discrete, so you can calculate the final number of possible samples as the product of the individual effects of all techniques:

[Image: formula for the total number of possible samples as the product of the per-technique counts]

For example, the width shift is discretely limited by the number of pixels along the x axis of the images. So for a value of 0.2, each image can be shifted by up to 0.2 * (number of pixels along the x axis), and the shift can be either to the left or to the right (so we double the number).

* There is a chance that Keras also performs sub-pixel shifts (it should be in their documentation), but that just means this number needs to be further multiplied by some factor.
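As a rough illustration of this counting, the sketch below applies it to the 150x150 images from the question. It assumes whole-pixel shifts only and also counts "no shift" as an option. As far as I know, Keras actually samples the shift from a continuous uniform range, so treat the result as the simplified count described here, not as what Keras enumerates. Rotation, shear and zoom are left out because they have no obvious whole-number count. The function name is hypothetical.

```python
def integer_shift_options(size_px, shift_range):
    # Largest whole-pixel shift allowed in one direction.
    max_shift = int(shift_range * size_px)
    # Shifts to either side, plus the unshifted image.
    return 2 * max_shift + 1

width_options = integer_shift_options(150, 0.2)   # width_shift_range=0.2
height_options = integer_shift_options(150, 0.2)  # height_shift_range=0.2
flip_options = 2                                  # horizontal_flip=True

# Product of the individual counts, per original image.
variants_per_image = width_options * height_options * flip_options
print(width_options, variants_per_image)
```

Even with only these three operations, each image already has thousands of distinct variants, which is why the model rarely sees exactly the same input twice.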

You can perform this calculation for every single operation with the aid of the source code from Keras: Image Preprocessing source code