Solved – Order of normalization / augmentation for image classification

deep learning, machine learning

I'm currently working on a standard image classification task with a CNN.

I would like to use both normalization (subtract the mean / divide by the std per channel) and data augmentation (rotation, color, blur, …), but I don't know how to combine them.

Which order should I use?

  • First normalize with parameters computed only from the original images, then augment (is augmenting a normalized image meaningful? Should I avoid some types of augmentation, such as color?)

  • Augment the data and normalize with statistics computed over all images (mean/std including the augmented images), which seems counterintuitive.

  • Augment the data and normalize with statistics computed only from the original images, which means the augmented data are not exactly normalized.

  • Or don't use both methods together.

Best Answer

You should use both. Each method addresses a different problem, so using only one would leave the other problem unsolved.

The order does not matter much. Augmenting first and normalizing second seems more natural to me (you can visually check that the augmentation generates sensible data), but the reverse is also possible. You just need to make sure that the augmentation parameters respect the normalization: e.g. shifting a color by a value drawn from a uniform distribution on [-20, 20] makes no sense for images on the scale [-1, 1].
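To illustrate the point about augmentation parameters respecting the normalization, here is a minimal NumPy sketch. The random "training set", the helper names `normalize` and `color_shift`, and the choice to compute statistics from the original images only are all assumptions made for the example, not something prescribed above. It shows that a color shift applied before normalization matches the same shift applied after normalization only if the shift is divided by the per-channel std.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a training set: 8 RGB images of 4x4 pixels in [0, 255].
train = rng.uniform(0, 255, size=(8, 4, 4, 3))

# Per-channel statistics, computed once from the original (unaugmented) images.
mean = train.mean(axis=(0, 1, 2))
std = train.std(axis=(0, 1, 2))

def normalize(img):
    """Subtract the per-channel mean and divide by the per-channel std."""
    return (img - mean) / std

def color_shift(img, delta):
    """Add a per-channel offset; delta must be on the same scale as img."""
    return img + delta

# A color shift drawn on the raw pixel scale, as in the [-20, 20] example.
delta = rng.uniform(-20, 20, size=3)
img = train[0]

# Order A: augment on the raw scale, then normalize.
a = normalize(color_shift(img, delta))

# Order B: normalize first, then augment with the shift rescaled by std.
b = color_shift(normalize(img), delta / std)

# Order B without rescaling: the raw-scale delta is far too large here.
b_wrong = color_shift(normalize(img), delta)

print(np.allclose(a, b))        # the two orders agree once delta is rescaled
print(np.allclose(a, b_wrong))  # they disagree if the raw-scale delta is reused
```

The sketch leaves out clipping to the valid pixel range; with clipping, the two orders would no longer be exactly equivalent, which is one more reason augmenting on the raw scale first can be the simpler choice.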
