Solved – Batch Normalization decreasing model accuracy

batch-normalization, conv-neural-network, deep-learning, machine-learning, neural-networks

I'm a little confused. I've read that batch normalization leads to faster convergence and higher accuracy, but in my case the opposite is happening: after adding it, my accuracy actually decreased.
Am I missing something?

Here is the code I'm using:

# Imports assume tf.keras; adjust if you use standalone Keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Bidirectional, LSTM, Conv1D, BatchNormalization,
                                     Activation, MaxPooling1D, Dropout, Flatten, Dense)

model = Sequential()

# Bidirectional LSTM front-end over the input sequence
model.add(Bidirectional(LSTM(64, return_sequences=True),
                        input_shape=(X_train.shape[1], X_train.shape[2])))

# Conv block 1
model.add(Conv1D(filters=16, kernel_size=3, padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling1D(pool_size=2))

# Conv block 2
model.add(Conv1D(filters=32, kernel_size=3, padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling1D(pool_size=2))

# Conv block 3
model.add(Conv1D(filters=64, kernel_size=3, padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling1D(pool_size=2))

model.add(Dropout(0.3))
model.add(Flatten())

# Dense block 1
model.add(Dense(150))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.4))

# Dense block 2
model.add(Dense(10))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.4))

# Output layer
model.add(Dense(dummy_y.shape[1], activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['categorical_accuracy'])
model.summary()

Is this the right place for batch normalization, or have I done something wrong?

Best Answer

Try putting your BatchNormalization layer AFTER the activation. As written, it is effectively killing off half of your gradient at each layer: you normalize to zero mean right before the ReLU, so only about half of your ReLU units are firing, and you get vanishing gradients.
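
As a minimal sketch of that reordering (not the only valid placement, and worth verifying empirically on your data), the first convolutional block from the question would become:

# Sketch: BatchNormalization moved after the activation, layer sizes from the question
model.add(Conv1D(filters=16, kernel_size=3, padding='same'))
model.add(Activation('relu'))       # activate first...
model.add(BatchNormalization())     # ...then normalize the activation output
model.add(MaxPooling1D(pool_size=2))

Apply the same swap to the other conv and dense blocks if you go this route.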