Solved – Keras: CNN multiclass classifier

conv-neural-network, keras, neural-networks, python, tensorflow

After starting from the official binary classification example of Keras (see here), I'm implementing a multiclass classifier with TensorFlow as the backend.
In that example there are two classes (dog/cat); I now have 50 classes, and the data is stored the same way, in folders.
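For reference, flow_from_directory infers the classes from one subfolder per class, so the layout looks like this (the class and file names here are illustrative):

images/train/class_01/img_0001.jpg
images/train/class_01/img_0002.jpg
images/train/class_02/img_0001.jpg
...
images/val/class_01/img_0001.jpg
...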

When training, the loss won't go down and the accuracy won't go up.
I've changed the last layer's activation from sigmoid to softmax, changed binary_crossentropy to categorical_crossentropy, and changed the class_mode to categorical.

Here is my code:

from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K
from keras.optimizers import SGD  # SGD must be imported explicitly; it is used below



optimizer = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)

# dimensions of our images.
img_width, img_height = 224, 224

train_data_dir = 'images/train'
validation_data_dir = 'images/val'
nb_train_samples = 209222
nb_validation_samples = 40000
epochs = 50
batch_size = 16

if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)

model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(50))
model.add(Activation('softmax'))



model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])


train_datagen = ImageDataGenerator()

train_generator = train_datagen.flow_from_directory(
    directory=train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical')

validation_generator = train_datagen.flow_from_directory(
    directory=validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical')

model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)

model.save_weights('weights.h5')

Any idea where I might be going wrong?
Any input will be much appreciated!

Best Answer

The first thing to try if your NN is not converging is to repeatedly reduce the learning rate; it's the most important hyperparameter.

Divide the learning rate by 10, try again, rinse, repeat. You might find, for example, that it needs to be 10,000 times smaller before you stop bouncing around and actually start descending the gradient of your loss surface.
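A minimal sketch of that search, assuming a build_model() helper that reconstructs the network from the question (the helper, the short trial length, and the candidate values are illustrative, not part of the original answer):

from keras.optimizers import SGD

def run_lr_trial(lr):
    # build_model() is assumed to rebuild the question's network from scratch,
    # so each learning rate is tested from freshly initialised weights.
    model = build_model()
    model.compile(loss='categorical_crossentropy',
                  optimizer=SGD(lr=lr, decay=1e-6, momentum=0.9, nesterov=True),
                  metrics=['accuracy'])
    # A short run is enough to see whether the loss is moving at all.
    history = model.fit_generator(train_generator, steps_per_epoch=100, epochs=1)
    return history.history['loss'][-1]

for lr in [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]:
    print('lr=%g -> loss=%.4f' % (lr, run_lr_trial(lr)))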

Adam is often regarded as the best "out of the box" optimiser; you might want to start with that instead of SGD:

opt = keras.optimizers.Adam(lr=nnParams['lr'], beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)  # nnParams['lr'] is whatever learning rate you are currently trying
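Applied to the question's code, that amounts to swapping the optimizer line; lr=0.001 below is the Keras default for Adam and only a starting point, to be reduced as described above if the loss still won't move:

from keras.optimizers import Adam

model.compile(loss='categorical_crossentropy',
              optimizer=Adam(lr=0.001),  # Keras default; divide by 10 as needed
              metrics=['accuracy'])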
