Solved – Validation accuracy reaches 1.000 in the very first epoch

conv-neural-network, keras, validation

I am using the small 3D CNN below to predict whether a 32*32*32 image cube in a CT scan is malignant or not.

from keras.models import Sequential
from keras.layers import Convolution3D, MaxPooling3D, Activation, Dropout, Flatten, Dense

def classifier(input_shape, kernel_size, pool_size):
    model = Sequential()

    # Three conv + ReLU + max-pool blocks (channels-first input: (1, 32, 32, 32))
    model.add(Convolution3D(16, kernel_size[0], kernel_size[1], kernel_size[2],
                            border_mode='valid',
                            input_shape=input_shape))
    model.add(Activation('relu'))
    model.add(MaxPooling3D(pool_size=pool_size))
    model.add(Convolution3D(32, kernel_size[0], kernel_size[1], kernel_size[2]))
    model.add(Activation('relu'))
    model.add(MaxPooling3D(pool_size=pool_size))
    model.add(Convolution3D(64, kernel_size[0], kernel_size[1], kernel_size[2]))
    model.add(Activation('relu'))
    model.add(MaxPooling3D(pool_size=pool_size))
    model.add(Dropout(0.25))

    # Fully connected head with dropout and a 2-way softmax output
    model.add(Flatten())
    model.add(Dense(512))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(128))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(2))
    model.add(Activation('softmax'))

    return model

The model is compiled as below:

from keras.optimizers import Adam

input_shape = (1, 32, 32, 32)
model = classifier(input_shape, (3, 3, 3), (2, 2, 2))
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(lr=1.0e-7),
              metrics=['accuracy'])
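
For completeness, the training call looks roughly like the following (a sketch: X_train, y_train, X_val, and y_val are assumed array names; the labels are one-hot encoded because the loss is categorical_crossentropy):

from keras.utils import np_utils

# Assumed shapes: X_train (43116, 1, 32, 32, 32); y_train integer labels in {0, 1}
Y_train = np_utils.to_categorical(y_train, 2)  # one-hot labels for categorical_crossentropy
Y_val = np_utils.to_categorical(y_val, 2)

history = model.fit(X_train, Y_train,
                    batch_size=32,
                    nb_epoch=20,       # 'epochs' in Keras 2
                    shuffle=True,      # reshuffle the training samples every epoch
                    validation_data=(X_val, Y_val))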

But when training, the validation accuracy reaches 1.0000 from the 2nd epoch onward, as in the graphs below. I tried several learning rates (the sudden drops indicate restarting the training with a reduced learning rate).
[Figure: training and validation accuracy curves over epochs; sudden drops mark restarts with reduced learning rates]
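
As an aside, instead of restarting training by hand, the learning rate can be lowered automatically via a callback (a sketch, assuming your Keras version provides ReduceLROnPlateau):

from keras.callbacks import ReduceLROnPlateau

# Halve the learning rate whenever validation loss plateaus for 3 epochs
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3, min_lr=1e-8)
# then: model.fit(..., callbacks=[reduce_lr])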

I am using 900 image-cube/label pairs as the training set and 283 cube/label pairs as the validation set. Each cube is shifted in 48 ways to enlarge the data set; I did this manually because the Keras ImageDataGenerator does not support 3D data augmentation. The total number of training samples is therefore 43116 (~900*48) and the total number of validation samples is 13584 (~283*48). The data was standardized before being fed to the network.

My question is: is this because the architecture does not suit my classification problem, or because there are too few data samples? Or is it an indication of over-fitting? Can you please help me figure out what is going wrong here?
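
For context, the manual shifting described above is along the lines of the sketch below (the offsets and helper name are illustrative; my exact 48 shifts differ):

import numpy as np
from scipy.ndimage import shift

def shifted_copies(cube, offsets):
    """Return translated copies of a 3D cube, zero-filling the vacated voxels."""
    return [shift(cube, offset, order=0, cval=0.0) for offset in offsets]

# Illustrative offsets: +/-2 voxel translations along each axis combination
offsets = [(dx, dy, dz)
           for dx in (-2, 0, 2)
           for dy in (-2, 0, 2)
           for dz in (-2, 0, 2)
           if (dx, dy, dz) != (0, 0, 0)]
augmented = shifted_copies(cube, offsets)  # 'cube' is one 32x32x32 array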

Best Answer

Possible reasons include a small number of samples in the validation data, an easy dataset (very unlikely here), and sometimes unshuffled training data. You could fix the problem by increasing the number of training samples, or by augmenting your training data to create enough samples for the model to learn from and validate on.
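
For the unshuffled-data case, a minimal sketch of shuffling the training arrays (names assumed) with a single permutation before calling fit:

import numpy as np

# Apply the same random permutation to samples and labels so pairs stay aligned
perm = np.random.permutation(len(X_train))
X_train, Y_train = X_train[perm], Y_train[perm]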

Data augmentation docs
