Solved – Pre-processing before digit recognition for NN & CNN trained with MNIST dataset

MATLAB, pattern recognition, python

I'm trying to classify handwritten digits, written by myself and a few friends, using an NN and a CNN. The networks are trained on the MNIST dataset. The problem is that a network trained on MNIST does not give satisfactory test results on my own dataset. I've used libraries in Python and MATLAB with the settings listed below.

In Python, I've used this code with the following settings:

  • 3-layer NN with # of inputs = 784, # of hidden neurons = 30, # of outputs = 10
  • Cost function = cross entropy
  • # of epochs = 30
  • Batch size = 10
  • Learning rate = 0.5

It is trained on the MNIST training set, and the test results are as follows:

test result on MNIST = 96%
test result on my own dataset = 80%
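For illustration, a network with the settings listed above could be sketched in plain NumPy as follows. This is not the code referred to in the question. The class name `TinyNet`, the sigmoid activations, the weight initialization, and the random seed are all assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(A, Y):
    """Mean cross-entropy cost over a batch; each column is one sample."""
    eps = 1e-12  # avoids log(0)
    return -np.mean(np.sum(Y * np.log(A + eps) + (1 - Y) * np.log(1 - A + eps), axis=0))

class TinyNet:
    """Hypothetical 784-30-10 sigmoid network (illustration only)."""

    def __init__(self, sizes=(784, 30, 10)):
        # Assumed initialization: Gaussian weights scaled by 1/sqrt(fan_in), zero biases.
        self.W = [rng.standard_normal((n_out, n_in)) / np.sqrt(n_in)
                  for n_in, n_out in zip(sizes[:-1], sizes[1:])]
        self.b = [np.zeros((n_out, 1)) for n_out in sizes[1:]]

    def forward(self, X):
        """X has shape (784, batch). Returns the activations of every layer."""
        acts = [X]
        for W, b in zip(self.W, self.b):
            acts.append(sigmoid(W @ acts[-1] + b))
        return acts

    def sgd_step(self, X, Y, lr=0.5):
        """One gradient step on a mini-batch (batch size 10 in the question)."""
        acts = self.forward(X)
        m = X.shape[1]
        # With sigmoid outputs and cross-entropy cost, the output error is (a - y).
        delta = acts[-1] - Y
        for l in range(len(self.W) - 1, -1, -1):
            grad_W = delta @ acts[l].T / m
            grad_b = delta.mean(axis=1, keepdims=True)
            if l > 0:
                # Back-propagate through the old weights before updating them.
                delta = (self.W[l].T @ delta) * acts[l] * (1 - acts[l])
            self.W[l] -= lr * grad_W
            self.b[l] -= lr * grad_b
```

Training for 30 epochs would then mean shuffling the training set each epoch and calling `sgd_step` on consecutive batches of 10 columns.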

In MATLAB, I've used the Deep Learning Toolbox with various settings similar to the above, normalization included, and the best NN accuracy is around 75%. Both an NN and a CNN were tried in MATLAB.

I've tried to make my own dataset resemble MNIST. The results above were obtained on the pre-processed dataset. Here are the pre-processing steps applied to my dataset:

  • Each digit is cropped separately and resized to 28 x 28 using bicubic interpolation
  • Patches are centered according to the mean values in MNIST, using bounding boxes in MATLAB
  • Background is 0 and the highest pixel value is 1, as in MNIST
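As a rough Python sketch of steps like these (the question's own pipeline is in MATLAB), the function below crops a digit to its bounding box, scales it to fit a 20 x 20 box while keeping the aspect ratio, pads it to 28 x 28, and normalizes the peak to 1. The 20-pixel box follows the published description of MNIST and differs from the direct resize to 28 x 28 described above. Nearest-neighbour resampling is swapped in for bicubic interpolation to keep the example dependency-free, and the function name `fit_to_mnist_frame` is hypothetical.

```python
import numpy as np

def fit_to_mnist_frame(img, box=20, size=28):
    """Crop to the ink bounding box, fit into box x box, pad to size x size."""
    img = np.asarray(img, dtype=float)
    ys, xs = np.nonzero(img > 0)          # assumes the background is exactly 0
    if ys.size == 0:
        return np.zeros((size, size))     # blank input: nothing to place
    crop = img[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    h, w = crop.shape
    scale = box / max(h, w)               # keep the aspect ratio
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    # Nearest-neighbour resampling (stand-in for bicubic interpolation).
    row_idx = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    col_idx = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    resized = crop[np.ix_(row_idx, col_idx)]
    # Paste the resized digit into the middle of an empty frame.
    out = np.zeros((size, size))
    top = (size - new_h) // 2
    left = (size - new_w) // 2
    out[top:top + new_h, left:left + new_w] = resized
    peak = out.max()
    return out / peak if peak > 0 else out  # background 0, highest value 1
```

This centers the bounding box, not the ink itself, so a digit with uneven stroke weight can still sit off-center relative to MNIST samples.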

I don't know what else to do. There are still some differences, such as contrast, but my attempts at contrast enhancement did not increase the accuracy.

Here are some digits from MNIST and from my own dataset for visual comparison.

[Image: MNIST digits]

[Image: my own dataset]

As you can see, there is a clear contrast difference. I think the accuracy problem is caused by the lack of similarity between MNIST and my own dataset. How can I handle this issue?

There is a similar question here, but its dataset is a collection of printed digits, unlike mine.

Edit:
I've also tested a binarized version of my own dataset on NNs trained with binarized MNIST and with default MNIST. The binarization threshold is 0.05.
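A minimal sketch of such a binarization in Python, assuming pixel values already scaled to [0, 1]. Whether the comparison is strict (`>`) or inclusive (`>=`) is not stated, so the strict form here is an assumption, as is the function name `binarize`.

```python
import numpy as np

def binarize(img, threshold=0.05):
    """Map pixels above the threshold to 1 and everything else to 0."""
    return (np.asarray(img, dtype=float) > threshold).astype(float)
```

The same function would have to be applied to both the training images and the test images for the comparison to be meaningful.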

Here is an example image in matrix form from the MNIST dataset and from my own dataset, respectively. Both are the digit 5.

MNIST

 Columns 1 through 10

             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0    0.1176    0.1412
             0         0         0         0         0         0         0    0.1922    0.9333    0.9922
             0         0         0         0         0         0         0    0.0706    0.8588    0.9922
             0         0         0         0         0         0         0         0    0.3137    0.6118
             0         0         0         0         0         0         0         0         0    0.0549
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0    0.0902    0.2588
             0         0         0         0         0         0    0.0706    0.6706    0.8588    0.9922
             0         0         0         0    0.2157    0.6745    0.8863    0.9922    0.9922    0.9922
             0         0         0         0    0.5333    0.9922    0.9922    0.9922    0.8314    0.5294
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0

      Columns 11 through 20

             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0    0.0118    0.0706    0.0706    0.0706    0.4941    0.5333    0.6863    0.1020
        0.3686    0.6039    0.6667    0.9922    0.9922    0.9922    0.9922    0.9922    0.8824    0.6745
        0.9922    0.9922    0.9922    0.9922    0.9922    0.9922    0.9922    0.9843    0.3647    0.3216
        0.9922    0.9922    0.9922    0.9922    0.7765    0.7137    0.9686    0.9451         0         0
        0.4196    0.9922    0.9922    0.8039    0.0431         0    0.1686    0.6039         0         0
        0.0039    0.6039    0.9922    0.3529         0         0         0         0         0         0
             0    0.5451    0.9922    0.7451    0.0078         0         0         0         0         0
             0    0.0431    0.7451    0.9922    0.2745         0         0         0         0         0
             0         0    0.1373    0.9451    0.8824    0.6275    0.4235    0.0039         0         0
             0         0         0    0.3176    0.9412    0.9922    0.9922    0.4667    0.0980         0
             0         0         0         0    0.1765    0.7294    0.9922    0.9922    0.5882    0.1059
             0         0         0         0         0    0.0627    0.3647    0.9882    0.9922    0.7333
             0         0         0         0         0         0         0    0.9765    0.9922    0.9765
             0         0         0         0    0.1804    0.5098    0.7176    0.9922    0.9922    0.8118
             0         0    0.1529    0.5804    0.8980    0.9922    0.9922    0.9922    0.9804    0.7137
        0.0941    0.4471    0.8667    0.9922    0.9922    0.9922    0.9922    0.7882    0.3059         0
        0.8353    0.9922    0.9922    0.9922    0.9922    0.7765    0.3176    0.0078         0         0
        0.9922    0.9922    0.9922    0.7647    0.3137    0.0353         0         0         0         0
        0.9922    0.9569    0.5216    0.0431         0         0         0         0         0         0
        0.5176    0.0627         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0

      Columns 21 through 28

             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
        0.6510    1.0000    0.9686    0.4980         0         0         0         0
        0.9922    0.9490    0.7647    0.2510         0         0         0         0
        0.3216    0.2196    0.1529         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
        0.2510         0         0         0         0         0         0         0
        0.0078         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0

My own dataset

Columns 1 through 10

             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0    0.4000    0.5569
             0         0         0         0         0         0         0         0    0.9961    0.9922
             0         0         0         0         0         0         0         0    0.6745    0.9882
             0         0         0         0         0         0         0         0    0.0824    0.8745
             0         0         0         0         0         0         0         0         0    0.4784
             0         0         0         0         0         0         0         0         0    0.4824
             0         0         0         0         0         0         0         0    0.0824    0.8745
             0         0         0         0         0         0         0    0.0824    0.8392    0.9922
             0         0         0         0         0         0         0    0.2392    0.9922    0.6706
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0    0.4431    0.3608
             0         0         0         0         0         0         0    0.3216    0.9922    0.5922
             0         0         0         0         0         0         0    0.3216    1.0000    0.9922
             0         0         0         0         0         0         0         0    0.2784    0.5922
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0

      Columns 11 through 20

             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0    0.2000    0.5176    0.8392    0.9922    0.9961    0.9922    0.7961    0.6353
        0.7961    0.7961    0.9922    0.9882    0.9922    0.9882    0.5922    0.2745         0         0
        0.9569    0.7961    0.5569    0.4000    0.3216         0         0         0         0         0
        0.7961         0         0         0         0         0         0         0         0         0
        0.9176    0.1176         0         0         0         0         0         0         0         0
        0.9922    0.1961         0         0         0         0         0         0         0         0
        0.9961    0.3569    0.2000    0.2000    0.2000    0.0392         0         0         0         0
        0.9922    0.9882    0.9922    0.9882    0.9922    0.6745    0.3216         0         0         0
        0.7961    0.6353    0.4000    0.4000    0.7961    0.8745    0.9961    0.9922    0.2000    0.0392
             0         0         0         0         0    0.0784    0.4392    0.7529    0.9922    0.8314
             0         0         0         0         0         0         0         0    0.4000    0.7961
             0         0         0         0         0         0         0         0         0    0.0784
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0    0.0824    0.4000    0.4000    0.7176
        0.9176    0.5961    0.6000    0.7569    0.6784    0.9922    0.9961    0.9922    0.9961    0.8353
        0.5922    0.9098    0.9922    0.8314    0.7529    0.5922    0.5137    0.1961    0.1961    0.0392
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0         0         0

      Columns 21 through 28

             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
        0.1608         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
        0.1608         0         0         0         0         0         0         0
        0.9176    0.2000         0         0         0         0         0         0
        0.8353    0.9098    0.3216         0         0         0         0         0
        0.2431    0.7961    0.9176    0.4392         0         0         0         0
             0    0.0784    0.8353    0.9882         0         0         0         0
             0         0    0.6000    0.9922         0         0         0         0
             0    0.1608    0.9137    0.8314         0         0         0         0
        0.1216    0.6784    0.9569    0.1569         0         0         0         0
        0.9137    0.8314    0.3176         0         0         0         0         0
        0.5569    0.0784         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0
             0         0         0         0         0         0         0         0

Best Answer

Have you centered the images from your dataset by center of mass? The original MNIST dataset uses this as a preprocessing step.
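A minimal sketch of center-of-mass centering in Python with NumPy, assuming a 2-D grayscale image with a zero background. The function name `center_by_mass`, the whole-pixel shift, and the choice of the geometric middle of the frame as the target are assumptions for illustration, not a statement of how MNIST was actually produced.

```python
import numpy as np

def center_by_mass(img):
    """Shift a digit so its intensity centroid lands on the image center."""
    img = np.asarray(img, dtype=float)
    total = img.sum()
    if total == 0:
        return img.copy()                 # blank image: nothing to center
    h, w = img.shape
    rows, cols = np.indices(img.shape)
    cy = (rows * img).sum() / total       # centroid row
    cx = (cols * img).sum() / total       # centroid column
    # Whole-pixel shift that moves the centroid to the middle of the frame.
    dy = int(round((h - 1) / 2 - cy))
    dx = int(round((w - 1) / 2 - cx))
    out = np.zeros_like(img)
    # Copy the overlapping region; pixels shifted out of the frame are dropped.
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = img[src_y, src_x]
    return out
```

Unlike bounding-box centering, this weights every pixel by its intensity, so a digit with a heavy stroke on one side is shifted toward the opposite side of the frame.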
