I have a single Tesla GP100 GPU with 16GB of memory. When I'm training my neural network, I have two issues:
- Using an imageDatastore spends a HUGE amount of time doing an fread (I'm using a custom ReadFcn because my data is asymmetric and that seemed easiest). I can overcome this by reading all the data into memory prior to training, but that will not scale.
- During training I am only using 2.2GB of the 16GB available on the GPU. When I use the exact same network and data with TensorFlow, I use all 16GB. This is the case even when I preload all the data into memory as described above. I'm guessing that's because TensorFlow is "queuing up" batches and MATLAB is not. Is there a way to increase GPU memory usage?
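For reference, the in-memory workaround from the first bullet looks roughly like this. It's a sketch, not my exact code: it assumes single-channel (grayscale) data and uses the same reader_public helper shown in the minimum example below.

```matlab
% Sketch of the preload workaround: read every file once up front,
% then train from a 4-D array instead of the datastore.
files = ds.Files;                 % ds is the imageDatastore from the example
X = zeros(dims(1), dims(2), 1, numel(files), 'single');
for k = 1:numel(files)
    % Convert to single; trainNetwork expects floating-point input
    X(:,:,1,k) = single(reader_public(files{k}, dims));
end
Y = ds.Labels;                    % labels come from the folder names
% net = trainNetwork(X, Y, network, options);
```

This avoids the per-batch fread cost entirely, but as noted it only works while the whole dataset fits in host memory.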
Here is my minimum example code:
function net = run_training_public(dims, nbatch, lr, nepoch)
    % Load data
    ds = imageDatastore('./data/set3', ...
        'IncludeSubfolders', true, ...
        'ReadFcn', @(x)reader_public(x, dims), ...
        'LabelSource', 'foldernames', ...
        'FileExtensions', '.dat');

    % Load neural network structure
    network = cnn1;

    % Set up options for training and execute training
    options = trainingOptions('adam', ...
        'MaxEpochs', nepoch, ...
        'MiniBatchSize', nbatch, ...
        'Shuffle', 'every-epoch', ...
        'InitialLearnRate', lr, ...
        'ExecutionEnvironment', 'gpu', ...
        'Verbose', true);
    net = trainNetwork(ds, network, options);
end

function data = reader_public(fileName, dims)
    f = fopen(fileName, 'r');
    data = fread(f, [dims(2) dims(1)], '*int16').';
    fclose(f);
end
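For completeness, this is how I invoke it. The dimensions, batch size, learning rate, and epoch count here are placeholders, not my real values:

```matlab
% Example invocation: 128x256 int16 images, mini-batch of 64,
% learning rate 1e-3, 10 epochs (all values illustrative)
net = run_training_public([128 256], 64, 1e-3, 10);
```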
Best Answer