I have a single Tesla GP100 GPU with 16GB of memory. When I'm training my neural network, I have two issues:
- Using an imageDatastore spends a HUGE amount of time doing an fread (I'm using a custom ReadFcn because my data is asymmetric and that seemed easiest). I can overcome this by reading all the data into memory prior to training, but that will not scale.
- During training I am only using 2.2GB of the 16GB available on the GPU. When I use the exact same network and data with TensorFlow, I use all 16GB. This is the case even when I preload all the data into memory as described above. I'm guessing that's because TensorFlow is "queuing up" batches and MATLAB is not. Is there a way to increase GPU memory usage?
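For reference, the in-memory workaround from the first bullet looks roughly like this. It's a sketch, not my exact code: it assumes single-channel (grayscale) data and uses the same reader_public helper shown in the minimum example below.

```matlab
% Sketch of the preload workaround: read every file once up front,
% then train from a 4-D array instead of the datastore.
files = ds.Files;                 % ds is the imageDatastore from the example
X = zeros(dims(1), dims(2), 1, numel(files), 'single');
for k = 1:numel(files)
    % Convert to single; trainNetwork expects floating-point input
    X(:,:,1,k) = single(reader_public(files{k}, dims));
end
Y = ds.Labels;                    % labels come from the folder names
% net = trainNetwork(X, Y, network, options);
```

This avoids the per-batch fread cost entirely, but as noted it only works while the whole dataset fits in host memory.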
Here is my minimum example code:
function net = run_training_public(dims, nbatch, lr, nepoch)
    % Load data
    ds = imageDatastore('./data/set3', ...
        'IncludeSubfolders', true, ...
        'ReadFcn', @(x)reader_public(x, dims), ...
        'LabelSource', 'foldernames', ...
        'FileExtensions', '.dat');

    % Load neural network structure
    network = cnn1;

    % Set up options for training and execute training
    options = trainingOptions('adam', ...
        'MaxEpochs', nepoch, ...
        'MiniBatchSize', nbatch, ...
        'Shuffle', 'every-epoch', ...
        'InitialLearnRate', lr, ...
        'ExecutionEnvironment', 'gpu', ...
        'Verbose', true);
    net = trainNetwork(ds, network, options);
end

function data = reader_public(fileName, dims)
    f = fopen(fileName, 'r');
    data = fread(f, [dims(2) dims(1)], '*int16').';
    fclose(f);
end
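For completeness, this is how I invoke it. The dimensions, batch size, learning rate, and epoch count here are placeholders, not my real values:

```matlab
% Example invocation: 128x256 int16 images, mini-batch of 64,
% learning rate 1e-3, 10 epochs (all values illustrative)
net = run_training_public([128 256], 64, 1e-3, 10);
```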
Best Answer