MATLAB: Transform and Combine fileDatastores to Train CNN

combine, filedatastore, trainnetwork, transform

Hi, I would appreciate your help.
I have a collection of 50x1x12 .mat files that I need to load into a datastore and then pass to a convolutional neural network. How can I train my network?
Instead of having a .mat file for the labels, I am using this function to get the labels from the folder names:
function label = readLabel(filename,classNames)
filepath = fileparts(filename);
[~,label] = fileparts(filepath);
label = categorical(string(label),classNames);
end
I am loading the files into a fileDatastore:
inputData=fileDatastore(fullfile('inputData'),'ReadFcn',@load,'FileExtensions','.mat');
%getting the labels from the folders
classNames = string(1:2)
targetData=fileDatastore(fullfile('targetData'),'ReadFcn',@load,'FileExtensions','.mat',readLabel(filename,classNames));
Should I transform the fileDatastores before combining them? If so, what should my transform function look like?
I tried combining without transforming, and trainNetwork gave me the error quoted after this code:
%combining
cdsTrain = combine(inputData,targetData);
%Training
net = trainNetwork(cdsTrain,layers,options);
"The training images are of size 1x1x1 but the input layer expects images of size 50x1x12"
Thanks for your help!

Best Answer

Hi, here is my code. It is not efficient, but it should work, and I hope it helps as a starting point that you can optimize.
For simplicity, I stored the category inside each .mat file as a character variable (matType). However, I'm sure you can get the labels from the folder names too.
The key idea is to use a fileDatastore for the input data and a tabularTextDatastore for the categorical labels. You can then combine them successfully.
Regards
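A likely reason for the "1x1x1" error in the question is that load returns a struct rather than the array stored in the file, so each read of the fileDatastore yields a 1x1 struct. The short sketch below illustrates this on a temporary file; the file name material_demo.mat and the variable name material are assumptions made only for this illustration.

```matlab
% Illustration only: the file name and variable name are hypothetical.
tmpFile = fullfile(tempdir,'material_demo.mat');
material = ones(50,1,12);
save(tmpFile,'material');

s = load(tmpFile);      % load returns a struct with one field per saved variable
disp(class(s))          % struct
disp(size(s))           % 1 1

x = s.material;         % the array itself has to be pulled out of the struct
disp(size(x))           % 50 1 12

delete(tmpFile)         % clean up the temporary file
```

This is why the code that follows uses transform to extract the array from the loaded struct before combining.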
clear
close all
clc
%% create dummy inputs and targets (.mat files)
% inputs are images of size 50x1x12 [height, width, channels]
% targets are the material type: categorical array
mkdir trainingData
cd trainingData
n=10; % number of training cases
for i=1:n
material=ones(50,1,12)*i;
if i<n/2
matType='typeA';
else
matType='typeB';
end
if ~isfolder(matType), mkdir(matType); end % avoid the "directory already exists" warning
filename=sprintf('material_%d', i);
save(fullfile(matType,filename),'material', 'matType')
end
cd ..
%% load data
allData=fileDatastore(fullfile('trainingData'),'ReadFcn',@load,'FileExtensions','.mat', 'IncludeSubfolders', true);
%% create inputs
inputData = transform(allData,@(data) rearrange_input(data)); % extract and rearrange input
%% create targets (can be optimized...)
targetData = transform(allData,@(data) rearrange_target(data));
myLabels=targetData.readall;
writematrix(myLabels,'myLabels.txt');
labelStore = tabularTextDatastore('myLabels.txt','TextscanFormats','%C',"ReadVariableNames",false);
read_size=1; % needed: read one label per call so each read lines up with the single file returned by inputData
labelStore.ReadSize = read_size;
labelStoreCell = transform(labelStore,@setcat_and_table_to_cell);
%% combining
cdsTrain = combine(inputData,labelStoreCell);
%% training
% dummy layers and options
numClasses=2;
layers=[
imageInputLayer([50 1 12],"Name","imageinput")
batchNormalizationLayer()
leakyReluLayer(0.1)
fullyConnectedLayer(numClasses,'Name','fc')
softmaxLayer('Name','soft')
classificationLayer('Name','classification')];
options = trainingOptions('adam');
net = trainNetwork(cdsTrain,layers,options);
%% functions
function inputData = rearrange_input(data)
inputData=data.material;
inputData= {inputData};
end
function targetData = rearrange_target(data)
targetData=data.matType;
targetData=categorical(cellstr(targetData));
end
function [dataout] = setcat_and_table_to_cell(datain)
validcats = ["typeA", "typeB"];
datain.(1) = setcats(datain.(1),validcats);
dataout = table2cell(datain);
end
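As mentioned above, the labels can also be taken from the folder names. The following is a minimal sketch of that idea, not a tested replacement for the code above: a single fileDatastore whose ReadFcn returns a 1x2 cell {input, label} for each file, so no text file and no combine step are needed. The helper readSample, the scratch folder, and the dummy files are hypothetical and exist only to make the sketch self-contained; it assumes each file stores a variable named material and sits in a folder named after its class.

```matlab
% Build a small scratch data set: <root>/<className>/material_k.mat
root = tempname;                         % hypothetical scratch folder
classNames = ["typeA","typeB"];
for k = 1:numel(classNames)
    classDir = fullfile(root,char(classNames(k)));
    mkdir(classDir);
    material = k*ones(50,1,12);          % dummy 50x1x12 input
    save(fullfile(classDir,sprintf('material_%d.mat',k)),'material');
end

% One datastore that returns {input, label} per read.
dsTrain = fileDatastore(root, ...
    'ReadFcn',@(f) readSample(f,classNames), ...
    'FileExtensions','.mat', ...
    'IncludeSubfolders',true);

sample = read(dsTrain);                  % 1x2 cell: {50x1x12 double, categorical}
disp(size(sample{1}))
disp(sample{2})
reset(dsTrain);

% With layers and options defined as above, training would then be:
% net = trainNetwork(dsTrain,layers,options);

function sample = readSample(filename,classNames)
% Hypothetical helper: load one file and derive its label from the parent folder.
s = load(filename);                                  % struct with field 'material'
[~,folderName] = fileparts(fileparts(filename));     % name of the parent folder
label = categorical(string(folderName),classNames);  % fixed category set
sample = {s.material, label};
end
```

Passing classNames to categorical keeps the category set identical for every file, which plays the same role as setcats in the code above.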