MATLAB: What is the number of LSTM cells in the network


Hello,
I wrote the following code:
data = readmatrix('Google_Stock_Price.csv');
data=data(:,2);
data=transpose(data);
mu = mean(data);
sig = std(data);
data_scaled = (data - mu) / sig;
ts=100; % time_steps
tn=5; % number of future observations to predict (multi-step forecasting)
ss=1; % Stride
x = [];
y = [];
for i=ts:ss:length(data_scaled)-tn
x = [x; data_scaled(1,i+1-ts:i)];
y = [y; data_scaled(1,i+1:i+tn)]; % Forecasting targets
end
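numObs = size(x,1); % number of windows; length(x) would return the largest dimension instead
x_cell = cell(numObs,1); % x and y have the same number of rows
y_cell = cell(numObs,1);
for i = 1:numObs
x_cell{i} = x(i,:); % 1-by-ts input sequence (numFeatures = 1)
y_cell{i} = transpose(y(i,:)); % tn-by-1 target
end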
numTrain = floor(0.75*size(x,1)); % round down so the split index is always an integer
YTrain = y(1:numTrain,:);
YVal = y(numTrain+1:end,:);
XTrain_cell = x_cell(1:numTrain);
XVal_cell = x_cell(numTrain+1:end);
numFeatures = 1;
numResponses = tn;
numHiddenUnits = 200;
layers = [ ...
sequenceInputLayer(numFeatures)
lstmLayer(numHiddenUnits,'OutputMode','sequence')
%dropoutLayer(0.2)

lstmLayer(numHiddenUnits,'OutputMode','last')
%dropoutLayer(0.2)
fullyConnectedLayer(numResponses)
regressionLayer];
miniBatchSize = 31;
options = trainingOptions('adam', ...
'MaxEpochs',100, ...
'MiniBatchSize',miniBatchSize ,...
'ValidationFrequency',miniBatchSize , ...
'SequenceLength','longest',...
'Verbose',0,...
'Shuffle','once',...
'Plots','training-progress','ValidationData',{XVal_cell,YVal});
[net, info] = trainNetwork(XTrain_cell,YTrain,layers,options);
Using the last 100 time steps (ts=100), I predict the next five time steps (tn=5). My question is not about the loss or the RMSE. I only want to know how many LSTM cells (LSTM blocks) I have in this example. My understanding is that, because every input sequence has a fixed length (ts=100), MATLAB uses that length as the number of LSTM cells. Please let me know whether this is correct.
Thanks

Best Answer

Hi Mohamad,
The "Sequence Padding, Truncation, and Splitting" section of the documentation page Long Short-Term Memory Networks has some helpful diagrams showing how sequences are padded when using different options of SequenceLength.
When training, the data is divided into mini-batches of size MiniBatchSize. The sequences in each mini-batch can have different lengths:
  • If SequenceLength is 'longest', then the function pads the sequences in the mini-batch so that they have the same length as the longest sequence in the mini-batch.
  • If SequenceLength is 'shortest', then the function truncates the sequences in the mini-batch so that they have the same length as the shortest sequence in the mini-batch.
  • If SequenceLength is a positive integer L, then the function pads the sequences in the mini-batch to the smallest multiple of L that is larger than the length of the longest sequence in the mini-batch. Then, if necessary, the function further splits the mini-batch along the time dimension into mini-batches of length L.
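To make the three options concrete, here is a minimal sketch of the length arithmetic described above. It uses only plain MATLAB arithmetic and does not call any Deep Learning Toolbox function. The variable names and the example lengths are made up for illustration, and it assumes that a longest sequence whose length is already a multiple of L needs no extra padding.

```matlab
% Sketch only: plain arithmetic, no Deep Learning Toolbox calls.
% All variable names and example lengths below are hypothetical.
seqLens = [60 85 100];        % made-up sequence lengths in one mini-batch

lenLongest  = max(seqLens);   % 'longest': pad every sequence to this length
lenShortest = min(seqLens);   % 'shortest': truncate every sequence to this length

% Positive integer L: pad to a multiple of L, then split along time.
% Assumption: a longest length that is already a multiple of L is not padded further.
L = 40;
lenPadded = ceil(max(seqLens) / L) * L;   % padded length of the mini-batch
numChunks = lenPadded / L;                % number of length-L pieces after splitting

% In the question, every window has ts = 100 steps, so 'longest' adds no padding.
askerLens = 100 * ones(1, 31);            % one mini-batch of 31 windows
padNeeded = max(askerLens) - askerLens;   % padding per sequence, all zeros here

fprintf('longest: %d, shortest: %d, L=%d: padded to %d in %d chunks\n', ...
    lenLongest, lenShortest, L, lenPadded, numChunks);
```

With the code from the question, all sequences in a mini-batch already have the same length, so 'longest' and 'shortest' both leave them at 100 time steps.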