MATLAB: BayesOpt for Custom Neural Network

Tags: bayesian optimization, machine learning, neural network, neural networks, optimization, Statistics and Machine Learning Toolbox

I'm trying to use Bayesian Optimization for my custom neural network, but the tutorials don't make it clear how to apply bayesopt to my own network.
My current understanding is that I need to use the validation and training losses as inputs to the objective function, but I'm at a loss on how to actually do this.
What I'm Trying to Tune:
  • Hyperparameters (max epochs (just to get in the neighborhood), mini-batch size, initial learning rate, etc.)
  • Number of hidden layers
  • The size of my fully connected layers
Right now I'm doing this iteratively by hand, but I'm looking for a more systematic approach (hence BayesOpt).
This is what I have setup currently in order to iterate over different epochs, hidden layers, and the size of fully connected layers. I know I can add in the other hyperparameters to trainingOptions, but this is all I'm iterating over at the moment, and am leaving the rest to default values.
Thoughts?
function [net,tr] = betNet(X,y,X_test,y_test,X_cv,y_cv,maxE,NHL,fcls)
%fcls = Fully Connected Layer Size
%NHL  = Number of Hidden Layers
%maxE = Maximum Epochs
%% ===== Setting up DNN =====
% Hidden fully connected layers and the 2-class output layer
fcl1 = fullyConnectedLayer(fcls,'BiasInitializer','narrow-normal');
fcl2 = fullyConnectedLayer(2,'BiasInitializer','ones');
ip = sequenceInputLayer(size(X,1));
options = trainingOptions('adam',...
    'MaxEpochs',maxE,...
    'ExecutionEnvironment','gpu',...
    'Shuffle','every-epoch',...
    'MiniBatchSize',64,...
    'ValidationFrequency',50,...
    'ValidationData',{X_cv,y_cv}); % semicolon suppresses console dump of options
layers = [ip repmat(fcl1,1,NHL) fcl2 softmaxLayer classificationLayer];
%% ===== Training NN =====
[net,tr] = trainNetwork(X,y,layers,options);
end
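For reference, the general bayesopt workflow is: define optimizableVariable entries for each hyperparameter, write an objective function that takes one table row of values and returns a scalar loss, then call bayesopt on the pair. A minimal sketch of that pattern (the quadratic objective here is a made-up stand-in for network training, just to show the shapes involved):

% Minimal bayesopt pattern; the toy objective stands in for "train net, return loss"
vars = [optimizableVariable('lr',[1e-3 1],'Transform','log');
        optimizableVariable('nLayers',[1 10],'Type','integer')];
% The objective receives a 1-row table of variable values and returns a scalar
objFcn = @(t) (log10(t.lr) + 2)^2 + 0.1*(t.nLayers - 4)^2;
results = bayesopt(objFcn,vars,'MaxObjectiveEvaluations',15,'Verbose',0);
bestPoint(results)  % table of the best hyperparameter values found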

Best Answer

I believe I solved it. I'm letting it run and will find out when it's finished.
% bayesNet
% (X,y,X_test,y_test,X_cv,y_cv,maxE,NHL,fcls)
X = Training_Data.X;
Y = Training_Data.y;
X_cv = Training_Data.X_cv;
Y_cv = Training_Data.y_cv;
%% Choose Variables to Optimize
mbsRange  = [10 120];                       % MBS  = MiniBatchSize
maxErange = [100 10000];                    % maxE = maximum epochs
nhlRange  = [5 200];                        % NHL  = number of hidden layers
fclsRange = [100 size(Training_Data.X,1)];  % fcls = Fully Connected Layer Size
optimVars = [
    optimizableVariable('MiniBatchSize',mbsRange,'Type','integer');
    optimizableVariable('maxEpochs',maxErange,'Type','integer');
    optimizableVariable('NHL',nhlRange,'Type','integer');
    optimizableVariable('fcls',fclsRange,'Type','integer');
    optimizableVariable('InitialLearnRate',[1e-2 1],'Transform','log')];
%% Perform Bayesian Optimization
ObjFcn = makeObjFcn(Training_Data.X,Training_Data.y,Training_Data.X_cv,Training_Data.y_cv);
BayesObject = bayesopt(ObjFcn,optimVars,...
    'MaxObjectiveEvaluations',30,...
    'MaxTime',8*60*60,...
    'IsObjectiveDeterministic',false,...
    'UseParallel',false);
%% Evaluate Final Network
% Load the best network found during optimization and score it on held-out
% test data (assumed here to live in Training_Data alongside X and X_cv)
bestIdx  = BayesObject.IndexOfMinimumTrace(end);
fileName = BayesObject.UserDataTrace{bestIdx};
load(fileName,'net','valError');
[Y_predicted,scores] = classify(net,Training_Data.X_test);
% perform() is for shallow networks; for a trainNetwork net, use the
% misclassification rate (labels must be categorical for the comparison)
testError = 1 - mean(Y_predicted == Training_Data.y_test);
testError
valError
%% Objective Function for Optimization
function ObjFcn = makeObjFcn(X,Y,X_cv,Y_cv)
ObjFcn = @valErrorFun;
    function [valError,cons,fileName] = valErrorFun(optVars)
        % Current hyperparameter draws for layer construction
        NHL  = optVars.NHL;
        fcls = optVars.fcls;
        %% ===== Setting up DNN =====
        fcl1 = fullyConnectedLayer(fcls,'BiasInitializer','narrow-normal');
        fcl2 = fullyConnectedLayer(2,'BiasInitializer','ones');
        ip = sequenceInputLayer(size(X,1));
        options = trainingOptions('adam',...
            'MaxEpochs',optVars.maxEpochs,...
            'ExecutionEnvironment','gpu',...
            'Shuffle','every-epoch',...
            'MiniBatchSize',optVars.MiniBatchSize,...
            'InitialLearnRate',optVars.InitialLearnRate,...
            'ValidationFrequency',50,...
            'ValidationData',{X_cv,Y_cv});
        layers = [ip repmat(fcl1,1,NHL) fcl2 softmaxLayer classificationLayer];
        %% ===== Training NN =====
        [net,~] = trainNetwork(X,Y,layers,options);
        % Validation misclassification rate is the scalar bayesopt minimizes
        YPredicted = classify(net,X_cv);
        valError = 1 - mean(YPredicted == Y_cv);
        % Save each trained net, keyed by its error, so the best can be reloaded
        fileName = num2str(valError) + ".mat";
        save(fileName,'net','valError');
        cons = [];
    end
end
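Once the run finishes, you can also pull the best hyperparameters straight from the returned BayesianOptimization object and retrain one final network, instead of relying only on the saved .mat files. A short sketch (assumes the same BayesObject and layer-building code as above):

% Retrieve the best observed point from the finished run
best = bestPoint(BayesObject);  % 1-row table: MiniBatchSize, maxEpochs, NHL, fcls, InitialLearnRate
fprintf('Best validation error: %g\n', BayesObject.MinObjective);
% Rebuild the layer stack with best.NHL and best.fcls, set trainingOptions
% from best.MiniBatchSize, best.maxEpochs, best.InitialLearnRate, and call
% trainNetwork once more on the full training set.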