MATLAB: How to continue training a DQN agent in the reinforcement learning toolbox

I have created a neural network and a DQN agent with the MATLAB Reinforcement Learning Toolbox, using the following code:

createEnvironment
createDQNetwork                            % Produces critic, criticOptions & GPU
createDQNOptions                           % Produces agentOptions
createDQNTrainingOptions                   % Produces trainOptions & parallel processing
agent = rlDQNAgent(critic,agentOptions);   % Create the agent
validateEnvironment(env)

After this, I begin training the agent using the following code.

trainingResults = train(agent,env,trainOptions);
curDir = pwd;
saveDir = 'savedAgents';
cd(saveDir)
save(['trainedAgent' datestr(now,'mm_dd_yyyy_HHMM')],'agent','-v7.3');
% save(['trainedAgent' datestr(now,'mm_dd_yyyy_HHMM')],'agent','trainingResults','-v7.3');
cd(curDir)

The agent begins training successfully, and I can see that it is learning how to control the system. Due to system memory constraints, I need to run the training process multiple times. When the first training process is finished, I simply run the following command again:

trainingResults = train(agent,env,trainOptions);

as I don't need to create a brand new agent, network, environment, etc. from scratch. However, when training begins the second time, the agent's behaviour has clearly reverted to what it was when it was first created. How can I resume training the agent while keeping the progress from the previous training session?

Edit: My system has 64 GB of RAM, and getting more isn't really an option.

Best Answer

Hi James,

It looks like the experience buffer is the culprit here. Have a look at this question for a suggestion. In short, you need to make sure the experience buffer is saved along with the agent when you stop training. I would also recommend reducing the size of the experience buffer just enough to lower memory usage and make it feasible to train in one go.
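
As a rough sketch of what that could look like, assuming the critic, agentOptions, env and trainOptions variables from the question already exist in the workspace: the three properties below are rlDQNAgentOptions settings, but their defaults and availability can differ between releases, so check the documentation for your version. The buffer length of 1e5 and the file name trainedAgent.mat are only illustrative placeholders, not recommended values.

```matlab
% Assumption: critic, agentOptions, env and trainOptions were created by the
% scripts shown in the question.

% Store the experience buffer inside the agent when it is saved to disk.
agentOptions.SaveExperienceBufferWithAgent = true;
% Keep the existing buffer contents when train is called again.
agentOptions.ResetExperienceBufferBeforeTraining = false;
% Illustrative value only: a smaller buffer reduces memory usage.
agentOptions.ExperienceBufferLength = 1e5;

agent = rlDQNAgent(critic,agentOptions);

% First training session, then save the agent (folder must already exist).
trainingResults = train(agent,env,trainOptions);
agentFile = fullfile('savedAgents','trainedAgent.mat');   % hypothetical name
save(agentFile,'agent','-v7.3');

% Later session: reload the saved agent and continue training it.
loaded = load(agentFile,'agent');
agent = loaded.agent;
trainingResults = train(agent,env,trainOptions);
```

Note that each call to train starts a new run with its own stopping criteria from trainOptions, so the episode count restarts even though the agent and its buffer carry over.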
