MATLAB: DDPG episode rewards never change during the whole training process

ddpg training, MATLAB, reinforcement learning

I'm training a DDPG agent with Reinforcement Learning Toolbox in MATLAB R2020a for a path planning problem. As you can see, the DDPG episode rewards and average rewards never change over 5000 episodes. I used a simple neural network with 20 neurons and three layers, a learning rate of 0.01, and a gradient threshold of 1. I then tried setting the weights and biases of the fully connected layers and changing my reward function, but the result is the same.

I also saw that others have had a similar problem. Does anyone have advice for my problem? Thank you.

Best Answer

It looks like the scales of Q0 and the episode reward are very different. Try unchecking "Show Episode Q0" to see if the episode reward changes. I would then simplify the critic network to make sure it outputs values on a similar scale to the episode reward.
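One way to check the scale difference outside the Episode Manager is to plot the logged statistics on separate axes. The following is a minimal sketch, assuming the structure returned by train was stored in a variable named trainingStats (a name chosen here for illustration) and that the agent has a critic, so the EpisodeQ0 field is populated.

```matlab
% Assumes trainingStats was obtained from an earlier call such as:
%   trainingStats = train(agent, env, trainOpts);
episodes = trainingStats.EpisodeIndex;

figure;

% Top axes: episode reward and average reward on their own scale
subplot(2,1,1);
plot(episodes, trainingStats.EpisodeReward, episodes, trainingStats.AverageReward);
legend('Episode reward', 'Average reward');
xlabel('Episode');
ylabel('Reward');

% Bottom axes: critic estimate Q0 on its own scale
subplot(2,1,2);
plot(episodes, trainingStats.EpisodeQ0);
xlabel('Episode');
ylabel('Episode Q0');

% Print the value ranges so the difference in scale is explicit
fprintf('Episode reward range: [%g, %g]\n', ...
    min(trainingStats.EpisodeReward), max(trainingStats.EpisodeReward));
fprintf('Episode Q0 range:     [%g, %g]\n', ...
    min(trainingStats.EpisodeQ0), max(trainingStats.EpisodeQ0));
```

If the Q0 range is orders of magnitude larger than the reward range, a reward curve that looks flat on a shared axis may still be changing.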

Related Solutions

MATLAB: How to extract a trained RL Agent’s network’s weights and biases

For a DQN agent, you can get the parameters from the trained agent's critic representation. In MATLAB R2020a, see the getCritic and getLearnableParameters functions (the function names have changed slightly since R2019b). You can follow similar steps to get the actor's parameters from an actor-based agent such as DDPG or PPO.

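% Extract the critic representation from the trained agent
critic = getCritic(agent);
% Get the critic's learnable parameters (weights and biases)
criticParams = getLearnableParameters(critic);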
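For an actor-based agent, the same pattern might look like the sketch below. It assumes agent is an already trained DDPG agent in the workspace. The names actor and actorParams are illustrative only.

```matlab
% Assumes "agent" is a trained actor-based agent (for example, DDPG).
actor = getActor(agent);                      % extract the actor representation
actorParams = getLearnableParameters(actor);  % cell array of learnable parameter values

% Print the size of each parameter array to see which entries
% correspond to weights and which to biases
for k = 1:numel(actorParams)
    fprintf('Parameter %d: size [%s]\n', k, num2str(size(actorParams{k})));
end
```

The entries are typically the weights and biases of the network's learnable layers, so the printed sizes can be matched against the layer definitions.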

MATLAB: Episode simulation doesn’t run while training DDPG

Hi Alice,

This example has not been set up to update the visualization during training. If you add a MATLAB Function block following this example, you should be able to update it.
