I'm training a DDPG agent with the Reinforcement Learning Toolbox in MATLAB R2020a for a path planning problem. But, as you can see, the episode rewards and average rewards never change over the 5000 training episodes. I used a simple neural network with 20 neurons and three layers; the learning rate is set to 0.01 and the gradient threshold is 1. I then tried setting the weights and biases of the fully connected layers and changing my reward function, but the result was the same.
MATLAB: Why do the DDPG episode rewards never change during the whole training process?
ddpg training, MATLAB, reinforcement learning
Best Answer