MATLAB: Reinforcement learning and DDPG agent problem

ddpg algorithm

I used a deep reinforcement learning toolbox to path planning of a robot, including the DDPG algorithm. My scenario is that the robot starts from a random position and reaches the random goal location. After training, the result is a fixed path! And with changing the goal position, the path does not change. It is as if the network has learned only one path. The Drop-out layer is used in the network structure.

Does anyone have any idea what went wrong?

Best Answer

Looks like training was not successful. There could be many things at fault here - some suggestions:

1) Make sure you are randomizing the target locations at the beginning of each episode. It would help if you add visualization to actually verify targets move/debug the agent's behavior during training

2) The agent may not have enough information available to make decisions. Make sure the observations provide enough info to the agent

3) What does the episode manager plot look like when training stops? You may need to train the agent for more time

4) Why are you using a dropout layer? Unless your observations are images, this layer islikely not required (at least I don't think I have seen it in any shipping examples in Reinforcement Learning Toolbox). So your neural network architecture may also have something to do with this behavior.

Related Solutions

MATLAB: QTable reset when using train

If you stop training, you should be able to continue from where you left off. I called 'train' on the basic grid world example a couple of times in a row and the output of 'getLearnableParameters(getCritic(qAgent))' was different. You can always save the trained agent and reload it as well to make sure you don't accidentally delete it.

Update:

There is a regularization term added to the loss which causes the other entries to change slightly. To avoid this, you can type:

qRepresentation.Options.L2RegularizationFactor=0;

MATLAB: How to extract a trained RL Agent’s network’s weights and biases

You can get the parameters from the trained's critic representation for DQN agent. In MATLAB R2020a, see getLearnableParameters and getCritic functions (function name changes a bit since R2019b). You can follow similar steps to get the actor's parameters from actor-based agent like DDPG or PPO.

critic = getCritic(agent);
criticParams = getLearnableParameters(critic);

Best Answer

Related Solutions

MATLAB: QTable reset when using train

MATLAB: How to extract a trained RL Agent’s network’s weights and biases

Related Question