MATLAB: How to give the agent its reward after the simulation is done instead of while it is running

reinforcement learning

I'm working on a project that tries to tune a motor's PID controller parameters using reinforcement learning. Here is my idea:

I can modify MATLAB's rlwatertank example, replacing the water tank with a motor plus a PID controller. The agent outputs the PID gains Kp, Ki and Kd. Once it has output them, I can run a simulation and measure the errors in the motor's step response, such as percent overshoot, settling time and steady-state error. I then use these errors to calculate the reward and send it to the agent.

The problem is that I don't know how to give the agent its reward after each simulation is done, instead of giving it a reward while the simulation is still running. Does anyone have an idea?

Thanks a lot in advance.

Best Answer

Hi,

I understand you are using MATLAB’s rlwatertank example as a base for training a motor + PID controller setup, and that you want to give the agent its reward after the simulation ends rather than while it is running.

You may follow the steps below:

1. Train the agent for a first iteration with a fixed number of episodes, then stop training.
2. Simulate the result once and calculate the motor-PID response metrics mentioned in the question (overshoot, settling time, steady-state error).
3. Calculate the reward from those metrics with your own logic.
4. Update the agent with the reward calculated in the previous step.
5. Use this agent as the starting point of the next iteration, then return to the first step to update the agent again based on the new metrics.
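
The second and third steps above might look like the following minimal sketch. It assumes the Control System Toolbox is available. The motor transfer function, the example gains and the reward weights are all hypothetical placeholders, not values from the question. The point it illustrates is that the reward is computed once, from the complete step response, after the simulation has finished.

```matlab
% Hypothetical DC-motor speed model (illustrative parameter values only).
J = 0.01; b = 0.1; K = 0.01; R = 1; L = 0.5;
motor = tf(K, [J*L, J*R + b*L, b*R + K^2]);

% Stand-in for the gains [Kp Ki Kd] that the agent would output.
gains = [100 200 10];

% Close the loop around the PID controller with unity feedback.
controller = pid(gains(1), gains(2), gains(3));
closedLoop = feedback(controller * motor, 1);

if isstable(closedLoop)
    % Metrics of the complete unit-step response.
    info = stepinfo(closedLoop);
    steadyStateError = abs(1 - dcgain(closedLoop));
    % Assumed weights: smaller errors give a larger (less negative) reward.
    reward = -(0.1 * info.Overshoot + info.SettlingTime + 10 * steadyStateError);
else
    % Arbitrary penalty for gains that destabilize the loop.
    reward = -100;
end

fprintf('Reward for this run: %.3f\n', reward);
```

The weights decide how overshoot (in percent), settling time (in seconds) and steady-state error trade off against each other, so they would need tuning for a real motor.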

You can use a for loop to repeat the training multiple times.

To initialize the agent, you can load the pretrained agent and use it as the agent for the current iteration. For tuning other parameters, refer to rlTrainingOptions and rlDDPGAgentOptions.
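
A rough sketch of such a loop is shown here, assuming Reinforcement Learning Toolbox R2020b or later (for the default rlDDPGAgent constructor). The predefined 'DoubleIntegrator-Continuous' environment is only a stand-in for the motor model, and the file name, iteration count and episode counts are hypothetical values kept small so the example finishes quickly.

```matlab
% Stand-in environment; replace with the motor + PID environment.
env = rlPredefinedEnv('DoubleIntegrator-Continuous');
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);

% Default DDPG agent built from the specifications (R2020b or later).
agent = rlDDPGAgent(obsInfo, actInfo);

% Short training runs, without the progress window or console output.
trainOpts = rlTrainingOptions( ...
    'MaxEpisodes', 5, ...
    'MaxStepsPerEpisode', 50, ...
    'Plots', 'none', ...
    'Verbose', false);

% Hypothetical file used to carry the agent from one iteration to the next.
agentFile = fullfile(tempdir, 'pidTuningAgent.mat');
if isfile(agentFile)
    delete(agentFile);  % start from scratch for this demonstration
end

numIterations = 3;
for k = 1:numIterations
    % Start from the agent saved by the previous iteration, if there is one.
    if isfile(agentFile)
        loaded = load(agentFile, 'agent');
        agent = loaded.agent;
    end

    % train updates the agent object as training progresses.
    trainingStats = train(agent, env, trainOpts);

    % Evaluate the step response and compute your own reward here.

    save(agentFile, 'agent');
end
```

Saving to a MAT-file between iterations is one option. Since the agent stays in the workspace, the save and load calls mainly matter when the iterations run in separate MATLAB sessions.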

Related Solutions

MATLAB: How to extract a trained RL Agent’s network’s weights and biases

You can get the parameters from the trained agent's critic representation for a DQN agent. In MATLAB R2020a, see the getLearnableParameters and getCritic functions (the function names have changed slightly since R2019b). You can follow similar steps to get the actor's parameters from an actor-based agent such as DDPG or PPO.

% agent is the trained agent object
critic = getCritic(agent);                      % extract the critic representation
criticParams = getLearnableParameters(critic);  % its weights and biases
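
For the actor, the same pattern with getActor might look like the sketch below. It assumes R2020b or later, where rlDDPGAgent can build a default agent from the observation and action specifications. The 3-element specifications are placeholders, and the untrained default agent only stands in for your trained one.

```matlab
% Placeholder specifications: 3 observations, 3 bounded actions (Kp, Ki, Kd).
obsInfo = rlNumericSpec([3 1]);
actInfo = rlNumericSpec([3 1], 'LowerLimit', 0, 'UpperLimit', 10);

% Untrained default agent, used here in place of a trained one.
agent = rlDDPGAgent(obsInfo, actInfo);

% Same two calls as for the critic, applied to the actor.
actor = getActor(agent);
actorParams = getLearnableParameters(actor);

% List the size of each weight and bias array.
for k = 1:numel(actorParams)
    fprintf('Parameter %d: %s\n', k, mat2str(size(actorParams{k})));
end
```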

MATLAB: Is it possible to tune PID controller using “gamultiobj”

Yes, it is possible. See "Optimization of PID Controller for Brushless DC Motor by using Bio-inspired Algorithms": http://www.maxwellsci.com/print/rjaset/v7-1116-1122.pdf
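
As a minimal sketch of the idea (not the method from the linked paper), the following script uses gamultiobj to trade off overshoot against settling time. It assumes the Global Optimization Toolbox and Control System Toolbox are installed. The motor model, gain bounds, penalty values and solver settings are hypothetical choices for illustration.

```matlab
rng(0);  % reproducible run

% Hypothetical DC-motor speed model (illustrative parameter values only).
J = 0.01; b = 0.1; K = 0.01; R = 1; L = 0.5;
motor = tf(K, [J*L, J*R + b*L, b*R + K^2]);

% Two objectives to minimize: percent overshoot and settling time.
objective = @(x) pidObjectives(x, motor);

% Assumed search bounds for [Kp Ki Kd].
lb = [1 1 0];
ub = [200 300 20];

options = optimoptions('gamultiobj', ...
    'PopulationSize', 30, ...
    'MaxGenerations', 20, ...
    'Display', 'off');

[gainsPareto, objectivesPareto] = gamultiobj(objective, 3, ...
    [], [], [], [], lb, ub, options);

% Each row is one Pareto-optimal trade-off.
disp(array2table([gainsPareto, objectivesPareto], ...
    'VariableNames', {'Kp', 'Ki', 'Kd', 'OvershootPct', 'SettlingTime'}));

function f = pidObjectives(x, plant)
    % Returns [overshoot, settling time] for the gains x = [Kp Ki Kd].
    penalty = [1e3, 1e3];  % arbitrary cost for unusable gains
    closedLoop = feedback(pid(x(1), x(2), x(3)) * plant, 1);
    if ~isstable(closedLoop)
        f = penalty;
        return
    end
    info = stepinfo(closedLoop);
    f = [info.Overshoot, info.SettlingTime];
    if any(~isfinite(f))
        f = penalty;
    end
end
```

gamultiobj returns a set of Pareto-optimal gain combinations rather than a single answer, so you still need to pick the row whose trade-off suits your application.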
