MATLAB: "Invalid observation type or size." error in Simulink varies with the quantization interval (constraining observation signals, Reinforcement Learning Toolbox)

q-learning, reinforcement learning, Reinforcement Learning Toolbox

I am trying to train a Q-learning agent to control a heating system. I set the observation rlFiniteSetSpec values to -6:0.1:6 and the action rlFiniteSetSpec values to [-5;-2;-1;0;1;2;5]. The observation signal in Simulink is constrained with a Saturation block (limits -6 to 6) followed by a Quantizer block.
I would like to know why, if I set the quantization interval to 0.1, I get the error "Invalid observation type or size.", but if I set the quantization interval to 0.5 or 1 there is no issue and I can train the agent.
% Observation spec: discretized error signal, -6 to 6 in steps of 0.1
obsInfo = rlFiniteSetSpec(-6:0.1:6);
obsInfo.Name = 'observations';
obsInfo.Description = 'err';
numObservations = obsInfo.Dimension(1);

% Action spec: discrete control steps
actInfo = rlFiniteSetSpec([-5;-2;-1;0;1;2;5]');
actInfo.Name = 'krok';
numActions = actInfo.Dimension(1);

% Simulink environment with a randomized reset function
env = rlSimulinkEnv('gmpsn','gmpsn/RL Agent',obsInfo,actInfo);
env.ResetFcn = @(in)localResetFcn(in);

Ts = 60;     % agent sample time
Tf = 50000;  % simulation stop time per episode

% Tabular Q-value critic over the finite observation and action sets
qTable = rlTable(getObservationInfo(env),getActionInfo(env));
critic = rlQValueRepresentation(qTable,getObservationInfo(env),getActionInfo(env));

% Q-learning agent with epsilon-greedy exploration
agentOpts = rlQAgentOptions(...
    'SampleTime',Ts,...
    'DiscountFactor',0.99);
agentOpts.EpsilonGreedyExploration.Epsilon = 0.9;
agent = rlQAgent(critic,agentOpts);

% Training options
maxepisodes = 5000;
maxsteps = ceil(Tf/Ts);
trainOpts = rlTrainingOptions(...
    'MaxEpisodes',maxepisodes, ...
    'MaxStepsPerEpisode',maxsteps, ...
    'ScoreAveragingWindowLength',20, ...
    'Verbose',false, ...
    'Plots','training-progress',...
    'StopTrainingCriteria','AverageReward',...
    'StopTrainingValue',800);
trainingStats = train(agent,env,trainOpts);
Validate Trained Agent
Validate the learned agent against the model by simulation.
simOpts = rlSimulationOptions('MaxSteps',maxsteps,'StopOnError','on');
experiences = sim(env,agent,simOpts);
Local Function
function in = localResetFcn(in)
% Randomize the setpoint temperature between 19 and 21
blk = 'gmpsn/Tzad';
h = 19 + rand(1)*2;
in = setBlockParameter(in,blk,'Value',num2str(h));
% Randomize the initial plant temperatures between 15 and 18
h = 15 + rand(1)*3;
blk = 'gmpsn/Obiekt/Twwew';
in = setBlockParameter(in,blk,'Value',num2str(h));
blk = 'gmpsn/Obiekt/Tgg';
in = setBlockParameter(in,blk,'Value',num2str(h));
blk = 'gmpsn/Obiekt/Tss';
in = setBlockParameter(in,blk,'Value',num2str(h));
end
This is the error I get when I try to train the agent with the quantization interval set to 0.1:
Error using rl.env.AbstractEnv/simWithPolicy (line 70)
An error occurred while simulating "gmpsn" with the agent "agent".
Error in rl.env.AbstractEnv/sim (line 135)
[experiencesCell,simInfo] = simWithPolicy(env,policy,opts);
Caused by:
    Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 689)
    Invalid observation type or size.
    Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 689)
    Invalid observation type or size.
    Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 689)
    Unable to evaluate representation.
    Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 689)
    Invalid data specified. The data must be an element of the rlFiniteSetSpec.

Best Answer

Hello,
This is likely due to numerical rounding effects that happen when quantizing (see the Quantizer block documentation). The interval 0.1 has no exact binary floating-point representation, so the quantized output round(u/q)*q can differ from the corresponding element of rlFiniteSetSpec(-6:0.1:6) by a unit in the last place. When the quantization interval is, e.g., 0.5 or 1, the interval is an exact power of two, multiplying by it only shifts bits, and the numerical problem does not show up.
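You can see the underlying floating-point effect at the MATLAB prompt (a minimal illustration, independent of your model):
% 0.1 has no exact binary representation, so multiples of it pick up
% rounding error and need not match colon-generated values exactly:
fprintf('%.20f\n', 0.1)   % prints 0.10000000000000000555...
isequal(3*0.1, 0.3)       % false: the two doubles differ in the last bit
% A Quantizer block outputs round(u/q)*q; with q = 0.1 such results can
% miss the exact elements stored in rlFiniteSetSpec(-6:0.1:6).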
To work around this, you could either
1) use a DQN agent, which accepts continuous observations (see the sketch after this list), or
2) use Q-learning as you are doing now, but scale your observations so that they are integers (sketched below). So instead of
obsInfo = rlFiniteSetSpec(-6:0.1:6);
you can do
obsInfo = rlFiniteSetSpec(-60:1:60);
Then, assuming your Simulink model outputs values in -6:0.1:6, you can multiply the signal by 10 and round it to an integer (for example with a Rounding Function block). That way you ensure there are no numerical errors and every observation is exactly an element of the rlFiniteSetSpec definition.
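A minimal sketch of option 2, assuming the error signal is scaled by 10 inside the model (a Gain block after the Saturation block, then a rounding block before the RL Agent block):
% Scaled-integer observation spec: every element of -60:1:60 is an
% integer, so the rounded Simulink signal matches the set exactly.
obsInfo = rlFiniteSetSpec(-60:1:60);
obsInfo.Name = 'observations';
obsInfo.Description = 'err x 10';
And a minimal sketch of option 1; the network layout and layer names here are illustrative, not from the original post, and assume a release where featureInputLayer and rlQValueRepresentation are both available:
% Continuous observation spec: no Quantizer block needed at all.
obsInfo = rlNumericSpec([1 1],'LowerLimit',-6,'UpperLimit',6);
obsInfo.Name = 'observations';
actInfo = rlFiniteSetSpec([-5 -2 -1 0 1 2 5]);
actInfo.Name = 'krok';
% Small multi-output Q network: one Q-value per discrete action.
dnn = [
    featureInputLayer(1,'Normalization','none','Name','state')
    fullyConnectedLayer(24,'Name','fc1')
    reluLayer('Name','relu1')
    fullyConnectedLayer(numel(actInfo.Elements),'Name','out')];
critic = rlQValueRepresentation(dnn,obsInfo,actInfo,'Observation',{'state'});
agent = rlDQNAgent(critic,rlDQNAgentOptions('SampleTime',Ts,'DiscountFactor',0.99)); % Ts = 60 as in your script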
Hope that helps