MATLAB: Error using GPU in Matlab 2020a for Reinforcement Learning

gpu, MATLAB, Parallel Computing Toolbox, reinforcement learning, Reinforcement Learning Toolbox, rlrepresentationoptions, simulink

I keep running into this error when using 'UseDevice',"gpu" in rlRepresentationOptions. The issue seems to appear after the simulation has been running for a random period of time. I have tried this with multiple built-in examples and with both DDPG and TD3 agents. Could someone tell me whether I am doing something wrong, or is this a bug?
Error using rl.env.AbstractEnv/simWithPolicy (line 70)
An error occurred while simulating "IntegratedFlyingRobot" with the agent "agent".
Error in rl.task.SeriesTrainTask/runImpl (line 33)
[varargout{1},varargout{2}] = simWithPolicy(this.Env,this.Agent,simOpts);
Error in rl.task.Task/run (line 21)
[varargout{1:nargout}] = runImpl(this);
Error in rl.task.TaskSpec/internal_run (line 159)
[varargout{1:nargout}] = run(task);
Error in rl.task.TaskSpec/runDirect (line 163)
[this.Outputs{1:getNumOutputs(this)}] = internal_run(this);
Error in rl.task.TaskSpec/runScalarTask (line 187)
runDirect(this);
Error in rl.task.TaskSpec/run (line 69)
runScalarTask(task);
Error in rl.train.SeriesTrainer/run (line 24)
run(seriestaskspec);
Error in rl.train.TrainingManager/train (line 291)
run(trainer);
Error in rl.train.TrainingManager/run (line 160)
train(this);
Error in rl.agent.AbstractAgent/train (line 54)
TrainingStatistics = run(trainMgr);
Caused by:
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 689)
Invalid input argument type or size such as observation, reward, isdone or loggedSignals.
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 689)
Unable to compute gradient from representation.
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 689)
Unable to evaluate the loss function. Check the loss function and ensure it runs successfully.
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 689)
Input data dimensions must match the dimensions specified in the corresponding observation and action info specifications.
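For reference, the representations are set up roughly like this (a minimal sketch: criticNet, actorNet, obsInfo, actInfo, and env come from the built-in example and are not shown here, and the 'observation'/'action' input-layer names are placeholders). The only difference from the working CPU run is the 'UseDevice' setting:

% Minimal sketch (assumes criticNet, actorNet, obsInfo, actInfo, and env
% already exist, e.g. as in the built-in flying-robot DDPG example; the
% 'observation'/'action' layer names below are placeholders).
repOpts = rlRepresentationOptions('UseDevice',"gpu");   % error appears only with "gpu", not "cpu"

critic = rlQValueRepresentation(criticNet,obsInfo,actInfo, ...
    'Observation',{'observation'},'Action',{'action'},repOpts);
actor  = rlDeterministicActorRepresentation(actorNet,obsInfo,actInfo, ...
    'Observation',{'observation'},'Action',{'action'},repOpts);

agent = rlDDPGAgent(actor,critic);         % same behaviour with a TD3 agent (rlTD3Agent)
trainOpts  = rlTrainingOptions('MaxEpisodes',500,'MaxStepsPerEpisode',600);
trainStats = train(agent,env,trainOpts);   % the error is thrown partway through training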