Error Training Q-Learning agent with Simulink Model

I am trying to train a Q-Learning agent on a Simulink Model "QLearnModel" with the following script called "Q_Gym":
states = [1:1:9];
actions = [-5:1:5];
%
observationInfo = rlFiniteSetSpec(states, "Name", "States");
actionInfo = rlFiniteSetSpec(actions, "Name", "Actions");
%
Q = rlTable(observationInfo, actionInfo);
critic = rlQValueFunction(Q,observationInfo,actionInfo);
agentOptions = rlQAgentOptions("SampleTime", 0.05);
agent = rlQAgent(critic,agentOptions);
%
environment = rlSimulinkEnv("QLearnModel","QLearnModel/RL Agent", observationInfo, actionInfo);
%
traininingOptions = rlTrainingOptions();
trainingStats = train(agent,environment,traininingOptions);
When I run it, a long list of errors is thrown, of which "Error using rl.util.rlFiniteSetSpec/getElementIndex: Invalid data specified. The data must be an element of the rlFiniteSetSpec" appears to be the most informative:
Error using rl.train.SeriesTrainer/run
There was an error executing the ProcessExperienceFcn for block "QLearnModel/RL Agent".
Caused by:
Error using rl.function.AbstractFunction/gradient
Unable to compute gradient from function model.
Error in rl.agent.rlQAgent/learn_ (line 230)
[CriticGradient, gradInfo] = gradient(this.Critic_,lossFcn,...
Error in rl.agent.AbstractAgent/learn (line 29)
this = learn_(this,experience);
Error in rl.util.agentProcessStepExperience (line 6)
learn(Agent,Exp);
Error in rl.env.internal.FunctionHandlePolicyExperienceProcessor/processExperience_ (line 31)
[this.Policy_,this.Data_] = feval(this.Fcn_,...
Error in rl.env.internal.ExperienceProcessorInterface/processExperienceInternal_ (line 139)
processExperience_(this,experience,infoData);
Error in rl.env.internal.ExperienceProcessorInterface/processExperience (line 78)
stopsim = processExperienceInternal_(this,experience,simTime);
Error in rl.simulink.blocks.PolicyProcessExperience/stepImpl (line 45)
stopsim = processExperience(this.ExperienceProcessor_,experience,simTime);
Error in Simulink.Simulation.internal.DesktopSimHelper
Error in Simulink.Simulation.internal.DesktopSimHelper.sim
Error in Simulink.SimulationInput/sim
Error in rl.env.internal.SimulinkSimulator>localSim (line 259)
simout = sim(in);
Error in rl.env.internal.SimulinkSimulator>@(in)localSim(in,simPkg) (line 171)
simfcn = @(in) localSim(in,simPkg);
Error in MultiSim.internal.runSingleSim
Error in MultiSim.internal.SimulationRunnerSerial/executeImplSingle
Error in MultiSim.internal.SimulationRunnerSerial/executeImpl
Error in Simulink.SimulationManager/executeSims
Error in Simulink.SimulationManagerEngine/executeSims
Error in rl.env.internal.SimulinkSimulator/simInternal_ (line 172)
simInfo = executeSims(engine,simfcn,getSimulationInput(this));
Error in rl.env.internal.SimulinkSimulator/sim_ (line 78)
out = simInternal_(this,simPkg);
Error in rl.env.internal.AbstractSimulator/sim (line 30)
out = sim_(this,simData,policy,processExpFcn,processExpData);
Error in rl.env.AbstractEnv/runEpisode (line 144)
out = sim(simulator,simData,policy,processExpFcn,processExpData);
Error in rl.train.SeriesTrainer/run (line 64)
out = runEpisode(...
Error in rl.train.TrainingManager/train (line 516)
run(trainer);
Error in rl.train.TrainingManager/run (line 253)
train(this);
Error in rl.agent.AbstractAgent/train (line 187)
trainingResult = run(trainMgr,checkpoint);
Error in Q_Gym (line 20)
trainingStats = train(agent,environment,traininingOptions);
Caused by:
Error using rl.util.rlFiniteSetSpec/getElementIndex
Invalid data specified. The data must be an element of the rlFiniteSetSpec.
Error in rl.train.TrainingManager/train (line 516)
run(trainer);
Error in rl.train.TrainingManager/run (line 253)
train(this);
Error in rl.agent.AbstractAgent/train (line 187)
trainingResult = run(trainMgr,checkpoint);
Error in Q_Gym (line 20)
trainingStats = train(agent,environment,traininingOptions);
It seems that the observation the agent receives is not an element of the rlFiniteSetSpec, which is built from the vector defined in the script as:
states = [1:1:9];
The observation is generated by the following Simulink block, which outputs a number between 1 and 9 based on the values of two other variables:
I don't understand how the observation could not be in the defined set.

Answers (1)

Yatharth on 14 Mar 2024
Hi Pietro,
Looking at the issue at hand, I think it might stem from how the observation is generated or interpreted rather than from the setup of your states and actions. To investigate further, I recommend focusing on the following areas:
  1. Even though your Simulink block is designed to output a number between 1 and 9, make sure the output is actually an integer. Because of floating-point arithmetic or block configuration, the output might not be exactly an integer even if it looks like one. You might need a rounding or floor/ceiling block to guarantee an integer output.
  2. Add a Saturation block or a boundary check to your Simulink model to strictly keep the output within the 1 to 9 range. That way, even if the calculations produce a value slightly outside this range, it is corrected before being passed on as an observation.
A general debugging suggestion: to isolate the issue, you can insert a "To Workspace" block in Simulink to monitor the output of the observation-generating block directly. This lets you see exactly which values are produced at each simulation step. Look for any values that are not integers or that fall outside the 1 to 9 range.
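As a rough sketch of that check, the logged values could be compared against the finite set after the simulation. The variable name obsLog and the sample values below are assumptions made up for illustration; depending on the save format chosen in the "To Workspace" block, you may first need to extract the data into a plain numeric vector.

```matlab
% Hypothetical logged observations, e.g. exported by a "To Workspace" block.
% The name obsLog and these sample values are assumptions for illustration.
obsLog = [1 2 9 3.0000001 0 5];
states = 1:9;                        % the finite set from the question

% Flag samples that are not exactly an element of the set.
isValid = ismember(obsLog, states);
badIdx  = find(~isValid);            % positions of the offending samples
badVals = obsLog(badIdx);

% Separate non-integer values from out-of-range values.
nonInteger = badVals(badVals ~= round(badVals));
outOfRange = badVals(badVals < min(states) | badVals > max(states));

fprintf('%d of %d samples are not in the set\n', numel(badIdx), numel(obsLog));
disp(badIdx);
```

If badIdx comes back empty for a full training episode, the observation block itself is probably not the source of the invalid data.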
  1 Comment
Pietro Gualla on 14 Mar 2024
Hi Yatharth,
Thank you very much for your suggestions. Regarding the first one, the observation is already cast to uint8. I had also come up with the second idea myself over the past few days, but unfortunately it did not help.
By the way, I also replaced the Simulink block that generates the observation with the following function, which I find much more readable.
function obs = fcn(error_state, acc_state)
    matrix = [ 1,2,3;
               4,5,6;
               7,8,9 ];
    row = -error_state+2;
    column = acc_state+2;
    if (row >=1 && row<=3 && column >=1 && column<=3)
        obs = int8(matrix(row,column));
    else
        obs = int8(2);
    end
If, for some unexpected reason, the indices fall outside the matrix, the function returns 2 (which is certainly a value in states).
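One way to double-check this claim outside Simulink is to sweep the function over a grid of inputs and confirm that every output belongs to states. The sketch below is only an illustration: it uses a local copy of the function renamed mapObs (a hypothetical name), and it assumes the inputs are integers, with -2 and 2 included to exercise the fallback branch.

```matlab
% Sweep a grid of integer inputs, including values outside the -1..1 range,
% and confirm every output is a member of states = 1:9.
states = 1:9;
vals = -2:2;                         % assumed test range, integers only
out = zeros(numel(vals));
for i = 1:numel(vals)
    for j = 1:numel(vals)
        out(i,j) = double(mapObs(vals(i), vals(j)));
    end
end
allInSet = all(ismember(out(:), states));
fprintf('All outputs in set: %d\n', allInSet);

% Local copy of the function from the post, renamed mapObs for this sketch.
function obs = mapObs(error_state, acc_state)
    matrix = [1,2,3; 4,5,6; 7,8,9];
    row = -error_state + 2;
    column = acc_state + 2;
    if (row >= 1 && row <= 3 && column >= 1 && column <= 3)
        obs = int8(matrix(row, column));
    else
        obs = int8(2);
    end
end
```

If this sweep passes, the mapping itself looks fine for integer inputs, so it may be worth checking the signal that actually reaches the RL Agent block rather than the function alone.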
Still I get the same exact error. I honestly am at my wit's end since it doesn't make any sense to me.
Thank you again for your time and help!