getValue function produces "Not enough input arguments" error
I am trying to create a Minimax-Q agent by modifying the rlQAgent implementation found in the MATLAB toolbox folder. The learn function, copied directly from rlQAgent, is below:
function action = learn(this, exp)
    % Learn from the current experience, return an action with exploration.
    % exp = {state, action, reward, nextstate, isdone}
    Observation = exp{1};
    Action = exp{2};
    Reward = exp{3};
    NextObservation = exp{4};
    Done = exp{5};

    if Done == 1
        % For the final step, use only the immediate reward,
        % since there is no next state.
        QTargetEstimate = Reward;
    else
        TargetQValue = getMaxQValue(this.Critic, NextObservation);
        % x = getValue(this.Critic, NextObservation);
        QTargetEstimate = Reward + this.AgentOptions.DiscountFactor * TargetQValue;
    end

    % Update the critic with the computed target.
    CriticGradient = gradient(this.Critic, 'loss-parameters', ...
        [Observation, Action], QTargetEstimate);
    this.Critic = optimize(this.Critic, CriticGradient);

    % Update the exploration model.
    this.ExplorationModel = update(this.ExplorationModel);

    % Compute the next action from the current policy.
    action = getActionWithExploration(this, NextObservation);
end
I tried to change the line

TargetQValue = getMaxQValue(this.Critic, NextObservation);

into

x = getValue(this.Critic, NextObservation);

I know that the getValue function is supposed to return a matrix of values. Therefore, to prevent further errors, I put a breakpoint on the next line:

QTargetEstimate = Reward + this.AgentOptions.DiscountFactor * TargetQValue;

The program terminates prematurely with a "Not enough input arguments." error before the breakpoint is reached, so it never returns an "x" matrix. Can anyone tell me what the problem is? Thanks.
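One possible cause (an assumption on my part, not confirmed in this thread): in the Reinforcement Learning Toolbox, getValue called on a Q-value critic with a *single* output requires both the observation and the action, while only a *multi-output* (one output per discrete action) representation accepts the observation alone. A rough sketch of the two call patterns, meant to be tried at a breakpoint inside learn, where this, NextObservation, and Action are in scope:

% Debugging sketch only; assumes this.Critic is an rlQValueRepresentation
% as in the R2020a-era toolbox API.
class(this.Critic)   % confirm which representation type the critic is

% Multi-output (discrete-action) critic: observation alone suffices,
% and getValue returns one Q-value per possible action.
x = getValue(this.Critic, NextObservation);

% Single-output critic: getValue needs the observation AND an action;
% calling it with the observation alone would fail with
% "Not enough input arguments."
% x = getValue(this.Critic, NextObservation, Action);

If class(this.Critic) reports a single-output representation, that would explain why getValue with two arguments fails while getMaxQValue (which enumerates the actions internally) works.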
Edit:
Error using rl.agent.AbstractPolicy/step (line 116)
Invalid input argument type or size such as observation, reward, isdone or loggedSignals.
Error in rl.env.MATLABEnvironment/simLoop (line 241)
action = step(policy,observation,reward,isdone);
Error in rl.env.MATLABEnvironment/simWithPolicyImpl (line 106)
[expcell{simCount},epinfo,siminfos{simCount}] = simLoop(env,policy,opts,simCount,usePCT);
Error in rl.env.AbstractEnv/simWithPolicy (line 70)
[experiences,varargout{1:(nargout-1)}] = simWithPolicyImpl(this,policy,opts,varargin{:});
Error in rl.task.SeriesTrainTask/runImpl (line 33)
[varargout{1},varargout{2}] = simWithPolicy(this.Env,this.Agent,simOpts);
Error in rl.task.Task/run (line 21)
[varargout{1:nargout}] = runImpl(this);
Error in rl.task.TaskSpec/internal_run (line 159)
[varargout{1:nargout}] = run(task);
Error in rl.task.TaskSpec/runDirect (line 163)
[this.Outputs{1:getNumOutputs(this)}] = internal_run(this);
Error in rl.task.TaskSpec/runScalarTask (line 187)
runDirect(this);
Error in rl.task.TaskSpec/run (line 69)
runScalarTask(task);
Error in rl.train.SeriesTrainer/run (line 24)
run(seriestaskspec);
Error in rl.train.TrainingManager/train (line 291)
run(trainer);
Error in rl.train.TrainingManager/run (line 160)
train(this);
Error in rl.agent.AbstractAgent/train (line 54)
TrainingStatistics = run(trainMgr);
Error in gridWorld (line 34)
trainingStats = train(qAgent,env,trainOpts);
Caused by:
Not enough input arguments.
This is the whole error log. Please note that in rlQAgent.m I changed only the line proposed above; everything else is exactly the same.
6 Comments
Walter Roberson
on 19 Jul 2020
What shows up for
which getValue(this.Critic)
Cengiz Aksakal
on 19 Jul 2020
Walter Roberson
on 19 Jul 2020
Edited: Walter Roberson
on 19 Jul 2020
What shows up if you run the following commands?
class(this.Critic)
which getValue(this.Critic, NextObservation)
which getValue(this.Critic, NextObservation, 1)
getValue(this.Critic, NextObservation, 1)
Note that these are not changes to your code: they are debugging commands to be executed at a breakpoint inside your code, at the point where you would make the getValue() call.
Cengiz Aksakal
on 20 Jul 2020
Walter Roberson
on 20 Jul 2020
What code sequence and data is needed to trigger the error?
(Though I do not think I have the necessary toolboxes to test with.)
Cengiz Aksakal
on 20 Jul 2020
Edited: Cengiz Aksakal
on 20 Jul 2020
Accepted Answer
More Answers (0)