Problems encountered in the MATLAB Reinforcement Learning Toolbox

10 views (last 30 days)
Hi, I ran into some problems while building a model with the Reinforcement Learning Toolbox. The error was as follows:
Warning: Error occurred while executing the listener callback for event EpisodeFinished defined for class rl.env.SimulinkEnvWithAgent:
Error using rl.policy.AbstractPolicy/getAction (line 258)
Invalid observation type or size.
Error in rl.agent.rlDDPGAgent/evaluateQ0Impl (line 141)
action = getAction(this, observation);
Error in rl.agent.AbstractAgent/evaluateQ0 (line 275)
q0 = evaluateQ0Impl(this,observation);
Error in rl.train.TrainingManager/update (line 134)
q0 = evaluateQ0(this.Agents(idx),epinfo(idx).InitialObservation);
Error in rl.train.TrainingManager>@(info)update(this,info) (line 437)
trainer.FinishedEpisodeFcn = @(info) update(this,info);
Error in rl.train.Trainer/notifyEpisodeFinishedAndCheckStopTrain (line 56)
stopTraining = this.FinishedEpisodeFcn(info);
Error in rl.train.SeriesTrainer>iUpdateEpisodeFinished (line 31)
notifyEpisodeFinishedAndCheckStopTrain(this,ed.Data);
Error in rl.train.SeriesTrainer>@(src,ed)iUpdateEpisodeFinished(this,ed) (line 17)
@(src,ed) iUpdateEpisodeFinished(this,ed));
Error in rl.env.AbstractEnv/notifyEpisodeFinished (line 324)
notify(this,'EpisodeFinished',ed);
Error in rl.env.SimulinkEnvWithAgent/executeSimsWrapper/nestedSimFinishedBC (line 222)
notifyEpisodeFinished(this,...
Error in rl.env.SimulinkEnvWithAgent>@(src,ed)nestedSimFinishedBC(ed) (line 232)
simlist(1) = event.listener(this.SimMgr,'SimulationFinished' ,@(src,ed) nestedSimFinishedBC(ed));
Error in Simulink.SimulationManager/handleSimulationOutputAvailable
Error in Simulink.SimulationManager>@(varargin)obj.handleSimulationOutputAvailable(varargin{:})
Error in MultiSim.internal.SimulationRunnerSerial/executeImplSingle
Error in MultiSim.internal.SimulationRunnerSerial/executeImpl
Error in Simulink.SimulationManager/executeSims
Error in Simulink.SimulationManagerEngine/executeSims
Error in rl.env.SimulinkEnvWithAgent/executeSimsWrapper (line 244)
executeSims(this.SimEngine,simfh,in);
Error in rl.env.SimulinkEnvWithAgent/simWrapper (line 267)
simouts = executeSimsWrapper(this,in,simfh,simouts,opts);
Error in rl.env.SimulinkEnvWithAgent/simWithPolicyImpl (line 424)
simouts = simWrapper(env,policy,simData,in,opts);
Error in rl.env.AbstractEnv/simWithPolicy (line 82)
[experiences,varargout{1:(nargout-1)}] = simWithPolicyImpl(this,policy,opts,varargin{:});
Error in rl.task.SeriesTrainTask/runImpl (line 33)
[varargout{1},varargout{2}] = simWithPolicy(this.Env,this.Agent,simOpts);
Error in rl.task.Task/run (line 21)
[varargout{1:nargout}] = runImpl(this);
Error in rl.task.TaskSpec/internal_run (line 166)
[varargout{1:nargout}] = run(task);
Error in rl.task.TaskSpec/runDirect (line 170)
[this.Outputs{1:getNumOutputs(this)}] = internal_run(this);
Error in rl.task.TaskSpec/runScalarTask (line 194)
runDirect(this);
Error in rl.task.TaskSpec/run (line 69)
runScalarTask(task);
Error in rl.train.SeriesTrainer/run (line 24)
run(seriestaskspec);
Error in rl.train.TrainingManager/train (line 421)
run(trainer);
Error in rl.train.TrainingManager/run (line 211)
train(this);
Error in rl.agent.AbstractAgent/train (line 78)
TrainingStatistics = run(trainMgr);
Error in simulink_pmsm_ddpg (line 75)
trainingStats = train(agent,env,trainingOptions);
Caused by:
Error using rl.representation.rlAbstractRepresentation/validateInputData (line 525)
Input data dimensions must match the dimensions specified in the corresponding observation and action info specifications.
> In rl.env/AbstractEnv/notifyEpisodeFinished (line 324)
In rl.env.SimulinkEnvWithAgent.executeSimsWrapper/nestedSimFinishedBC (line 222)
In rl.env.SimulinkEnvWithAgent>@(src,ed)nestedSimFinishedBC(ed) (line 232)
In Simulink/SimulationManager/handleSimulationOutputAvailable
In Simulink.SimulationManager>@(varargin)obj.handleSimulationOutputAvailable(varargin{:})
In MultiSim.internal/SimulationRunnerSerial/executeImplSingle
In MultiSim.internal/SimulationRunnerSerial/executeImpl
In Simulink/SimulationManager/executeSims
In Simulink/SimulationManagerEngine/executeSims
In rl.env/SimulinkEnvWithAgent/executeSimsWrapper (line 244)
In rl.env/SimulinkEnvWithAgent/simWrapper (line 267)
In rl.env/SimulinkEnvWithAgent/simWithPolicyImpl (line 424)
In rl.env/AbstractEnv/simWithPolicy (line 82)
In rl.task/SeriesTrainTask/runImpl (line 33)
In rl.task/Task/run (line 21)
In rl.task/TaskSpec/internal_run (line 166)
In rl.task/TaskSpec/runDirect (line 170)
In rl.task/TaskSpec/runScalarTask (line 194)
In rl.task/TaskSpec/run (line 69)
In rl.train/SeriesTrainer/run (line 24)
In rl.train/TrainingManager/train (line 421)
In rl.train/TrainingManager/run (line 211)
In rl.agent.AbstractAgent/train (line 78)
In simulink_pmsm_ddpg (line 75)
Error using rl.env.AbstractEnv/simWithPolicy (line 82)
An error occurred while simulating "rlpmsmSimscapeModel" with the agent "agent".
Error in rl.task.SeriesTrainTask/runImpl (line 33)
[varargout{1},varargout{2}] = simWithPolicy(this.Env,this.Agent,simOpts);
Error in rl.task.Task/run (line 21)
[varargout{1:nargout}] = runImpl(this);
Error in rl.task.TaskSpec/internal_run (line 166)
[varargout{1:nargout}] = run(task);
Error in rl.task.TaskSpec/runDirect (line 170)
[this.Outputs{1:getNumOutputs(this)}] = internal_run(this);
Error in rl.task.TaskSpec/runScalarTask (line 194)
runDirect(this);
Error in rl.task.TaskSpec/run (line 69)
runScalarTask(task);
Error in rl.train.SeriesTrainer/run (line 24)
run(seriestaskspec);
Error in rl.train.TrainingManager/train (line 421)
run(trainer);
Error in rl.train.TrainingManager/run (line 211)
train(this);
Error in rl.agent.AbstractAgent/train (line 78)
TrainingStatistics = run(trainMgr);
Error in simulink_pmsm_ddpg (line 75)
trainingStats = train(agent,env,trainingOptions);
Caused by:
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 681)
Invalid observation type or size.
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 681)
Input data dimensions must match the dimensions specified in the corresponding observation and action info specifications.
I don't know what caused this or how to fix it. Please help.
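
For context: both messages ("Invalid observation type or size." and "Input data dimensions must match...") normally mean that the observation signal wired into the RL Agent block does not have the dimensions declared in the environment's observation specification. A minimal way to reproduce the check outside of training, assuming the env variable from the script posted in the answer below and a similar-era Reinforcement Learning Toolbox, is:

obsInfo = getObservationInfo(env);   % specification the agent expects
actInfo = getActionInfo(env);
disp(obsInfo.Dimension)              % e.g. [3 1]; compare with the width of the
disp(actInfo.Dimension)              % observation/action signals in the model
validateEnvironment(env)             % runs a short simulation and errors on any mismatch

If validateEnvironment raises the same dimension error, the observation vector built in the model (typically a Mux or Bus feeding the RL Agent block) needs to be made consistent with obsInfo.Dimension, or vice versa.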

Answers (3)

ying Zhao on 24 Mar 2021
This is my code:
%% Create the environment
clc;
mdl = 'rlpmsmSimscapeModel';
open_system(mdl)
env = rlSimulinkEnv('rlpmsmSimscapeModel','rlpmsmSimscapeModel/RL Agent');
obsInfo = getObservationInfo(env);
numObservations = obsInfo.Dimension(1);
actInfo = getActionInfo(env);
numActions = actInfo.Dimension(1);
%%
Ts = 0.02;
Tf = 1;
rng(0)
%% Initialize the agent
statePath = [
    imageInputLayer([numObservations 1 1],'Normalization','none','Name','State')
    fullyConnectedLayer(50,'Name','CriticStateFC1')
    reluLayer('Name','CriticRelu1')
    fullyConnectedLayer(25,'Name','CriticStateFC2')];
actionPath = [
    imageInputLayer([numActions 1 1],'Normalization','none','Name','Action')
    fullyConnectedLayer(25,'Name','CriticActionFC1','BiasLearnRateFactor',0)];
commonPath = [
    additionLayer(2,'Name','add')
    reluLayer('Name','CriticCommonRelu')
    fullyConnectedLayer(1,'Name','CriticOutput')];
criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = addLayers(criticNetwork,commonPath);
criticNetwork = connectLayers(criticNetwork,'CriticStateFC2','add/in1');
criticNetwork = connectLayers(criticNetwork,'CriticActionFC1','add/in2');
figure
plot(criticNetwork)
criticOptions = rlRepresentationOptions('LearnRate',1e-03,'GradientThreshold',1);
critic = rlRepresentation(criticNetwork,obsInfo,actInfo,...
    'Observation',{'State'},'Action',{'Action'},criticOptions);
actorNetwork = [
    imageInputLayer([numObservations 1 1],'Normalization','none','Name','State')
    fullyConnectedLayer(3,'Name','actorFC')
    tanhLayer('Name','actorTanh')
    fullyConnectedLayer(numActions,'Name','Action')];
actorOptions = rlRepresentationOptions('LearnRate',2e-04,'GradientThreshold',1);
actor = rlDeterministicActorRepresentation(actorNetwork,obsInfo,actInfo,...
    'Observation',{'State'},'Action',{'Action'});
agentOptions = rlDDPGAgentOptions(...
    'SampleTime',Ts,...
    'TargetSmoothFactor',1e-3,...
    'DiscountFactor',1.0, ...
    'ExperienceBufferLength',1e6,...
    'MiniBatchSize',64);
agentOptions.NoiseOptions.Variance = 0.3;
agentOptions.NoiseOptions.VarianceDecayRate = 1e-5;
agent = rlDDPGAgent(actor,critic,agentOptions);
%% Set the training options
maxepisodes = 2000;
maxsteps = ceil(Tf/Ts);
trainingOptions = rlTrainingOptions(...
    'MaxEpisodes',maxepisodes,...
    'MaxStepsPerEpisode',maxsteps,...
    'ScoreAveragingWindowLength',5,...
    'Verbose',false,...
    'Plots','training-progress',...
    'StopTrainingCriteria','AverageReward',...
    'StopTrainingValue',800,...
    'SaveAgentCriteria','EpisodeReward',...
    'SaveAgentValue',-400);
%% Parallel training settings
trainOpts.UseParallel = true;
trainOpts.ParallelizationOptions.Mode = "async";
trainOpts.ParallelizationOptions.DataToSendFromWorkers = "Gradients";
trainOpts.ParallelizationOptions.StepsUntilDataIsSent = -1;
%% Train
trainingStats = train(agent,env,trainingOptions);
%% Show the results
simOptions = rlSimulationOptions('MaxSteps',500);
experience = sim(env,agent,simOptions);
totalReward = sum(experience.Reward);
% bdclose(mdl)
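
Two things stand out in this script for anyone comparing against it. First, the parallel settings at the end are assigned to trainOpts, but train is called with trainingOptions, so they never take effect (presumably they were meant to be set on trainingOptions). Second, the network input layers are sized from obsInfo.Dimension(1) and actInfo.Dimension(1), so the error above is consistent with the model emitting an observation vector of a different size than the specs report. One way to rule that out is to declare the specs explicitly and pass them to rlSimulinkEnv; this is only a sketch, and the sizes below are placeholders rather than the real signal widths of rlpmsmSimscapeModel:

% Sketch only: replace [3 1] and [1 1] with the actual widths of the
% observation and action signals wired into the RL Agent block.
obsInfo = rlNumericSpec([3 1]);
obsInfo.Name = 'observations';
actInfo = rlNumericSpec([1 1]);
actInfo.Name = 'action';
env = rlSimulinkEnv(mdl,'rlpmsmSimscapeModel/RL Agent',obsInfo,actInfo);

With explicit specs, any remaining mismatch surfaces as soon as the model runs rather than in the EpisodeFinished callback at the end of the first episode.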
(Running this script produces the same warning and error trace shown in the question above.)

张 冠宇 on 13 Nov 2021
I ran into this problem too. Have you solved it yet?
4 Comments
邓龙京 on 26 Apr 2024
Hello, have you solved it yet? I ran into this problem too.
成奥 on 21 May 2024
Hello, have you solved it? I ran into this problem too.
WeChat: 13220788757



jiangang on 22 Jul 2022
I ran into this problem too. Could someone please help solve it?
