How to solve "Invalid input argument type or size such as observation, reward, isdone or loggedSignals."

We are constructing a reinforcement learning environment using CarMaker for Simulink, and the following error occurs:
--------------------------------------------------------------------------------------------------------------------------
" 다음 사용 중 오류가 발생함: rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors
(667번 라인)
Invalid input argument type or size such as observation, reward, isdone or
loggedSignals.
다음 사용 중 오류가 발생함: rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors
(667번 라인)
Unable to compute gradient from representation.
다음 사용 중 오류가 발생함:
rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (667번 라인)
요소 개수는 변경되어서는 안 됩니다. 해당 차원에 대한 적절한 크기를 자동으로 계산하려면
크기 입력값 중 하나로 []을 사용하십시오. "
--------------------------------------------------------------------------------------------------------------------------
Because I am using a plot image as the input to the network, I checked the image input part first. The size of the image going into the input stays at 128 x 128 x 3, so I suspect there is a problem with how the deep learning network is created.
mdl = 'generic';
open_system(mdl);
agentblk = [mdl '/CarMaker/VehicleControl/CreateBus VhclCtrl/RL Agent'];
% create the observation info
obsInfo = rlNumericSpec([128 128 3],'LowerLimit',-inf*ones(128,128,3),'UpperLimit',inf*ones(128,128,3),'DataType', 'uint8');
obsInfo.Name = 'observations';
obsInfo.Description = 'information on velocity error and ego velocity';
% action Info
actInfo = rlNumericSpec([1 1 1],'LowerLimit',-45,'UpperLimit',45);
actInfo.Name = 'SteeringAng';
% define environment
env = rlSimulinkEnv(mdl,agentblk,obsInfo,actInfo);
% Layer graph variable that will hold the layers of the network
lgraph = layerGraph();
% Action input path for training the critic
ActPath = [
imageInputLayer([1 1 1],"Name","action","Normalization","none")
fullyConnectedLayer(300,"Name","fc3","BiasLearnRateFactor",0)];
lgraph = addLayers(lgraph,ActPath);
% Image input path for training the critic
StatePath = [
imageInputLayer([128 128 3],"Name","LidarImage","Normalization","none")
convolution2dLayer([3 3],32,"Name","conv1","Padding","same")
reluLayer("Name","relu1")
maxPooling2dLayer([3 3],"Name","maxpool1","Padding","same","Stride",[2 2])
convolution2dLayer([3 3],32,"Name","conv2","Padding","same")
reluLayer("Name","relu2")
maxPooling2dLayer([3 3],"Name","maxpool2","Padding","same","Stride",[2 2])
fullyConnectedLayer(400,"Name","fc1")
reluLayer("Name","relu3")
fullyConnectedLayer(300,"Name","fc2")];
lgraph = addLayers(lgraph,StatePath);
tempLayers = [
additionLayer(2,"Name","add")
reluLayer("Name","relu4")
fullyConnectedLayer(1,"Name","fc4")];
lgraph = addLayers(lgraph,tempLayers);
lgraph = connectLayers(lgraph,"fc3","add/in2");
lgraph = connectLayers(lgraph,"fc2","add/in1");
% Assemble paths
criticNetwork = lgraph;
% Create critic representation
criticOptions = rlRepresentationOptions('LearnRate',1e-03,'GradientThreshold',1);
criticOptions.UseDevice = 'gpu';
critic = rlQValueRepresentation(criticNetwork,obsInfo,actInfo,...
'Observation',{'LidarImage'},'Action',{'action'},criticOptions);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
layers = [
imageInputLayer([128 128 3],"Name","LidarImage","Normalization","none")
convolution2dLayer([3 3],32,"Name","conv1","Padding","same")
reluLayer("Name","relu1")
maxPooling2dLayer([3 3],"Name","maxpool1","Padding","same","Stride",[2 2])
convolution2dLayer([3 3],32,"Name","conv2","Padding","same")
reluLayer("Name","relu2")
maxPooling2dLayer([3 3],"Name","maxpool2","Padding","same","Stride",[2 2])
fullyConnectedLayer(400,"Name","fc1")
reluLayer("Name","relu3")
fullyConnectedLayer(300,"Name","fc2")
reluLayer("Name","relu4")
fullyConnectedLayer(1,"Name","fc4")
tanhLayer("Name","tanh")
scalingLayer("Name","scale1","Scale",actInfo.UpperLimit)];
actorNetwork = layerGraph(layers);
actorOptions = rlRepresentationOptions('LearnRate',1e-04,'GradientThreshold',1);
actorOptions.UseDevice = 'gpu';
actor = rlDeterministicActorRepresentation(actorNetwork,obsInfo,actInfo,'Observation',{'LidarImage'},'Action',{'scale1'},actorOptions);
% rlDDPGAgentOptions Options
agentOptions = rlDDPGAgentOptions(...
'SampleTime',0.01,...
'TargetSmoothFactor',1e-3,...
'ExperienceBufferLength',1e6,...
'DiscountFactor',0.99,...
'MiniBatchSize',128);
agentOptions.NoiseOptions.Variance = 0.6;
agentOptions.NoiseOptions.VarianceDecayRate = 1e-6;
agent = rlDDPGAgent(actor,critic,agentOptions);
% Train Agent
maxepisodes = 5000;
maxsteps = 400;
trainingOptions = rlTrainingOptions(...
'MaxEpisodes',maxepisodes,...
'MaxStepsPerEpisode',maxsteps,...
'Plots','training-progress',...
'StopTrainingCriteria','AverageReward',...
'StopTrainingValue',1000);
% doTraining = true;
% if doTraining
% % Train the agent.
% trainingStats = train(agent,env,trainingOptions);
% else
% % Load pretrained agent for the example.
% load('SimplePendulumWithImageDDPG.mat','agent')
% end
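Before calling train, the environment itself can be checked in isolation. The following is a minimal sketch, not part of the original script, assuming env was created as above and that validateEnvironment is available in the installed Reinforcement Learning Toolbox release:
% Run a brief validation of the Simulink environment against obsInfo/actInfo.
% This errors if the observation, reward, or isdone signals produced by the
% model do not match the specs, which should reproduce the error above
% outside of train.
validateEnvironment(env)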

Answers (1)

Angelo Yeo on 21 Apr 2023
Since the Simulink model is not attached, the problem cannot be reproduced here, but judging from the error messages the following points are worth checking (a rough sketch of both checks follows the list):
  1. Check whether the problem goes away when the "UseDevice" option is changed from "gpu" to "cpu".
  2. Check whether the data type of the reward passed to the RL Agent changes somewhere along the way.
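A minimal sketch of both checks, reusing the variable names from the script above (criticOptions, actorOptions, mdl, agentblk). The reward port index assumes the RL Agent block's default port order (observation, reward, isdone); adjust if the block is configured differently.
% 1) Run the actor/critic representations on the CPU instead of the GPU
%    (recreate the critic and actor after changing these options)
criticOptions.UseDevice = 'cpu';
actorOptions.UseDevice = 'cpu';
% 2) Inspect the compiled data types of the signals entering the RL Agent
%    block; the reward input is expected to be 'double'
feval(mdl,[],[],[],'compile');                    % put the model into compiled state
dt = get_param(agentblk,'CompiledPortDataTypes');
disp(dt.Inport)                                   % {observation, reward, isdone}
feval(mdl,[],[],[],'term');                       % release the compiled state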
  1 Comment
jihun Kim on 22 Apr 2023
  1. I changed from GPU to CPU, and the same error occurs.
  2. The data type of the reward does not seem to be the problem.
I attached the Simulink model; could you check it?
