How to solve "Invalid input argument type or size such as observation, reward, isdone or loggedSignals."

Using CarMaker for Simulink, we are constructing a reinforcement learning environment.
The following error occurs:
--------------------------------------------------------------------------------------------------------------------------
" 다음 사용 중 오류가 발생함: rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors
(667번 라인)
Invalid input argument type or size such as observation, reward, isdone or
loggedSignals.
다음 사용 중 오류가 발생함: rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors
(667번 라인)
Unable to compute gradient from representation.
다음 사용 중 오류가 발생함:
rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (667번 라인)
요소 개수는 변경되어서는 안 됩니다. 해당 차원에 대한 적절한 크기를 자동으로 계산하려면
크기 입력값 중 하나로 []을 사용하십시오. "
--------------------------------------------------------------------------------------------------------------------------
Because I am using a plot image as the input to the network, I checked the image input part.
I checked the size of the image that goes into the input and confirmed that it stays at 128 x 128 x 3.
I think there may be a problem with how I created the deep learning network.
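For reference, this is roughly how I checked the frame (a sketch; it assumes one logged frame is available in the workspace as a variable named img, e.g. via a To Workspace block):
% Sanity check on the observation image (img is a hypothetical variable
% holding one logged frame). Both the size and the class must match the
% rlNumericSpec defined in the script below.
disp(size(img))    % expected: 128 128 3
disp(class(img))   % should match obsInfo.DataType ('uint8' in the spec below)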
mdl = 'generic';
open_system(mdl);
agentblk = [mdl '/CarMaker/VehicleControl/CreateBus VhclCtrl/RL Agent'];
% create the observation info
obsInfo = rlNumericSpec([128 128 3],'LowerLimit',-inf*ones(128,128,3),'UpperLimit',inf*ones(128,128,3),'DataType', 'uint8');
obsInfo.Name = 'observations';
obsInfo.Description = 'information on velocity error and ego velocity';
% action Info
actInfo = rlNumericSpec([1 1 1],'LowerLimit',-45,'UpperLimit',45);
actInfo.Name = 'SteeringAng';
% define environment
env = rlSimulinkEnv(mdl,agentblk,obsInfo,actInfo);
% Layer graph variable that will hold the network layers
lgraph = layerGraph();
% Action input path for training the critic
ActPath = [
imageInputLayer([1 1 1],"Name","action","Normalization","none")
fullyConnectedLayer(300,"Name","fc3","BiasLearnRateFactor",0)];
lgraph = addLayers(lgraph,ActPath);
% Image (observation) input path for training the critic
StatePath = [
imageInputLayer([128 128 3],"Name","LidarImage","Normalization","none")
convolution2dLayer([3 3],32,"Name","conv1","Padding","same")
reluLayer("Name","relu1")
maxPooling2dLayer([3 3],"Name","maxpool1","Padding","same","Stride",[2 2])
convolution2dLayer([3 3],32,"Name","conv2","Padding","same")
reluLayer("Name","relu2")
maxPooling2dLayer([3 3],"Name","maxpool2","Padding","same","Stride",[2 2])
fullyConnectedLayer(400,"Name","fc1")
reluLayer("Name","relu3")
fullyConnectedLayer(300,"Name","fc2")];
lgraph = addLayers(lgraph,StatePath);
tempLayers = [
additionLayer(2,"Name","add")
reluLayer("Name","relu4")
fullyConnectedLayer(1,"Name","fc4")];
lgraph = addLayers(lgraph,tempLayers);
lgraph = connectLayers(lgraph,"fc3","add/in2");
lgraph = connectLayers(lgraph,"fc2","add/in1");
% Assemble paths
criticNetwork = lgraph;
% Create critic representation
criticOptions = rlRepresentationOptions('LearnRate',1e-03,'GradientThreshold',1);
criticOptions.UseDevice = 'gpu';
critic = rlQValueRepresentation(criticNetwork,obsInfo,actInfo,...
'Observation',{'LidarImage'},'Action',{'action'},criticOptions);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
layers = [
imageInputLayer([128 128 3],"Name","LidarImage","Normalization","none")
convolution2dLayer([3 3],32,"Name","conv1","Padding","same")
reluLayer("Name","relu1")
maxPooling2dLayer([3 3],"Name","maxpool1","Padding","same","Stride",[2 2])
convolution2dLayer([3 3],32,"Name","conv2","Padding","same")
reluLayer("Name","relu2")
maxPooling2dLayer([3 3],"Name","maxpool2","Padding","same","Stride",[2 2])
fullyConnectedLayer(400,"Name","fc1")
reluLayer("Name","relu3")
fullyConnectedLayer(300,"Name","fc2")
reluLayer("Name","relu4")
fullyConnectedLayer(1,"Name","fc4")
tanhLayer("Name","tanh")
scalingLayer("Name","scale1","Scale",actInfo.UpperLimit)];
actorNetwork = layerGraph(layers);
actorOptions = rlRepresentationOptions('LearnRate',1e-04,'GradientThreshold',1);
actorOptions.UseDevice = 'gpu';
actor = rlDeterministicActorRepresentation(actorNetwork,obsInfo,actInfo,'Observation',{'LidarImage'},'Action',{'scale1'},actorOptions);
% rlDDPGAgentOptions Options
agentOptions = rlDDPGAgentOptions(...
'SampleTime',0.01,...
'TargetSmoothFactor',1e-3,...
'ExperienceBufferLength',1e6,...
'DiscountFactor',0.99,...
'MiniBatchSize',128);
agentOptions.NoiseOptions.Variance = 0.6;
agentOptions.NoiseOptions.VarianceDecayRate = 1e-6;
agent = rlDDPGAgent(actor,critic,agentOptions);
% Train Agent
maxepisodes = 5000;
maxsteps = 400;
trainingOptions = rlTrainingOptions(...
'MaxEpisodes',maxepisodes,...
'MaxStepsPerEpisode',maxsteps,...
'Plots','training-progress',...
'StopTrainingCriteria','AverageReward',...
'StopTrainingValue',1000);
% doTraining = true;
% if doTraining
% % Train the agent.
% trainingStats = train(agent,env,trainingOptions);
% else
% % Load pretrained agent for the example.
% load('SimplePendulumWithImageDDPG.mat','agent')
% end
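One additional check that might surface the mismatch before training (a sketch; validateEnvironment is part of Reinforcement Learning Toolbox and briefly simulates the model against the specs):
% Optional sanity check (sketch): validateEnvironment runs a short
% simulation and errors if the observation, reward, or isdone signals
% do not match the obsInfo/actInfo specs defined above.
validateEnvironment(env)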

Answers (1)

Angelo Yeo on 21 Apr 2023
Since the Simulink model is not attached, I cannot reproduce the problem, but based on the error messages, I would consider the following points (a minimal sketch for item 1 follows this list):
  1. Check whether the problem goes away when the "UseDevice" option is changed from "gpu" to "cpu".
  2. Check whether the data type of the reward given to the RL agent changes somewhere along the way.
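A minimal sketch for item 1, reusing the variable names from the script in the question:
% Force both representations onto the CPU and recreate the agent
% (all names below come from the question's script).
criticOptions.UseDevice = 'cpu';
actorOptions.UseDevice = 'cpu';
critic = rlQValueRepresentation(criticNetwork,obsInfo,actInfo,'Observation',{'LidarImage'},'Action',{'action'},criticOptions);
actor = rlDeterministicActorRepresentation(actorNetwork,obsInfo,actInfo,'Observation',{'LidarImage'},'Action',{'scale1'},actorOptions);
agent = rlDDPGAgent(actor,critic,agentOptions);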
  1 Comment
jihun Kim on 22 Apr 2023
  1. I changed from gpu to cpu, and the same error still occurs.
  2. The data type of the reward does not seem to be the problem.
I have attached the Simulink model; could you check it?
