When I use the TCN+DDPG algorithm, an error is reported at trainingStats = train(agent,env,trainOpts): "There was an error executing the ProcessExperienceFcn."

The complete code is below. The plain DDPG algorithm ran without problems, but after I added a causal convolution layer to the network, the error above was reported.
The reason given:
Variables of this type do not support indexing with curly braces.
I don't know what went wrong or how to fix it.
%% Main program
close all
clear
clc
%% Data set
[DataSet, Para, Case_data, Bid_data, Res_data] = Datainitial();
%% Action definition
% create action
% action description
% a = [bid_t]
action_num = Para.res_num * 2 - 2;
ActionInfo = rlNumericSpec([action_num,1]); %
% ActionInfo = rlFiniteSetSpec(action_num);
ActionInfo.LowerLimit = -1 * ones(action_num,1);
ActionInfo.UpperLimit = 1 * ones(action_num,1);
ActionInfo.Name = 'Market Bid Action';
ActionInfo.Description = 'details are in ParaSet';
%% Observation definition
% create state
% state description
% s = [time, demand t-1, demand t, price t-1, generation t-1]
state_num = 6;
ObservationInfo = rlNumericSpec([state_num 1]);
% ObservationInfo.LowerLimit = -ones(state_num,1);
% ObservationInfo.UpperLimit = ones(state_num,1);
ObservationInfo.Name = 'Market States';
ObservationInfo.Description = 'details are in ParaSet';
%% Reset and step function creation
StepHandle = @(Action,SavedData)myStepFunction(Action,SavedData,Para,Case_data, Bid_data, Res_data);
ResetHandle = @()myResetFunction(DataSet, Para);
env = rlFunctionEnv(ObservationInfo,ActionInfo,StepHandle,ResetHandle);
rng(0)
%% CRITIC
criticLayerSizes = [200 100];
actorLayerSizes = [100 100];
numObs = state_num;
% Define path for the state input
statePath = [
sequenceInputLayer(prod(ObservationInfo.Dimension),Name="NetObsInLayer")
%featureInputLayer(prod(ObservationInfo.Dimension),Name="NetObsInLayer")
fullyConnectedLayer(128)
reluLayer
fullyConnectedLayer(200,Name="sPathOut")];
% Define path for the action input
actionPath = [
sequenceInputLayer(prod(ActionInfo.Dimension),Name="NetActInLayer")
%featureInputLayer(prod(ActionInfo.Dimension),Name="NetActInLayer")
fullyConnectedLayer(200,Name="aPathOut",BiasLearnRateFactor=0)];
% Define path for the critic output (value)
commonPath = [
additionLayer(2,Name="add")
convolution1dLayer(3, 16, padding = "casual") % 1-D convolution layer
layerNormalizationLayer % layer normalization
convolution1dLayer(16, 16, padding = "casual")
layerNormalizationLayer % layer normalization
reluLayer
%reluLayer
fullyConnectedLayer(1,Name="CriticOutput")];
% Create layerGraph object and add layers
criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = addLayers(criticNetwork,commonPath);
% Connect paths and convert to dlnetwork object
criticNetwork = connectLayers(criticNetwork,"sPathOut","add/in1");
criticNetwork = connectLayers(criticNetwork,"aPathOut","add/in2");
criticNetwork = dlnetwork(criticNetwork);
% criticOpts = rlOptimizerOptions('LearnRate',1e-4);
% critic = rlValueFunction(criticNetwork,ObservationInfo);
critic = rlQValueFunction(criticNetwork, ...
ObservationInfo,ActionInfo,...
ObservationInputNames="NetObsInLayer", ...
ActionInputNames="NetActInLayer");
%% ACTOR
numAct = action_num;
actorNet = [
sequenceInputLayer(prod(ObservationInfo.Dimension))
%featureInputLayer(prod(ObservationInfo.Dimension))
convolution1dLayer(3, 16, padding = "casual") % 1-D convolution layer
layerNormalizationLayer % layer normalization
convolution1dLayer(16, 16, padding = "casual")
layerNormalizationLayer % layer normalization
reluLayer
fullyConnectedLayer(300)
reluLayer
fullyConnectedLayer(200)
reluLayer
fullyConnectedLayer(prod(ActionInfo.Dimension))
tanhLayer
scalingLayer(Scale=max(ActionInfo.UpperLimit))];
% Plot network
% plot(actorNet)
actorNetwork = dlnetwork(actorNet);
% actorOpts = rlOptimizerOptions('LearnRate',1e-4);
% actor = rlContinuousGaussianActor(actorNetwork,ObservationInfo,ActionInfo,...
% ActionMeanOutputNames="meanPathOut",...
% ActionStandardDeviationOutputNames="stdPathOut",...
% ObservationInputNames="comPathIn");
actor = rlContinuousDeterministicActor(actorNetwork,ObservationInfo,ActionInfo);
% act = getAction(actor,{rand(ObservationInfo.Dimension)}); % action check
%% Agent
criticOpts = rlOptimizerOptions(LearnRate=1e-03,GradientThreshold=1);
actorOpts = rlOptimizerOptions(LearnRate=5e-04,GradientThreshold=1);
agentOpts = rlDDPGAgentOptions(...
'ActorOptimizerOptions',actorOpts,...
'CriticOptimizerOptions',criticOpts,...
'ExperienceBufferLength', 3000,...
'SampleTime',1,...
'DiscountFactor',0.95,...
"MiniBatchSize",200 ...
);
agentOpts.SequenceLength = 5;
agentOpts.NoiseOptions.Variance = 0.4; % 0.4
agentOpts.NoiseOptions.VarianceDecayRate = 1e-5;
agent = rlDDPGAgent(actor,critic,agentOpts);
trainOpts = rlTrainingOptions(...
'MaxEpisodes',2000,...
'MaxStepsPerEpisode',300,...
'Plots','training-progress',...
'StopTrainingCriteria','AverageReward',...
'StopTrainingValue',5000,...
'ScoreAveragingWindowLength',30,...
'SaveAgentCriteria',"EpisodeReward",...
'SaveAgentValue',2500);
trainOpts.UseParallel = false;
trainOpts.ParallelizationOptions.StepsUntilDataIsSent = 30;
trainOpts.ParallelizationOptions.Mode = 'async';
%% train
doTraining = 1;
if doTraining
% Train the agent.
trainingStats = train(agent,env,trainOpts);
day = char(datetime, 'MMdd-HHmm');
file = ['TD3agent-realtime-response-SCED', day];
tips = 'both taker and maker are included in the market';
save(file,'agent','tips',"trainingStats");
else
% Load the pretrained agent for the example.
% load('Agent8803.mat','saved_agent');
load('agent-Q1-1009-1.mat','agent');
end
%% Simulation Reset and step function creation
% DataSet = sequence(2:end,:);
% Para.test = 1;
StepHandleT = @(Action,SavedData)myStepFunction(Action,SavedData,Para,Case_data, Bid_data, Res_data);
ResetHandleT = @()myResetFunction(DataSet,Para);
% Simulation environment
envT = rlFunctionEnv(ObservationInfo,ActionInfo,StepHandleT,ResetHandleT);
% Simulation on Test data
simOpts = rlSimulationOptions('MaxSteps',30);
for i = 1
experience = sim(envT,agent,simOpts);
ActionData(:,i) = experience.Action.MarketBidAction.Data(:);
end
% ActionData = reshape(ActionData,2,[]);

Answers (1)

Aditya on 20 Mar 2024
The error message you're encountering, "Variables of this type do not support indexing with curly braces," is likely related to how options are specified in your network layers, in particular the padding of the convolutional layers. In MATLAB, layer options are typically passed as name-value pair arguments rather than with an equals sign (=).
Where you have written padding = "casual", MATLAB may be misinterpreting the expression, which surfaces as the unsupported curly-brace indexing error. Specify the padding option as a name-value pair instead, without the equals sign. Also note that the correct option value in MATLAB's Deep Learning Toolbox is "causal", not "casual".
Here's the corrected portion of your code for defining convolutional layers with causal padding:
commonPath = [
additionLayer(2,Name="add")
convolution1dLayer(3, 16, 'Padding', "causal") % corrected padding specification
layerNormalizationLayer
convolution1dLayer(16, 16, 'Padding', "causal") % corrected padding specification
layerNormalizationLayer
reluLayer
fullyConnectedLayer(1,Name="CriticOutput")];
And similarly, for the actor network:
actorNet = [
sequenceInputLayer(prod(ObservationInfo.Dimension))
convolution1dLayer(3, 16, 'Padding', "causal") % corrected padding specification
layerNormalizationLayer
convolution1dLayer(16, 16, 'Padding', "causal") % corrected padding specification
layerNormalizationLayer
reluLayer
fullyConnectedLayer(300)
reluLayer
fullyConnectedLayer(200)
reluLayer
fullyConnectedLayer(prod(ActionInfo.Dimension))
tanhLayer
scalingLayer('Scale',max(ActionInfo.UpperLimit))]; % Corrected scale specification
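As a quick sanity check (a minimal sketch reusing the actorNet, ObservationInfo, and ActionInfo variables already defined in your script), you could rebuild the actor with the corrected layers and confirm that it compiles and returns an action before starting training:
% Quick check (sketch): rebuild the actor with the corrected padding and
% confirm that the network compiles and produces an action. Variable names
% are the ones already defined above (actorNet, ObservationInfo, ActionInfo).
actorNetwork = dlnetwork(actorNet);   % fails here if a layer option is invalid
% analyzeNetwork(actorNetwork)        % optional: inspect layer sizes interactively
actor = rlContinuousDeterministicActor(actorNetwork,ObservationInfo,ActionInfo);
act = getAction(actor,{rand(ObservationInfo.Dimension)}) % should return an action within [-1,1]
If dlnetwork still errors at this point, the problem is in the layer definitions themselves rather than in the training loop.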
