
When I use the TCN+DDPG algorithm, an error is reported at trainingStats = train(agent,env,trainOpts): "There was an error executing the ProcessExperienceFcn."

The following is the complete code. The plain DDPG algorithm runs without problems, but when I add a causal convolution layer to the network, the error above is reported.
The reason given:
Variables of this type do not support indexing with curly braces.
I don't know what went wrong or how to fix it.
%% Main program
close all
clear
clc
%% Data set
[DataSet, Para, Case_data, Bid_data, Res_data] = Datainitial();
%% Action definition
% create action
% action description
% a = [bid_t]
action_num = Para.res_num * 2 - 2;
ActionInfo = rlNumericSpec([action_num,1]); %
% ActionInfo = rlFiniteSetSpec(action_num);
ActionInfo.LowerLimit = -1 * ones(action_num,1);
ActionInfo.UpperLimit = 1 * ones(action_num,1);
ActionInfo.Name = 'Market Bid Action';
ActionInfo.Description = 'details are in ParaSet';
%% Observation definition
% create state
% state description
% s = [time, demand t-1, demand t, price t-1, generation t-1]
state_num = 6;
ObservationInfo = rlNumericSpec([state_num 1]);
% ObservationInfo.LowerLimit = -ones(state_num,1);
% ObservationInfo.UpperLimit = ones(state_num,1);
ObservationInfo.Name = 'Market States';
ObservationInfo.Description = 'details are in ParaSet';
%% Reset and step function creation
StepHandle = @(Action,SavedData)myStepFunction(Action,SavedData,Para,Case_data, Bid_data, Res_data);
ResetHandle = @()myResetFunction(DataSet, Para);
env = rlFunctionEnv(ObservationInfo,ActionInfo,StepHandle,ResetHandle);
rng(0)
%% CRITIC
criticLayerSizes = [200 100];
actorLayerSizes = [100 100];
numObs = state_num;
% Define path for the state input
statePath = [
sequenceInputLayer(prod(ObservationInfo.Dimension),Name="NetObsInLayer")
%featureInputLayer(prod(ObservationInfo.Dimension),Name="NetObsInLayer")
fullyConnectedLayer(128)
reluLayer
fullyConnectedLayer(200,Name="sPathOut")];
% Define path for the action input
actionPath = [
sequenceInputLayer(prod(ActionInfo.Dimension),Name="NetActInLayer")
%featureInputLayer(prod(ActionInfo.Dimension),Name="NetActInLayer")
fullyConnectedLayer(200,Name="aPathOut",BiasLearnRateFactor=0)];
% Define path for the critic output (value)
commonPath = [
additionLayer(2,Name="add")
convolution1dLayer(3, 16, padding = "casual") % 1-D convolution layer
layerNormalizationLayer % layer normalization
convolution1dLayer(16, 16, padding = "casual")
layerNormalizationLayer % layer normalization
reluLayer
%reluLayer
fullyConnectedLayer(1,Name="CriticOutput")];
% Create layerGraph object and add layers
criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = addLayers(criticNetwork,commonPath);
% Connect paths and convert to dlnetwork object
criticNetwork = connectLayers(criticNetwork,"sPathOut","add/in1");
criticNetwork = connectLayers(criticNetwork,"aPathOut","add/in2");
criticNetwork = dlnetwork(criticNetwork);
% criticOpts = rlOptimizerOptions('LearnRate',1e-4);
% critic = rlValueFunction(criticNetwork,ObservationInfo);
critic = rlQValueFunction(criticNetwork, ...
ObservationInfo,ActionInfo,...
ObservationInputNames="NetObsInLayer", ...
ActionInputNames="NetActInLayer");
%% ACTOR
numAct = action_num;
actorNet = [
sequenceInputLayer(prod(ObservationInfo.Dimension))
%featureInputLayer(prod(ObservationInfo.Dimension))
convolution1dLayer(3, 16, padding = "casual") % 1-D convolution layer
layerNormalizationLayer % layer normalization
convolution1dLayer(16, 16, padding = "casual")
layerNormalizationLayer % layer normalization
reluLayer
fullyConnectedLayer(300)
reluLayer
fullyConnectedLayer(200)
reluLayer
fullyConnectedLayer(prod(ActionInfo.Dimension))
tanhLayer
scalingLayer(Scale=max(ActionInfo.UpperLimit))];
% Plot network
% plot(actorNet)
actorNetwork = dlnetwork(actorNet);
% actorOpts = rlOptimizerOptions('LearnRate',1e-4);
% actor = rlContinuousGaussianActor(actorNetwork,ObservationInfo,ActionInfo,...
% ActionMeanOutputNames="meanPathOut",...
% ActionStandardDeviationOutputNames="stdPathOut",...
% ObservationInputNames="comPathIn");
actor = rlContinuousDeterministicActor(actorNetwork,ObservationInfo,ActionInfo);
% act = getAction(actor,{rand(ObservationInfo.Dimension)}); % action check
%% Agent
criticOpts = rlOptimizerOptions(LearnRate=1e-03,GradientThreshold=1);
actorOpts = rlOptimizerOptions(LearnRate=5e-04,GradientThreshold=1);
agentOpts = rlDDPGAgentOptions(...
'ActorOptimizerOptions',actorOpts,...
'CriticOptimizerOptions',criticOpts,...
'ExperienceBufferLength', 3000,...
'SampleTime',1,...
'DiscountFactor',0.95,...
"MiniBatchSize",200 ...
);
agentOpts.SequenceLength = 5;
agentOpts.NoiseOptions.Variance = 0.4; % 0.4
agentOpts.NoiseOptions.VarianceDecayRate = 1e-5;
agent = rlDDPGAgent(actor,critic,agentOpts);
trainOpts = rlTrainingOptions(...
'MaxEpisodes',2000,...
'MaxStepsPerEpisode',300,...
'Plots','training-progress',...
'StopTrainingCriteria','AverageReward',...
'StopTrainingValue',5000,...
'ScoreAveragingWindowLength',30,...
'SaveAgentCriteria',"EpisodeReward",...
'SaveAgentValue',2500);
trainOpts.UseParallel = false;
trainOpts.ParallelizationOptions.StepsUntilDataIsSent = 30;
trainOpts.ParallelizationOptions.Mode = 'async';
%% train
doTraining = 1;
if doTraining
% Train the agent.
trainingStats = train(agent,env,trainOpts);
day = char(datetime, 'MMdd-HHmm');
file = ['TD3agent-realtime-response-SCED', day];
tips = 'both taker and maker are included in the market';
save(file,'agent','tips',"trainingStats");
else
% Load the pretrained agent for the example.
% load('Agent8803.mat','saved_agent');
load('agent-Q1-1009-1.mat','agent');
end
%% Simulation Reset and step function creation
% DataSet = sequence(2:end,:);
% Para.test = 1;
StepHandleT = @(Action,SavedData)myStepFunction(Action,SavedData,Para,Case_data, Bid_data, Res_data);
ResetHandleT = @()myResetFunction(DataSet,Para);
% Simulation environment
envT = rlFunctionEnv(ObservationInfo,ActionInfo,StepHandleT,ResetHandleT);
% Simulation on Test data
simOpts = rlSimulationOptions('MaxSteps',30);
for i = 1
experience = sim(envT,agent,simOpts);
ActionData(:,i) = experience.Action.MarketBidAction.Data(:);
end
% ActionData = reshape(ActionData,2,[]);

Answers (1)

Aditya on 20 Mar 2024
The error message you're encountering, "Variables of this type do not support indexing with curly braces," is likely related to how you're specifying options in your network layers. Specifically, the issue probably comes from how you're defining the padding in your convolutional layers. In MATLAB, layer options are typically set with name-value pair arguments rather than an equals sign (=).
For the convolutional layers where you have written padding = "casual", MATLAB ends up treating the expression as an attempt to index into a variable with curly braces {}, which is not supported for the types used here. The correct way to specify the padding option is as a name-value pair, without the equals sign. Additionally, the correct padding name in MATLAB's Deep Learning Toolbox is "causal", not "casual".
Here's the corrected portion of your code for defining convolutional layers with causal padding:
commonPath = [
additionLayer(2,Name="add")
convolution1dLayer(3, 16, 'Padding', "causal") % corrected padding specification
layerNormalizationLayer
convolution1dLayer(16, 16, 'Padding', "causal") % corrected padding specification
layerNormalizationLayer
reluLayer
fullyConnectedLayer(1,Name="CriticOutput")];
And similarly, for the actor network:
actorNet = [
sequenceInputLayer(prod(ObservationInfo.Dimension))
convolution1dLayer(3, 16, 'Padding', "causal") % corrected padding specification
layerNormalizationLayer
convolution1dLayer(16, 16, 'Padding', "causal") % corrected padding specification
layerNormalizationLayer
reluLayer
fullyConnectedLayer(300)
reluLayer
fullyConnectedLayer(200)
reluLayer
fullyConnectedLayer(prod(ActionInfo.Dimension))
tanhLayer
scalingLayer('Scale',max(ActionInfo.UpperLimit))]; % Corrected scale specification
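If it helps, here is a quick way to confirm that the corrected layer arrays assemble cleanly before recreating the agent: rebuild the networks exactly as in your script and pass the resulting dlnetwork objects to analyzeNetwork. This is a minimal sketch, assuming the rest of your script (statePath, actionPath, and the connectLayers calls) is unchanged.
% Sanity check (sketch): rebuild the critic graph with the corrected commonPath
criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = addLayers(criticNetwork,commonPath);
criticNetwork = connectLayers(criticNetwork,"sPathOut","add/in1");
criticNetwork = connectLayers(criticNetwork,"aPathOut","add/in2");
criticNetwork = dlnetwork(criticNetwork);
analyzeNetwork(criticNetwork) % Network Analyzer should report zero errors
% Same check for the actor
actorNetwork = dlnetwork(actorNet);
analyzeNetwork(actorNetwork)
Once both networks pass the analysis, you can recreate the critic and actor with rlQValueFunction and rlContinuousDeterministicActor exactly as in your original script and call train again.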
