Multi action agent programming in reinforcement learning

Question

How can I program or represent a multi-action agent in reinforcement learning (DQN)? I was able to construct the agent, but I do not know how to represent its action, which consists of three decisions at every learning step, in the step function. The three decisions are charging the battery, operating the first generator, and operating the second generator. The first part of the code below shows how I construct the environment and the agent; the second part is the portion of my step function to which I want to add these actions.

Thank you in advance.

First part

clc

% Observation specification: 4-by-1 vector [T; SOC; SOF; Temp]
ObservationInfo = rlNumericSpec([4 1]);
ObservationInfo.Name = 'EnergSolar States';
ObservationInfo.Description = 'T,SOC,SOF,Temp';

% Action specification: 12 discrete actions, each a 1-by-3 vector
% [battery, first generator, second generator]
ActionInfo = rlFiniteSetSpec({[-1 0 0],[-1 1 0],[-1 0 1],[-1 1 1],[0 0 0],[0 1 0],[0 0 1],[0 1 1],[1 0 0],[1 1 0],[1 0 1],[1 1 1]});
ActionInfo.Name = 'EnergSolar Action';

% Custom environment built from the step and reset functions
env = rlFunctionEnv(ObservationInfo,ActionInfo,'myStepFunctionfuel','myResetFunctionfuel');
obsInfo = getObservationInfo(env);
numObservations = obsInfo.Dimension(1);
actInfo = getActionInfo(env);

% Critic network: a state path and an action path merged by an addition layer
statePath = [
    imageInputLayer([4 1 1], 'Normalization', 'none', 'Name', 'state')
    fullyConnectedLayer(200, 'Name', 'CriticStateFC1')
    reluLayer('Name', 'CriticRelu1')
    fullyConnectedLayer(200, 'Name', 'CriticStateFC2')];
actionPath = [
    imageInputLayer([1 3 1], 'Normalization', 'none', 'Name', 'action')
    fullyConnectedLayer(200, 'Name', 'CriticActionFC1')];
commonPath = [
    additionLayer(2,'Name', 'add')
    reluLayer('Name','CriticCommonRelu')
    fullyConnectedLayer(1, 'Name', 'output')];

criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork, actionPath);
criticNetwork = addLayers(criticNetwork, commonPath);
criticNetwork = connectLayers(criticNetwork,'CriticStateFC2','add/in1');
criticNetwork = connectLayers(criticNetwork,'CriticActionFC1','add/in2');

% Critic representation
criticOpts = rlRepresentationOptions('LearnRate',0.002,'GradientThreshold',1);
critic = rlRepresentation(criticNetwork,obsInfo,actInfo,...
    'Observation',{'state'},'Action',{'action'},criticOpts);

% DQN agent
agentOpts = rlDQNAgentOptions(...
    'UseDoubleDQN',false, ...
    'TargetUpdateMethod',"periodic", ...
    'TargetUpdateFrequency',4, ...
    'ExperienceBufferLength',100000, ...
    'DiscountFactor',0.99, ...
    'MiniBatchSize',1000);%500 to 1000
agent = rlDQNAgent(critic,agentOpts);

% Training
trainOpts = rlTrainingOptions(...
    'MaxEpisodes', 1000, ...
    'MaxStepsPerEpisode', 500, ...
    'Verbose', false, ...
    'Plots','training-progress',...
    'StopTrainingCriteria','EpisodeReward',...
    'StopTrainingValue',0,...
    'ScoreAveragingWindowLength',5);
trainingStats = train(agent,env,trainOpts);

Second part

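%Balance eq.
% Action1, Action2 and Action3 stand for the three elements of the action
% vector (battery, first generator, second generator).
Pg = PL - Ppv - bpr*(Action1);
if (Pg > Z)
    if (Pg-Z <= 150)
        % Only the first generator covers the deficit
        PDG1 = Pg - Z;
        PDG2 = 0;
        F = A*PDG1 + B*Pr;
        Pg = Z;
    elseif (Pg-Z < 350)
        % Only the second generator covers the deficit
        PDG2 = Pg - Z;
        F = A*PDG2 + B*Pr2;
        PDG1 = 0;
        Pg = Z;
    elseif (Pg-Z < 500)
        % Second generator at 350, first generator covers the rest if switched on
        PDG2 = 350;
        PDG1 = (Pg-Z-PDG2)*Action2;
        F = A*(PDG1+PDG2) + B*(Pr1*Action2 + Pr2*Action3);
        Pg = Pg - Z - PDG1 - PDG2;
    end
end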
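For context, a step function passed to rlFunctionEnv receives the selected action as its first input and returns the next observation, the reward, a termination flag, and the logged signals. The sketch below shows one way the three decisions could be unpacked before the balance equation. It is written as a plain script so it can be run on its own; the numeric values, the `LoggedSignals.State` field, the `maxT` horizon, and the zero reward are all assumptions for illustration, not part of the original code.

```matlab
% Body of a hypothetical step function with the signature
%   [NextObs,Reward,IsDone,LoggedSignals] = myStepFunctionfuel(Action,LoggedSignals)
% written as a script so it can be run stand-alone.

% Example inputs (hypothetical values)
Action = [1 0 1];                        % [battery, generator 1, generator 2]
LoggedSignals.State = [1; 400; 0; 25];   % [T; SOC; SOF; Temp]
PL = 900; Ppv = 300; bpr = 200;          % hypothetical load, PV power, battery rate

% Unpack the three decisions from the action vector
Action1 = Action(1);   % battery decision (-1, 0 or 1)
Action2 = Action(2);   % first generator on/off
Action3 = Action(3);   % second generator on/off

% First line of the balance equation from the question
Pg = PL - Ppv - bpr*Action1;

% ... the rest of the balance logic would use Action2 and Action3 here ...

% Advance the time index and write the state back (other states left unchanged)
State = LoggedSignals.State;
State(1) = State(1) + 1;
LoggedSignals.State = State;

NextObs = LoggedSignals.State;
Reward  = 0;                 % placeholder reward, an assumption
maxT    = 24;                % hypothetical episode horizon
IsDone  = State(1) > maxT;
```

With this layout, the existing code can keep the names Action1, Action2 and Action3 unchanged.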


Answer 1

Emmanouil Tzorakoleftherakis on 13 Jul 2020

This example shows how to create an environment with multiple discrete actions. Hope that helps.

Emmanouil Tzorakoleftherakis on 14 Jul 2020

All the elements are in ActionInfo.Elements. Is that what you need?
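As an illustration of what that element list looks like, the 12 action vectors can be generated in base MATLAB instead of being typed by hand. This is only a sketch: the variable names `combos` and `actionSet` are made up here, and the final call to rlFiniteSetSpec is left as a comment because it needs Reinforcement Learning Toolbox.

```matlab
% Enumerate every combination of the three decisions:
% battery in {-1,0,1}, generator 1 in {0,1}, generator 2 in {0,1}
[b, g1, g2] = ndgrid([-1 0 1], [0 1], [0 1]);
combos = [b(:) g1(:) g2(:)];          % 12-by-3 matrix, one action per row

% Convert to a 1-by-12 cell array of 1-by-3 vectors
actionSet = num2cell(combos, 2)';

% With Reinforcement Learning Toolbox installed, this cell array could be
% passed to the spec, and the same vectors would then be listed in
% ActionInfo.Elements:
% ActionInfo = rlFiniteSetSpec(actionSet);
```

The ordering differs from the hand-written list in the question, but the set of actions is the same.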

Nabil Jalil Aklo on 14 Jul 2020

Let me explain what I need with an example.

Suppose the action vector consists of three elements at a time:

ActionInfo = rlFiniteSetSpec({[-1 0 0],[-1 1 0],[-1 0 1],[-1 1 1],[0 0 0],[0 1 0],[0 0 1],[0 1 1],[1 0 0],[1 1 0],[1 0 1],[1 1 1]});

At some time step, let the action vector be Action=[-1 0 1]. Its elements represent three decisions: battery charging control, first generator control, and second generator control. I want to apply the first element of this vector in the equation below:

SOC=SOC+200*(first element of the action vector)

The question is: how can I extract the first element from the vector?

Thank you in advance.
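For readers with the same question: assuming the action reaches the step function as a numeric 1-by-3 vector, ordinary MATLAB indexing is enough to pull out each decision. A minimal sketch, where the starting SOC value and the names `batteryCmd`, `gen1Cmd` and `gen2Cmd` are hypothetical:

```matlab
Action = [-1 0 1];        % example action from the comment above
SOC = 500;                % hypothetical current state of charge

batteryCmd = Action(1);   % first element: battery decision
gen1Cmd    = Action(2);   % second element: first generator
gen2Cmd    = Action(3);   % third element: second generator

% Equation from the comment, using the first element of the action vector
SOC = SOC + 200*batteryCmd;
```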
