Reinforcement Learning - Multiple Discrete Actions

I would like to use a DQN agent with multiple continuous states (observations) and two action signals, each with three possible values, for a total of 9 combinations. The following lines show what I mean:
a = [-2,0,2];
b = [-3,0,3];
[A,B] = meshgrid(a,b);
actions = reshape(cat(2,A',B'),[],2);
To create the discrete action specification, I convert the matrix into a cell array and run:
actionInfo = rlFiniteSetSpec(num2cell(actions,2));
actionInfo.Name = 'actions';
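As a quick check of what these lines produce, the sketch below rebuilds the action matrix and inspects the cell conversion using only base MATLAB functions. The variable actionCell is introduced here for illustration and is not part of the original post.

```matlab
% Rebuild the 9x2 matrix of action combinations from the question.
a = [-2,0,2];
b = [-3,0,3];
[A,B] = meshgrid(a,b);
actions = reshape(cat(2,A',B'),[],2);

% Each row is one combination [a_i, b_j]; there are 3*3 = 9 rows.
disp(size(actions))          % 9 2

% num2cell(...,2) packs each row into its own cell.
actionCell = num2cell(actions,2);   % illustrative name
disp(size(actionCell))       % 9 1
disp(size(actionCell{1}))    % 1 2
```

So each element handed to rlFiniteSetSpec is a 1x2 row vector, which is the size the rest of the network has to agree with.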
Additionally, in DQN the critic consists of a deep neural network. I have created the critic as follows:
% Create a DNN for the critic:
hiddenLayerSize = 48;
observationPath = [
imageInputLayer([numObs 1 1],'Normalization','none',...
actionPath = [
imageInputLayer([value 1 1],'Normalization','none','Name','action')
% Create the layerGraph:
criticNetwork = layerGraph(observationPath);
criticNetwork = addLayers(criticNetwork,actionPath);
% Connect actionPath to observationPath:
criticNetwork = connectLayers(criticNetwork,'CriticActionFC1','add/in2');
% Specify options for the critic representation:
criticOpts = rlRepresentationOptions('LearnRate',1e-03,...
% Create the critic representation using the specified DNN and options:
critic = rlRepresentation(criticNetwork,observationInfo,actionInfo,...
% set the desired options for the agent:
agentOptions = rlDQNAgentOptions(...
My problem is the first image input layer of the action path, imageInputLayer([value 1 1],'Normalization','none','Name','action'). I have tried 1, 2, 9 and 18 for value, but all of them result in an error when I run
agent = rlDQNAgent(critic,agentOptions);
This is because actionInfo has a cell of 9 elements, each with a double vector of dimensions [1,2], whereas the imageInputLayer is expecting dimensions [value,1,1].
So, how can I set up a DQN agent in MATLAB with two main discrete action signals, each with three possible values?
Many thanks in advance for the help!
2 Comments
Clemens Fricke on 11 Jul 2019
I am not sure if I should open a new thread for this, but since it is very close to this question I will try asking here first.
I am trying to use the PG agent with multiple discrete actions, and I have no idea what the last layer of the action network should look like.
I have [2,62] actions (2 parameters, each with 62 discrete states), and the output layer only accepts a positive integer, not a vector. I have tried 2 (the number of parameters) and 124 (the number of possible actions). Both give me the same error:
Error using categorical (line 337)
Could not find unique values in VALUESET using the UNIQUE function.
Error in rl.util.rlLayerRepresentation/buildNetwork (line 719)
categorical(ActionValues, ActionValues);
Error in rl.util.rlLayerRepresentation/setLoss (line 175)
this = buildNetwork(this);
Error in rl.agent.rlPGAgent/setActorRepresentation (line 339)
actor = setLoss(actor,'cte','EntropyLossWeight',opt.EntropyLossWeight);
Error in rl.agent.rlPGAgent (line 47)
this = setActorRepresentation(this,actor,opt);
Error in rlPGAgent (line 21)
Agent = rl.agent.rlPGAgent(varargin{:});
Error in DQN (line 67)
agent = rlPGAgent(actor,baseline,agentOpts);
Caused by:
Error using cell/unique (line 85)
Cell array input must be a cell array of character vectors.
I have attached the file to this comment.
Enrico Anderlini on 30 Aug 2019
Sorry, but I have just seen this.
Do you have 62 states and 2 actions? Or 2 states, 62 actions? Or 124 actions?
I would not recommend a large number of actions, as it will cause learning problems.


Accepted Answer

Emmanouil Tzorakoleftherakis
Hi Enrico,
actionPath = [
imageInputLayer([1 2 1],'Normalization','none','Name','action')
Each action in your code is 1x2, which should be reflected in the dimensions of the action path input.
I hope this helps.
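To make the sizing rule concrete, here is a minimal sketch that derives the input size from one element of the action set instead of hard-coding it. The names actionElement, actionInputSize, and actionInput are assumptions for illustration; the layer itself is built only if imageInputLayer (Deep Learning Toolbox) is available on the path.

```matlab
% One element of the finite action set from the question (a 1x2 row vector).
actionElement = [-2 -3];

% Input size for the action path: the element size plus a trailing
% singleton dimension, matching the [1 2 1] used above.
actionInputSize = [size(actionElement) 1];
disp(actionInputSize)        % 1 2 1

% Build the input layer only when the toolbox function is on the path.
if exist('imageInputLayer','file')
    actionInput = imageInputLayer(actionInputSize, ...
        'Normalization','none','Name','action');
end
```

With three action signals per element, the same expression would presumably give [1 3 1].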
1 Comment
Enrico Anderlini on 12 Jun 2019
Hi Emmanouil,
many thanks for the help!
It works great. I would add that then you need a reshape block in Simulink, but that is no problem. It is much faster than a mapping with an additional C-coded S-function.
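For comparison, the kind of index mapping mentioned here could look roughly like this in plain MATLAB. This is only a sketch, assuming the agent outputs a scalar index from 1 to 9; the names idx, ia, and ib are illustrative and do not come from the thread.

```matlab
% Value sets of the two action signals, as in the question.
a = [-2,0,2];
b = [-3,0,3];

% Assumed scalar action index in 1..9 (hypothetical agent output).
idx = 6;

% Decode the index into one subscript per action signal, with the
% first signal varying fastest.
[ia, ib] = ind2sub([numel(a) numel(b)], idx);
action = [a(ia) b(ib)];
disp(action)                 % 2 0
```

The decoded pair matches row idx of the actions matrix built in the question, so both approaches describe the same set of 9 actions.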


More Answers (1)

kqha1025 kqha1025 on 23 Jun 2022
Is there a function that builds "actions" for the general case of multiple action signals, e.g. 3 actions, each with its own array of values? (The case above has two action arrays.)
Thank you very much.
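One possible way to generalize the construction, sketched with base MATLAB only: ndgrid accepts any number of grid vectors, so the combinations for n action signals can be built in a few lines. The variable names valueSets and grids, and the third value set, are made up for illustration.

```matlab
% One vector of allowed values per action signal (third set is illustrative).
valueSets = {[-2 0 2], [-3 0 3], [-1 0 1]};
n = numel(valueSets);

% Build one n-dimensional grid per action signal.
grids = cell(1,n);
[grids{:}] = ndgrid(valueSets{:});

% Flatten each grid into a column and place the columns side by side:
% each row of "actions" is one combination, first signal varying fastest.
actions = cell2mat(cellfun(@(g) g(:), grids, 'UniformOutput', false));
disp(size(actions))          % 27 3

% As in the question, the rows can then be packed into cells, e.g.
% actionInfo = rlFiniteSetSpec(num2cell(actions,2));  % needs Reinforcement Learning Toolbox
```

Following the accepted answer, the action input layer would then presumably need size [1 n 1], here [1 3 1].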



