How to apply an action in an RL MATLAB environment

Hello everyone,
I have a functioning environment, and now I want to make it more complex, but I'm not sure how. The code displays a summary and part of my environment. I have 5 objects on which I apply actions, and the possible actions are either 0 or 1 (32 possible actions). These actions are applied to element 7 of the array "mult". Now, I want to do the same but for elements 7, 10, 13, and 16. In other words, the 32 possible actions should be executed at these 4 instances, and the actions at these instances may be the same or different. Let me explain with an example: action 4 is performed at 7, action 24 at 10, action 4 at 13, and action 18 at 16. The only idea I came up with, although not feasible, is to define all possible combinations (20 objects, each with 0 or 1). Could you please provide some guidance?
% Observation information
ObservationInfo = rlNumericSpec([1 99]);
% Action information
ActionInfo = rlFiniteSetSpec({[0 0 0 0 0], ...
[0 0 0 0 1], ...
[0 0 0 1 0], ...
[0 0 0 1 1], ...
[0 0 1 0 0], ...
[0 0 1 0 1], ...
[0 0 1 1 0], ...
[0 0 1 1 1], ...
[0 1 0 0 0], ...
[0 1 0 0 1], ...
[0 1 0 1 0], ...
[0 1 0 1 1], ...
[0 1 1 0 0], ...
[0 1 1 0 1], ...
[0 1 1 1 0], ...
[0 1 1 1 1], ...
[1 0 0 0 0], ...
[1 0 0 0 1], ...
[1 0 0 1 0], ...
[1 0 0 1 1], ...
[1 0 1 0 0], ...
[1 0 1 0 1], ...
[1 0 1 1 0], ...
[1 0 1 1 1], ...
[1 1 0 0 0], ...
[1 1 0 0 1], ...
[1 1 0 1 0], ...
[1 1 0 1 1], ...
[1 1 1 0 0], ...
[1 1 1 0 1], ...
[1 1 1 1 0], ...
[1 1 1 1 1]});
function [NextObservation, Reward, IsDone, UpdatedInfo] = StepFunctTest(Action, Info)
% Start communication with OpenDSS
% Load Info into loads and irradiance
% Apply action
DSSText.command = sprintf('New LoadShape.estado_cap_1 npts=24 interval=1 mult=(0 0 0 0 0 0 %d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0)', Action(1));
DSSText.command = sprintf('New LoadShape.estado_cap_2 npts=24 interval=1 mult=(0 0 0 0 0 0 %d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0)', Action(2));
DSSText.command = sprintf('New LoadShape.estado_cap_3 npts=24 interval=1 mult=(0 0 0 0 0 0 %d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0)', Action(3));
DSSText.command = sprintf('New LoadShape.estado_cap_4 npts=24 interval=1 mult=(0 0 0 0 0 0 %d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0)', Action(4));
DSSText.command = sprintf('New LoadShape.estado_cap_5 npts=24 interval=1 mult=(0 0 0 0 0 0 %d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0)', Action(5));
% Configure OpenDSS simulation
% Solve the system
% Obtain NextObservation (voltages)
% Calculate reward
% Determine if the episode is done
IsDone = Reward ~= 0;
end
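The 32-entry action table above follows binary counting order, so it could also be generated instead of typed out. This is a minimal sketch, assuming the same most-significant-bit-first ordering as the list in the question. The names bitsTable and actionCells are hypothetical, and the rlFiniteSetSpec call is left commented out because it needs Reinforcement Learning Toolbox:

```matlab
% Build all 32 five-bit patterns, one per row, most significant bit first.
% dec2bin returns a 32x5 char array; subtracting '0' converts it to numbers.
bitsTable = dec2bin(0:31, 5) - '0';

% One 1x5 vector per cell, laid out as a row cell array like the list above
actionCells = num2cell(bitsTable, 2)';

% With Reinforcement Learning Toolbox installed:
% ActionInfo = rlFiniteSetSpec(actionCells);

disp(bitsTable(6, :))   % sixth entry of the table, i.e. index 5
```

Row k of bitsTable corresponds to index k-1, so the sixth row matches the sixth vector in the hand-written list.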

Accepted Answer

Maneet Kaur Bagga on 29 Mar 2024
Hi,
To apply actions to multiple elements (7, 10, 13, and 16) in the "mult" array, you can use a multi-dimensional action space, where each dimension represents the action applied at one of those elements. Since each element can take one of 32 possible actions, choosing them independently for 4 elements gives 32^4 = 1,048,576 combinations. You will then need to encode and decode the actions in your environment's step function.
Encoding Actions:
You can encode the actions as integers. For example, an action could be represented as a single integer in the range [0, 1048575], where 1048575 = 32^4 - 1. This integer can then be decoded into the 4 actions for elements 7, 10, 13, and 16.
To encode and decode actions, you can use base-32 representation since you have 32 possible states for each action.
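As a small illustration of the base-32 idea, the sketch below packs four per-slot indices into one integer and unpacks them again. It uses the example from the question (4, 24, 4, 18) and assumes these are 0-based indices in the range 0 to 31. The variable names are hypothetical:

```matlab
% Indices (0-31) for elements 7, 10, 13, and 16; assumed 0-based
slotActions = [4 24 4 18];

% Encode: treat the four indices as base-32 digits, least significant first
weights = 32 .^ (0:3);                 % [1 32 1024 32768]
code = sum(slotActions .* weights);    % single integer in [0, 32^4 - 1]

% Decode: peel off one base-32 digit at a time
decoded = zeros(1, 4);
remaining = code;
for i = 1:4
    decoded(i) = mod(remaining, 32);
    remaining = floor(remaining / 32);
end

fprintf('code = %d, decoded = [%s]\n', code, num2str(decoded));
```

The round trip recovers the original four indices, which is the property the step function relies on.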
Decoding Actions in the Step Function:
When you receive an action in your "StepFunctTest", it will be a single integer. You need to decode this integer into 4 separate actions. Here's how you could do it:
function [NextObservation, Reward, IsDone, UpdatedInfo] = StepFunctTest(Action, Info)
    % Decode the single integer action (0 to 32^4 - 1) into 4 base-32 digits
    actions = zeros(1, 4); % Initialize array to hold decoded actions
    for i = 1:4
        actions(i) = mod(Action, 32); % Get remainder (current action)
        Action = floor(Action / 32);  % Reduce Action for the next iteration
    end
    % Now, actions(1), actions(2), actions(3), and actions(4) are the action
    % indices (0-31) for elements 7, 10, 13, and 16 respectively
    % Map each index to its 5-bit pattern: row k = element, column c = capacitor
    bits = zeros(4, 5);
    for k = 1:4
        bits(k, :) = actionMap(actions(k));
    end
    % Apply actions: one LoadShape per capacitor, with values at elements 7, 10, 13, and 16
    for c = 1:5
        DSSText.command = sprintf('New LoadShape.estado_cap_%d npts=24 interval=1 mult=(0 0 0 0 0 0 %d 0 0 %d 0 0 %d 0 0 %d 0 0 0 0 0 0 0 0)', c, bits(:, c));
    end
    % Continue with the rest of your step function
end
function binaryAction = actionMap(index)
    % Map the index (0-31) to a 1x5 binary vector, most significant bit first.
    % This assumes the same ordering as the action table in the question;
    % adapt the mapping if your table is ordered differently.
    binaryAction = bitget(index, 5:-1:1);
end
This workaround allows you to extend your RL environment to handle actions on multiple elements without explicitly defining every possible combination.
Hope this helps!
1 Comment
Bryan on 1 Apr 2024
Hi.
Thank you for your response. It allowed me to broaden my search. I'm not sure if what I applied is what you're trying to tell me. Nonetheless, I arrived at an alternative. Could you review if what I implemented is correct? I have another question: With so many possible combinations (32^4 = 1,048,576), will the DQN agent be able to learn? Will the problem converge with so many combinations?
% 32 possible actions of 4 elements.
t1 = 0:31; % 7h
t2 = 0:31; % 10h
t3 = 0:31; % 13h
t4 = 0:31; % 16h
[T4, T3, T2, T1] = ndgrid(t4, t3, t2, t1);
actions = reshape(cat(5, T1, T2, T3, T4), [], 4);
ActionInfo = rlFiniteSetSpec(num2cell(actions, 2));
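One way to sanity-check this construction is to run the same ndgrid/reshape pattern on a smaller range, where the result is easy to inspect. This is a sketch assuming 2 values per slot instead of 32. The variable names are hypothetical:

```matlab
% Same pattern as above, but with 2 values per slot (2^4 = 16 combinations)
s = 0:1;
[S4, S3, S2, S1] = ndgrid(s, s, s, s);
smallActions = reshape(cat(5, S1, S2, S3, S4), [], 4);

% Every row should be a distinct [slot1 slot2 slot3 slot4] combination
nRows = size(smallActions, 1);
nUnique = size(unique(smallActions, 'rows'), 1);
fprintf('%d rows, %d unique\n', nRows, nUnique);
disp(smallActions(1:3, :))   % the last column varies fastest
```

If the row count equals the number of unique rows, the grid covers each combination exactly once. The same check applies to the full 32-value case.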
In my StepFunctTest:
function [NextObservation, Reward, IsDone, UpdatedInfo] = StepFunctTest(Action, Info)
% rest of the step function
% process the actions and obtain each action as a binary vector
[Action7h, Action10h, Action13h, Action16h] = BinaryAction(Action);
% apply the actions
DSSText.command = sprintf('New LoadShape.estado_cap_1 npts=24 interval=1 mult=(0 0 0 0 0 0 %d 0 0 %d 0 0 %d 0 0 %d 0 0 0 0 0 0 0 0)',Action7h(1), Action10h(1), Action13h(1), Action16h(1));
DSSText.command = sprintf('New LoadShape.estado_cap_2 npts=24 interval=1 mult=(0 0 0 0 0 0 %d 0 0 %d 0 0 %d 0 0 %d 0 0 0 0 0 0 0 0)',Action7h(2), Action10h(2), Action13h(2), Action16h(2));
DSSText.command = sprintf('New LoadShape.estado_cap_3 npts=24 interval=1 mult=(0 0 0 0 0 0 %d 0 0 %d 0 0 %d 0 0 %d 0 0 0 0 0 0 0 0)',Action7h(3), Action10h(3), Action13h(3), Action16h(3));
DSSText.command = sprintf('New LoadShape.estado_cap_4 npts=24 interval=1 mult=(0 0 0 0 0 0 %d 0 0 %d 0 0 %d 0 0 %d 0 0 0 0 0 0 0 0)',Action7h(4), Action10h(4), Action13h(4), Action16h(4));
DSSText.command = sprintf('New LoadShape.estado_cap_5 npts=24 interval=1 mult=(0 0 0 0 0 0 %d 0 0 %d 0 0 %d 0 0 %d 0 0 0 0 0 0 0 0)',Action7h(5), Action10h(5), Action13h(5), Action16h(5));
% rest of the step function
end
Function to convert Action to binary and apply each action:
function [Action7h, Action10h, Action13h, Action16h] = BinaryAction(Action)
Action7hDec = Action(1);
Action10hDec = Action(2);
Action13hDec = Action(3);
Action16hDec = Action(4);
% Convert each index to a 5-bit binary vector (most significant bit first) using bitget
Action7h = bitget(Action7hDec, 5:-1:1);
Action10h = bitget(Action10hDec, 5:-1:1);
Action13h = bitget(Action13hDec, 5:-1:1);
Action16h = bitget(Action16hDec, 5:-1:1);
end
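To see what this conversion produces, the sketch below runs the same bitget call on the example indices from the question (4, 24, 4, 18, assumed 0-based). It then builds the "mult" vector for one capacitor by indexing rather than with a hard-coded format string. The names hours, bits, mult, and cmd are hypothetical, and the command string is only printed, not sent to OpenDSS:

```matlab
Action = [4 24 4 18];        % indices for 7h, 10h, 13h, 16h
hours  = [7 10 13 16];       % positions in the 24-element mult array

% Row k holds the 5-bit pattern for hours(k); column c is capacitor c
bits = zeros(4, 5);
for k = 1:4
    bits(k, :) = bitget(Action(k), 5:-1:1);
end

% Build the mult vector for capacitor 1 and format the LoadShape command
cap = 1;
mult = zeros(1, 24);
mult(hours) = bits(:, cap);
multStr = strtrim(sprintf('%d ', mult));
cmd = sprintf('New LoadShape.estado_cap_%d npts=24 interval=1 mult=(%s)', cap, multStr);
disp(cmd)
```

Indexing into a zero vector keeps the hour positions in one place, so changing the hours does not require editing five format strings.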
Thank you again for your previous response. I look forward to your reply.
Bryan.


Release

R2023b
