How to apply an action in an RL MATLAB environment

Question

Bryan on 27 Mar 2024

Direct link to this question

https://it.mathworks.com/matlabcentral/answers/2099736-how-to-apply-an-action-in-an-rl-matlab-environment

Commented: Bryan on 1 Apr 2024

Hello everyone,

I have a working environment and want to make it more complex, but I'm not sure how. The code below shows a summary of part of my environment. I have 5 objects, and the action for each one is either 0 or 1, which gives 2^5 = 32 possible actions. These actions are currently applied at element 7 of the array "mult". Now I want to do the same at elements 7, 10, 13, and 16. In other words, one of the 32 possible actions should be chosen at each of these 4 instants, and the actions at different instants may be the same or different. For example: action 4 is performed at 7, action 24 at 10, action 4 at 13, and action 18 at 16. The only idea I came up with, although it does not seem feasible, is to define all possible combinations (20 binary values, that is, 2^20 = 1,048,576 combinations). Could you please provide some guidance?

% Observation information
ObservationInfo = rlNumericSpec([1 99]);
% Action information
ActionInfo = rlFiniteSetSpec({[0 0 0 0 0], ...
                              [0 0 0 0 1], ...
                              [0 0 0 1 0], ...
                              [0 0 0 1 1], ...
                              [0 0 1 0 0], ...
                              [0 0 1 0 1], ...
                              [0 0 1 1 0], ...
                              [0 0 1 1 1], ...
                              [0 1 0 0 0], ...
                              [0 1 0 0 1], ...
                              [0 1 0 1 0], ...
                              [0 1 0 1 1], ...
                              [0 1 1 0 0], ...
                              [0 1 1 0 1], ...
                              [0 1 1 1 0], ...
                              [0 1 1 1 1], ...
                              [1 0 0 0 0], ...
                              [1 0 0 0 1], ...
                              [1 0 0 1 0], ...
                              [1 0 0 1 1], ...
                              [1 0 1 0 0], ...
                              [1 0 1 0 1], ...
                              [1 0 1 1 0], ...
                              [1 0 1 1 1], ...
                              [1 1 0 0 0], ...
                              [1 1 0 0 1], ...
                              [1 1 0 1 0], ...
                              [1 1 0 1 1], ...
                              [1 1 1 0 0], ...
                              [1 1 1 0 1], ...
                              [1 1 1 1 0], ...
                              [1 1 1 1 1]});
function [NextObservation, Reward, IsDone, UpdatedInfo] = StepFunctTest(Action, Info)
% Start communication with OpenDSS
% Load Info into loads and irradiance
% Apply action
DSSText.command = sprintf('New LoadShape.estado_cap_1   npts=24   interval=1   mult=(0 0 0 0 0 0 %d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0)', Action(1));
DSSText.command = sprintf('New LoadShape.estado_cap_2   npts=24   interval=1   mult=(0 0 0 0 0 0 %d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0)', Action(2));
DSSText.command = sprintf('New LoadShape.estado_cap_3   npts=24   interval=1   mult=(0 0 0 0 0 0 %d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0)', Action(3));
DSSText.command = sprintf('New LoadShape.estado_cap_4   npts=24   interval=1   mult=(0 0 0 0 0 0 %d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0)', Action(4));
DSSText.command = sprintf('New LoadShape.estado_cap_5   npts=24   interval=1   mult=(0 0 0 0 0 0 %d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0)', Action(5));
% Configure OpenDSS simulation
% Solve the system
% Obtain NextObservation (voltages)
% Calculate reward
% Determine if the episode is done
IsDone = Reward ~= 0;
end
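As a side note, the 32 rows listed by hand above can also be generated programmatically. The following is a minimal sketch, assuming the same row order as the list in the post; the names nObjects, actionMatrix, and actionCells are illustrative and do not come from the original code.

```matlab
% Sketch: build the 32 binary action vectors without typing them out.
nObjects = 5;                                              % number of objects (from the post)
actionMatrix = dec2bin(0:2^nObjects - 1, nObjects) - '0';  % 32-by-5 matrix of 0/1 values
actionCells = num2cell(actionMatrix, 2);                   % 32-by-1 cell, each entry a 1-by-5 row

% The cell array can then be passed to the spec as in the post
% (requires Reinforcement Learning Toolbox):
% ActionInfo = rlFiniteSetSpec(actionCells);
```

Row 4 of actionMatrix is [0 0 0 1 1], which matches the fourth entry of the hand-written list.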


Answer 1

Maneet Kaur Bagga on 29 Mar 2024

Direct link to this answer

https://it.mathworks.com/matlabcentral/answers/2099736-how-to-apply-an-action-in-an-rl-matlab-environment#answer_1433266


Hi,

To apply actions at multiple elements (7, 10, 13, and 16) of the "mult" array, you can use a multi-dimensional action space in which each dimension represents the action applied at one of those elements. Since each element can take one of 32 possible actions, treating the 4 elements independently gives 32^4 = 1,048,576 combinations in total. You then need to encode and decode the actions in your environment's step function.

Encoding Actions:

You can encode the actions as integers. For example, an action could be represented as a single integer in the range [0, 1048575], where 1048575 = 32^4 - 1. This integer can then be decoded into the 4 actions for elements 7, 10, 13, and 16.

To encode and decode actions, you can use base-32 representation since you have 32 possible states for each action.
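The base-32 encoding can be illustrated with a short round trip. This is only a sketch under the assumption that the first element (7) is the least significant digit, which matches the mod/floor decoding order used in the step function of this answer; the variable names slotActions, weights, code, and decoded are illustrative.

```matlab
% Sketch: encode four 0-31 actions into one integer and decode them again.
slotActions = [4 24 4 18];             % example from the question: elements 7, 10, 13, 16
weights = 32 .^ (0:3);                 % [1 32 1024 32768], element 7 is least significant
code = sum(slotActions .* weights);    % single integer in the range 0 to 32^4 - 1

decoded = zeros(1, 4);
remaining = code;
for i = 1:4
    decoded(i) = mod(remaining, 32);   % current base-32 digit
    remaining = floor(remaining / 32); % drop that digit
end

% With this encoding, a scalar action set could be declared as
% rlFiniteSetSpec(0:32^4 - 1) (requires Reinforcement Learning Toolbox).
```

For the example values, code is 594692 and decoded equals slotActions again.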

Decoding Actions in the Step Function:

When you receive an action in your "StepFunctTest", it will be a single integer. You need to decode this integer into 4 separate actions. Here's how you could do it:

function [NextObservation, Reward, IsDone, UpdatedInfo] = StepFunctTest(Action, Info)
    % Decode the single integer action (0 to 32^4 - 1) into 4 base-32 digits
    actions = zeros(1, 4); % Decoded actions, each in the range 0-31
    for i = 1:4
        actions(i) = mod(Action, 32); % Get remainder (current action)
        Action = floor(Action / 32);  % Reduce Action for the next iteration
    end
    % Now, actions(1), actions(2), actions(3), and actions(4) represent the actions
    % for elements 7, 10, 13, and 16 respectively
    % Map each 0-31 action to its 5-bit pattern (one bit per object)
    bits = zeros(4, 5);
    for i = 1:4
        bits(i, :) = actionMap(actions(i));
    end
    % Apply actions: object k receives bit k at elements 7, 10, 13, and 16
    for k = 1:5
        DSSText.command = sprintf(['New LoadShape.estado_cap_%d npts=24 interval=1 ' ...
            'mult=(0 0 0 0 0 0 %d 0 0 %d 0 0 %d 0 0 %d 0 0 0 0 0 0 0 0)'], k, bits(:, k));
    end
    % Continue with the rest of your step function
end
function binaryAction = actionMap(index)
    % Map the index (0-31) to a 1-by-5 binary vector, most significant bit first.
    % This ordering matches the rows of the original ActionInfo list;
    % adapt the mapping if your environment needs a different one.
    binaryAction = bitget(index, 5:-1:1);
end

This workaround allows you to extend your RL environment to handle actions on multiple elements without explicitly defining every possible combination.

Hope this helps!


Bryan on 1 Apr 2024


Hi.

Thank you for your response. It allowed me to broaden my search. I'm not sure if what I applied is what you're trying to tell me. Nonetheless, I arrived at an alternative. Could you review if what I implemented is correct? I have another question: With so many possible combinations (32^4 = 1,048,576), will the DQN agent be able to learn? Will the problem converge with so many combinations?

% 32 possible actions of 4 elements.
t1 = 0:31; % 7h
t2 = 0:31; % 10h
t3 = 0:31; % 13h
t4 = 0:31; % 16h
[T4, T3, T2, T1] = ndgrid(t4, t3, t2, t1);
actions = reshape(cat(5, T1, T2, T3, T4), [], 4);
ActionInfo = rlFiniteSetSpec(num2cell(actions, 2));
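One way to check the ndgrid/cat/reshape construction is to run it on a reduced case. The sketch below uses 2 values per slot instead of 32 (an assumption made only to keep the output small) and confirms that every combination appears exactly once, with the first column changing slowest.

```matlab
% Sketch: same construction as above, reduced to 2 values per slot.
t = 0:1;
[T4, T3, T2, T1] = ndgrid(t, t, t, t);
actions = reshape(cat(5, T1, T2, T3, T4), [], 4);  % 16-by-4, one combination per row

nUnique = size(unique(actions, 'rows'), 1);        % number of distinct rows
sameAsBinaryCount = isequal(actions, dec2bin(0:15, 4) - '0');
```

Here nUnique is 16 and sameAsBinaryCount is true, so the rows enumerate all combinations in counting order. With t = 0:31 the same pattern gives 32^4 rows.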

In my StepFunctTest:

function [NextObservation, Reward, IsDone, UpdatedInfo] = StepFunctTest(Action, Info)
% rest of the step function
% process the actions and obtain the actions in binary number
[Action7h, Action10h, Action13h, Action16h] = BinaryAction(Action);
% apply the actions
DSSText.command = sprintf('New LoadShape.estado_cap_1   npts=24   interval=1   mult=(0 0 0 0 0 0 %d 0 0 %d 0 0 %d 0 0 %d 0 0 0 0 0 0 0 0)',Action7h(1), Action10h(1), Action13h(1), Action16h(1));
DSSText.command = sprintf('New LoadShape.estado_cap_2   npts=24   interval=1   mult=(0 0 0 0 0 0 %d 0 0 %d 0 0 %d 0 0 %d 0 0 0 0 0 0 0 0)',Action7h(2), Action10h(2), Action13h(2), Action16h(2));
DSSText.command = sprintf('New LoadShape.estado_cap_3   npts=24   interval=1   mult=(0 0 0 0 0 0 %d 0 0 %d 0 0 %d 0 0 %d 0 0 0 0 0 0 0 0)',Action7h(3), Action10h(3), Action13h(3), Action16h(3));
DSSText.command = sprintf('New LoadShape.estado_cap_4   npts=24   interval=1   mult=(0 0 0 0 0 0 %d 0 0 %d 0 0 %d 0 0 %d 0 0 0 0 0 0 0 0)',Action7h(4), Action10h(4), Action13h(4), Action16h(4));
DSSText.command = sprintf('New LoadShape.estado_cap_5   npts=24   interval=1   mult=(0 0 0 0 0 0 %d 0 0 %d 0 0 %d 0 0 %d 0 0 0 0 0 0 0 0)',Action7h(5), Action10h(5), Action13h(5), Action16h(5));
% rest of the step function
end

Function to convert Action to binary and apply each action:

function [Action7h, Action10h, Action13h, Action16h] = BinaryAction(Action)
Action7hDec = Action(1);
Action10hDec = Action(2);
Action13hDec = Action(3);
Action16hDec = Action(4);
% Convert each value (0-31) to a 5-bit vector, most significant bit first
Action7h = bitget(Action7hDec, 5:-1:1);
Action10h = bitget(Action10hDec, 5:-1:1);
Action13h = bitget(Action13hDec, 5:-1:1);
Action16h = bitget(Action16hDec, 5:-1:1);
end
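As a worked example of this conversion, the sketch below applies the same bitget call to the values from the original question (4, 24, 4, 18) and builds the five LoadShape commands in a loop. The names bits and cmds are illustrative, and the commands are only stored as text here rather than sent to OpenDSS.

```matlab
% Sketch: convert one 1-by-4 action to bits and build the five commands.
Action = [4 24 4 18];                  % actions at 7h, 10h, 13h, 16h
bits = zeros(4, 5);                    % row i: 5-bit pattern for time slot i
for i = 1:4
    bits(i, :) = bitget(Action(i), 5:-1:1);
end

cmds = cell(1, 5);
for k = 1:5
    % Object k takes bit k of each time slot (elements 7, 10, 13, 16 of mult)
    cmds{k} = sprintf(['New LoadShape.estado_cap_%d   npts=24   interval=1   ' ...
        'mult=(0 0 0 0 0 0 %d 0 0 %d 0 0 %d 0 0 %d 0 0 0 0 0 0 0 0)'], k, bits(:, k));
end
```

For these values, bits(1, :) is [0 0 1 0 0] and bits(2, :) is [1 1 0 0 0], so the first object is switched on only at elements 10 and 16.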

Thank you again for your previous response. I look forward to your reply.

Bryan.
