Error appears when setting the multi-dimensional actions in Matlab Environment (Reinforcement Learning Toolbox)

Question

wujianfa93 on 27 May 2020

https://it.mathworks.com/matlabcentral/answers/535048-error-appears-when-setting-the-multi-dimensional-actions-in-matlab-environment-reinforcement-learni

Commented: Ryan Comeau on 1 Jun 2020

In the following code, three actions are defined, with ranges [0.1 10], [0.1 10] and [0 pi], respectively:

%% main.m
%% Observation
ObservationInfo = rlNumericSpec([7 1]);
ObservationInfo.Name = 'Obstacle Avoidance States';
ObservationInfo.Description = 'delta_x, delta_y, delta_z, delta_L, delta_V, pusi, theta';
%% Action
ActionInfo = rlNumericSpec([3 1],'LowerLimit',[0.1 0.1 0]','UpperLimit',[10 10 pi]');
ActionInfo.Name = 'Action'; 
%% Environment
env = rlFunctionEnv(ObservationInfo,ActionInfo,'myStepFunction','myResetFunction');
rng(0);
InitialObs = reset(env);

The actions are then assigned to three variables in the function myStepFunction(Action,LoggedSignals) as follows:

function [NextObs,Reward,IsDone,LoggedSignals] = myStepFunction(Action,LoggedSignals)
para_rho1 = Action(1);
para_rho2 = Action(2);
para_theta = Action(3);
......
end

Running main.m works without errors.

However, when running the following instruction, an error appears:

step(env,10)

Index exceeds the number of array elements (1)

Error: myStepFunction (line 23)

para_rho2 = Action(2);

Why does the dimension of the action change to 1? How can I fix this error?

If I set the variables para_rho2 and para_theta to constants and change the dimension of the action to [1 1] in rlNumericSpec, then step(env,10) executes normally.
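The error can be reproduced without the toolbox. The sketch below assumes that the second argument of step(env,...) is forwarded to the step function as Action, so step(env,10) supplies the scalar 10 rather than a 3-by-1 vector. readAction is a hypothetical stand-in for the indexing done in myStepFunction, and the vector values are arbitrary picks inside the stated ranges.

```matlab
% Hypothetical stand-in for the three indexing lines in myStepFunction.
readAction = @(Action) [Action(1); Action(2); Action(3)];

% What step(env,10) would supply as Action under the assumption above.
scalarAction = 10;
try
    readAction(scalarAction);
    scalarWorks = true;
catch
    % Action(2) does not exist for a one-element input.
    scalarWorks = false;
end

% A 3-by-1 vector matches the [3 1] action specification.
vectorAction = [1; 2; pi/2];
params = readAction(vectorAction);

% With the environment from main.m, the corresponding call would be
% step(env,vectorAction).
```

If that assumption holds, the action dimension is not being changed by the environment; the test call simply passes a one-element action.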

Answer 1

Ryan Comeau on 29 May 2020

https://it.mathworks.com/matlabcentral/answers/535048-error-appears-when-setting-the-multi-dimensional-actions-in-matlab-environment-reinforcement-learni#answer_442283

Hello, I've taken a look at the rocket lander environment that MATLAB provides as an example. There, every action is scaled between 0 and 1. The maximum values for the actions are stored in the environment properties as a vector defining the min and max values, and the actions get scaled inside the step function. I know this seems strange, and I'm not sure it's the best approach, but I'm not a veteran of RL. So, what you should do is the following:

%% main.m
%% Observation
ObservationInfo = rlNumericSpec([7 1]);
ObservationInfo.Name = 'Obstacle Avoidance States';
ObservationInfo.Description = 'delta_x, delta_y, delta_z, delta_L, delta_V, pusi, theta';
%% Action
ActionInfo = rlNumericSpec([3 1 1],'LowerLimit',0,'UpperLimit',1);
ActionInfo.Name = 'Action'; 
%stuff...
function [NextObs,Reward,IsDone,LoggedSignals] = myStepFunction(Action,LoggedSignals)
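% env is not in scope inside a function-based step function, so the maximum
% action values (taken from the UpperLimit in the question) are kept in a
% local vector here. Open the rocket lander example from MATLAB and take a
% look at how it stores its limits.
borders = [10 10 pi];
para_rho1 = Action(1).*borders(1);
para_rho2 = Action(2).*borders(2);
para_theta = Action(3).*borders(3);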
......
end

Hope this helps

RC
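Multiplying a normalized action by its maximum maps [0 1] onto [0 max], which would ignore the lower limits of 0.1 given in the question. A minimal sketch of a mapping that respects both limits is shown below. It assumes the agent outputs values in [0 1]; lowerLimit, upperLimit and scaleAction are illustrative names, not part of the toolbox.

```matlab
% Limits taken from the action ranges stated in the question.
lowerLimit = [0.1; 0.1; 0];
upperLimit = [10; 10; pi];

% Affine map from a normalized action in [0 1] to [lowerLimit upperLimit].
% a(:) forces a column so row and column inputs behave the same.
scaleAction = @(a) lowerLimit + a(:).*(upperLimit - lowerLimit);

scaledMin = scaleAction([0; 0; 0]);        % equals lowerLimit
scaledMax = scaleAction([1; 1; 1]);        % equals upperLimit (up to rounding)
scaledMid = scaleAction([0.5; 0.5; 0.5]);  % midpoint of each range
```

Inside a step function, the three parameters would then be read from scaleAction(Action) instead of from Action directly.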

wujianfa93 on 1 Jun 2020

Edited: wujianfa93 on 1 Jun 2020

Many thanks for your reply! I think I have solved this problem, though. When I went on to design the corresponding DDPG training process and started training, I found that the whole program runs correctly. So I guess verification with the step() function may not be necessary.

Ryan Comeau on 1 Jun 2020

Awesome, nice work.
