RL agent's actions change when deployed in a different environment

Jayalath Achchige Damsara Udan Jayarathne on 19 Mar 2023

Commented: Jayalath Achchige Damsara Udan Jayarathne on 23 Mar 2023

Hi all,

I have an RL agent trained in an environment (env 1, a Simulink model). The sample time of the agent is 0.1 s, and a variable-step solver (Dormand-Prince) is used to solve each episode. After training is complete, I export the same agent to a different environment (env 2, which is more complex than env 1) and deploy it without changing any parameters. This environment has no randomness built into it, and it is also solved using a variable-step solver (Dormand-Prince). However, when I run simulations in env 2 with the same initial conditions, I get different results (let's say the trajectory I am calculating changes each time I run the simulation).

Why does this happen even when there is no randomness in the simulation model? If I run the simulation without the agent, using the same initial conditions, I get a solution that does not change from run to run.

Please let me know if anyone has encountered this. Thank you!

Answers (1)

Emmanouil Tzorakoleftherakis on 20 Mar 2023

Edited: Emmanouil Tzorakoleftherakis on 20 Mar 2023

A couple of suggestions/comments:

1) You mentioned that env1 and env2 are different, so why do you expect to see the same results? Variable-step solvers can produce different numerical results if the stiffness of the underlying equations being integrated changes. Even if env1 and env2 model the same system, the more complex version is likely to be stiffer, which can change the simulation results.

2) Which agent are you using? Some agents are stochastic by nature, so unless you fix the random seed, you will see different results from run to run.
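
To make point 2 concrete, here is a toy sketch in plain Python. It is not Reinforcement Learning Toolbox code: the plant, the Gaussian policy, and the `rollout` function are all made up for illustration. It only shows that a fully deterministic plant still gives a different trajectory on every run when the policy samples its actions, and that fixing the seed makes the runs repeat. In MATLAB, the analogous step would presumably be calling `rng` with a fixed seed before each simulation.

```python
import random


def rollout(seed=None, steps=5):
    """Run a deterministic 1-D plant driven by a stochastic policy.

    Everything here is hypothetical: the plant, the policy, and the gains.
    """
    rng = random.Random(seed)  # seed=None takes entropy from the OS
    x = 1.0
    trajectory = [x]
    for _ in range(steps):
        # Stochastic policy: sample the action around a mean of -0.5 * x.
        action = rng.gauss(-0.5 * x, 0.1)
        # Deterministic plant update with a 0.1 s sample time.
        x = x + 0.1 * action
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    # Same seed: the two trajectories are identical.
    print(rollout(seed=0) == rollout(seed=0))
    # No seed: the two trajectories will almost surely differ.
    print(rollout() == rollout())
```

If the trajectories in env 2 become repeatable once the seed is fixed, the agent's action sampling is the likely source of the variation rather than the solver.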