Is it possible to change RL action values under certain conditions?
I want my agent to output a target value, but in certain situations (when the reward drops dramatically) I want the agent to look for a better solution by letting it change the target value. I tried using an Initial Condition block so that the target value is used at first. However, my agent (PPO) always outputs an average value after some training episodes.
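One possible reading of the behavior described above, as a minimal Python sketch rather than the poster's actual Simulink model: the fixed target is passed through unless the reward has dropped sharply, in which case the agent's own action is used. The function name `select_action`, its parameters, and the threshold value are all hypothetical and chosen only for illustration.

```python
def select_action(agent_action, target_value, reward, prev_reward, drop_threshold=0.5):
    """Hypothetical gating logic: use the fixed target unless the reward dropped sharply.

    All names and the default threshold are assumptions, not part of any toolbox API.
    """
    reward_drop = prev_reward - reward  # positive when the reward got worse
    if reward_drop > drop_threshold:
        # Reward fell dramatically: let the agent's own action through.
        return agent_action
    # Otherwise keep applying the predefined target value.
    return target_value


# Small reward change: the fixed target is kept.
steady = select_action(agent_action=3.0, target_value=6.0, reward=1.0, prev_reward=1.1)
# Large reward drop: the agent's action is used instead.
dropped = select_action(agent_action=3.0, target_value=6.0, reward=0.0, prev_reward=1.0)
```

Note that in this sketch the agent's output has no effect on the environment whenever the fixed target is applied, which may matter for how the agent learns.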
5 Comments
Emmanouil Tzorakoleftherakis
on 18 May 2021
Can you provide some more information? What do you mean by letting the agent change the target value? Isn't that what happens by default every time the agent takes an action? What is the environment architecture?
black_cat
on 18 May 2021
Emmanouil Tzorakoleftherakis
on 19 May 2021
Thanks. It's still not clear to me what you mean by "However, this results in having an output of 3 since the agent is averaging it during training". If it's best to output a 6, the agent should do so; why would it average the output? Unless you are talking about the average episode reward that you see in the Episode Manager?
Answers (0)