Why do the DDPG episode rewards never change during the whole training process?
Guoge Tan
on 25 May 2020
Commented: Shahriar
on 29 Jun 2022
I'm training a DDPG agent with the Reinforcement Learning Toolbox in MATLAB R2020a for a path planning problem. As you can see, the episode reward and the average reward never change over 5000 episodes. I used a simple neural network with 20 neurons and three layers, a learning rate of 0.01, and a gradient threshold of 1. I then tried setting the weights and biases of the fully connected layers and changing my reward function, but the result was the same.
![](https://www.mathworks.com/matlabcentral/answers/uploaded_files/300253/image.png)
Accepted Answer
Emmanouil Tzorakoleftherakis
on 26 May 2020
It looks like the scales of Q0 and the episode reward are very different. If Q0 is much larger, it can stretch the shared y-axis so that the reward curve looks flat even when it is changing. Try unchecking "Show Episode Q0" to see if the episode reward changes. I would then simplify the critic network to make sure it outputs values on a scale similar to the episode reward.
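To make the scale argument concrete, here is a minimal, language-agnostic sketch (written in Python, not with the toolbox) of how little of a shared y-axis a reward curve occupies when Q0 is orders of magnitude larger. The function name `visible_fraction` and all the numbers are hypothetical and chosen only for illustration; they are not taken from the training run in the question.

```python
def visible_fraction(episode_rewards, episode_q0):
    """Return the share of a common y-axis that the reward curve spans.

    Both series are assumed to be plotted on the same axis, whose limits
    are the overall minimum and maximum of the two series combined.
    """
    all_values = list(episode_rewards) + list(episode_q0)
    axis_span = max(all_values) - min(all_values)
    reward_span = max(episode_rewards) - min(episode_rewards)
    if axis_span == 0:
        return 0.0  # every value is identical, so nothing varies at all
    return reward_span / axis_span


# Hypothetical values: the reward improves steadily, but the critic's
# initial-state estimate (Q0) lives on a far larger scale.
episode_rewards = [-50.0, -42.0, -35.0, -20.0, -10.0]
episode_q0 = [1.0e5, 2.5e5, 4.0e5, 3.0e5, 5.0e5]

fraction = visible_fraction(episode_rewards, episode_q0)
print(f"Reward curve spans {fraction:.4%} of the shared axis")
```

With numbers like these the reward curve covers far less than 0.1% of the axis, so it looks like a flat line. Plotting the episode reward on its own, which is what unchecking "Show Episode Q0" does, should show whether it is actually changing.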