SARSA Reinforcement Learning

Maze solving using SARSA, Reinforcement Learning
Updated 24 May 2017

Refer to Section 6.4 (Sarsa: On-Policy TD Control) of Reinforcement Learning: An Introduction, R. S. Sutton and A. G. Barto, MIT Press.
In this demo, two different mazes are solved using the reinforcement learning technique SARSA.
State-Action-Reward-State-Action (SARSA) is an algorithm for learning a policy for a Markov decision process, used in reinforcement learning.
SARSA update of the action-value function:

Q(S_t, A_t) := Q(S_t, A_t) + α [ R_{t+1} + γ Q(S_{t+1}, A_{t+1}) − Q(S_t, A_t) ]
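
As a minimal sketch of this update, written in Python rather than the MATLAB used by the demo: the dictionary-based Q table, the function name `sarsa_update`, the example states and the default values of `alpha` and `gamma` are all assumptions made for illustration and are not taken from the submission's code.

```python
from collections import defaultdict

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.5, gamma=0.9):
    """Apply one SARSA update to Q in place and return the TD error."""
    # Target uses the action actually chosen in the next state (on-policy).
    td_target = r + gamma * Q[(s_next, a_next)]
    td_error = td_target - Q[(s, a)]
    Q[(s, a)] += alpha * td_error
    return td_error

# Hypothetical transition: from cell (0, 0) move "right" into cell (0, 1),
# receive reward -1, and plan to move "right" again.
Q = defaultdict(float)            # unseen state-action pairs start at 0
Q[((0, 1), "right")] = 2.0        # assumed current estimate for the next pair
err = sarsa_update(Q, (0, 0), "right", -1.0, (0, 1), "right")
print(err, Q[((0, 0), "right")])  # TD error near 0.8, new estimate near 0.4
```

Here the target is -1 + 0.9 * 2 = 0.8, and with α = 0.5 the estimate for the starting pair moves halfway from 0 toward that target.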

Learning rate (α)
The learning rate determines to what extent newly acquired information overrides old information. A factor of 0 makes the agent learn nothing, while a factor of 1 makes the agent consider only the most recent information.
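
The two extremes can be checked numerically. The sketch below is in Python, and the helper name `blend` and the sample values are assumptions for illustration. It applies the update in the form old + α (target − old) for three learning rates.

```python
def blend(old_q, td_target, alpha):
    """Return the updated estimate: old_q + alpha * (td_target - old_q)."""
    return old_q + alpha * (td_target - old_q)

old_q, td_target = 2.0, 10.0                    # assumed sample values
unchanged = blend(old_q, td_target, alpha=0.0)  # nothing is learned
replaced = blend(old_q, td_target, alpha=1.0)   # old estimate is discarded
halfway = blend(old_q, td_target, alpha=0.5)    # midpoint of old and new
print(unchanged, replaced, halfway)
```

With α = 0 the estimate stays at 2.0, with α = 1 it jumps to the target 10.0, and with α = 0.5 it lands at 6.0.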

Discount factor (γ)
The discount factor determines the importance of future rewards. A factor of 0 makes the agent "opportunistic", considering only current rewards, while a factor approaching 1 makes it strive for a long-term high reward. If the discount factor is 1 or greater, the Q values may diverge.
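
To make the role of γ concrete, the following Python sketch computes the discounted sum of a reward sequence. The function name `discounted_return` and the reward sequences are assumptions for illustration only.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards[k] * gamma**k, accumulated from the last reward backwards."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total

# A reward of 10 arrives only after three unrewarded steps.
delayed = [0.0, 0.0, 0.0, 10.0]
myopic = discounted_return(delayed, 0.0)      # the delayed reward is ignored
farsighted = discounted_return(delayed, 0.9)  # 10 * 0.9**3, about 7.29

# With a constant reward of 1 per step, the sum stays below 1 / (1 - gamma)
# for gamma < 1, but grows with the horizon when gamma equals 1.
bounded = discounted_return([1.0] * 1000, 0.9)
unbounded = discounted_return([1.0] * 1000, 1.0)
print(myopic, farsighted, bounded, unbounded)
```

The last two values illustrate why γ ≥ 1 can cause trouble: with γ = 0.9 the sum stays under 10 however long the sequence, whereas with γ = 1 it equals the number of steps and keeps growing.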

Note: Convergence was tested only on particular examples; in general, convergence is not guaranteed for this demo.
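
One way to probe convergence empirically on a single maze is to track episode lengths during training. The Python sketch below is not the author's MATLAB code and does not use the demo's mazes. The 4x4 layout, the rewards (-1 per step, +10 at the goal), the epsilon-greedy policy and all hyperparameters are assumptions chosen for illustration.

```python
import random

# Hypothetical 4x4 maze: 'S' start, 'G' goal, '#' wall, '.' free cell.
MAZE = ["S..#",
        ".#..",
        "...#",
        "#..G"]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, action):
    """Move one cell; hitting a wall or the border leaves the state unchanged."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < len(MAZE) and 0 <= c < len(MAZE[0])) or MAZE[r][c] == "#":
        r, c = state
    done = MAZE[r][c] == "G"
    return (r, c), (10.0 if done else -1.0), done

def choose(Q, state, eps, rng):
    """Epsilon-greedy action index; ties among greedy actions break at random."""
    if rng.random() < eps:
        return rng.randrange(len(ACTIONS))
    values = [Q.get((state, a), 0.0) for a in range(len(ACTIONS))]
    best = max(values)
    return rng.choice([a for a, v in enumerate(values) if v == best])

def train(episodes=300, alpha=0.5, gamma=0.9, eps=0.1, max_steps=500, seed=0):
    """Run tabular SARSA and return the Q table and the length of each episode."""
    rng = random.Random(seed)
    Q, lengths = {}, []
    for _ in range(episodes):
        s = (0, 0)
        a = choose(Q, s, eps, rng)
        for t in range(1, max_steps + 1):
            s2, reward, done = step(s, ACTIONS[a])
            a2 = choose(Q, s2, eps, rng)
            # The terminal state has value 0, so no bootstrap term is added there.
            target = reward + (0.0 if done else gamma * Q.get((s2, a2), 0.0))
            old = Q.get((s, a), 0.0)
            Q[(s, a)] = old + alpha * (target - old)
            s, a = s2, a2
            if done:
                break
        lengths.append(t)
    return Q, lengths

Q, lengths = train()
print("mean steps, first 10 episodes:", sum(lengths[:10]) / 10)
print("mean steps, last 10 episodes:", sum(lengths[-10:]) / 10)
```

Shorter episodes late in training suggest that the policy is improving on this particular maze, but they do not prove convergence in general. The shortest possible path in this layout takes 6 steps.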

Cite As

Bhartendu (2024). SARSA Reinforcement Learning (https://www.mathworks.com/matlabcentral/fileexchange/63089-sarsa-reinforcement-learning), MATLAB Central File Exchange.

MATLAB Release Compatibility
Created with R2016a
Compatible with any release
Platform Compatibility
Windows macOS Linux
Version Published Release Notes
1.0.0.0