Easy way to evaluate / compare the performance of RL algorithm

Question

Saurav Sthapit on 29 Jul 2020

https://it.mathworks.com/matlabcentral/answers/572359-easy-way-to-evaluate-compare-the-performance-of-rl-algorithm

Edited: Saurav Sthapit on 6 Aug 2020

I have a trained RL agent and would like to compare its performance with that of a dumb agent. I can run simout=sim(env,agent,simOpts) to evaluate the actual agent. However, I would like to compare the simulation results with a couple of dumb agents that always take the same action or a random action. Is there an easy way to do this?

Currently, I have a separate Simulink model without the RL Agent block (replaced with a Constant block), and I log the observations and rewards using the Simulation Data Inspector.

Thanks

Saurav

Answer 1

Emmanouil Tzorakoleftherakis on 3 Aug 2020

https://it.mathworks.com/matlabcentral/answers/572359-easy-way-to-evaluate-compare-the-performance-of-rl-algorithm#answer_474718

Why not use a MATLAB Fcn block and implement the dummy agent in there? If you want random or constant actions, it should be just one line.
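As a minimal sketch of that suggestion, the body of such a block could look like the function below. It assumes the answer means a MATLAB Function block wired in place of the RL Agent block, and that the action is a scalar bounded in [-1, 1]; the name dummyAgent, the useRandom switch, and the action range are illustrative assumptions, not part of the original model.

```matlab
function action = dummyAgent(obs) %#codegen
% Baseline "agent": ignores the observation entirely.
% Assumption: the action is a scalar in [-1, 1]; adjust to your action spec.
useRandom = true;            % set to false for a constant action
if useRandom
    action = 2*rand - 1;     % uniform random draw in [-1, 1]
else
    action = 0;              % fixed action
end
end
```

The observation input is kept only so the block has the same interface as the RL Agent block it replaces.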

Saurav Sthapit on 6 Aug 2020

Edited: Saurav Sthapit on 6 Aug 2020

Thanks, that's an excellent suggestion for evaluating random actions.

However, when I do that (or use Constant blocks), I have to run the two statements below: the first to evaluate the random/dumb action and the second to evaluate the agent.

logsout=sim(mdl)

simout=sim(env,agent,simOpts)

logsout and simout are not directly comparable, but logsout is a field in the simout.SimulationInfo struct.

I am wondering whether this is the best approach or whether there is an easier way to do this.
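One way to put the two runs on a common footing is to reduce each to its reward signal and compare episode returns. The helper below is only a sketch under that assumption: compareReturns is a hypothetical name, and it expects both reward signals as timeseries objects.

```matlab
function stats = compareReturns(rewardAgent, rewardBaseline)
% Summarize two episodes given their reward signals as timeseries objects.
stats.agentReturn    = sum(rewardAgent.Data(:));     % total reward, trained agent
stats.baselineReturn = sum(rewardBaseline.Data(:));  % total reward, dumb agent
stats.improvement    = stats.agentReturn - stats.baselineReturn;
end
```

For the agent run, simout.Reward is already a timeseries. For the baseline run, assuming sim(mdl) returns a Simulink.SimulationOutput and the reward is logged under the (hypothetical) signal name "reward", the second argument could be obtained with something like logsout.logsout.getElement("reward").Values.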

Also, simout contains the action, observation, and reward, but if the reward is a weighted sum of multiple rewards, I can't access the individual rewards. (Of course, I can compare logsout with simout.SimulationInfo.logsout.)
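If each reward term is logged as its own signal in the model, the individual contributions can be recovered from the logged data in both runs. The sketch below assumes the terms have already been pulled out of logsout into a struct of timeseries; componentReturns, the field names, and the weights are all hypothetical.

```matlab
function [perComponent, weightedTotal] = componentReturns(components, weights)
% components: struct whose fields are timeseries of individual reward terms
% weights:    struct with the same field names holding scalar weights
names = fieldnames(components);
perComponent = struct();
weightedTotal = 0;
for k = 1:numel(names)
    name = names{k};
    % Episode total of this reward term
    perComponent.(name) = sum(components.(name).Data(:));
    % Accumulate the weighted sum, which should match the total episode reward
    weightedTotal = weightedTotal + weights.(name) * perComponent.(name);
end
end
```

Comparing weightedTotal against the sum of simout.Reward.Data is a quick check that the logged terms and weights really reproduce the reward the agent saw.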
