Create custom policy function for a RL DQN.

Question

Yiyang Zhou on 20 Nov 2019

Direct link to this question

https://it.mathworks.com/matlabcentral/answers/492237-create-custom-policy-function-for-a-rl-dqn

Answered: Anh Tran on 27 Mar 2020

Hi Community,

I am working on a project that requires a small modification of the DQN policy. The learned function is still Q, but instead of simply taking the argmax over a of Q(s,a), I want to add a few more conditions (most likely some if statements acting as hard constraints). Is it possible to make this change? If so, where should I start?

Best regards,

Yiyang


Answer 1

Anh Tran on 27 Mar 2020

Direct link to this answer

https://it.mathworks.com/matlabcentral/answers/492237-create-custom-policy-function-for-a-rl-dqn#answer_422402


Currently I do not see any way to modify the DQN policy directly with the built-in rlDQNAgent. A possible workaround is to reimplement the DQN agent with rlQValueRepresentation, introduced in MATLAB R2020a.

You can refer to the RL custom training loop example, where we implement vanilla policy gradient with RL Toolbox.
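If you go the custom-loop route, one piece you would have to write yourself is the critic's learning target. The answer does not spell this out, so the following is only a minimal sketch of the standard DQN target in plain MATLAB. All variable names and numbers are made up for illustration, and nextQ stands in for whatever your target critic returns for the next observations.

```matlab
% Hypothetical mini-batch of 3 transitions with 2 discrete actions
gamma  = 0.99;                   % discount factor (assumed value)
reward = [1; 0; -1];             % reward of each transition
isDone = [false; false; true];   % true where the episode ended
nextQ  = [0.5 2.0;               % target-critic values Q(s',a), one row per transition
          1.0 0.2;
          3.0 4.0];

% Standard DQN target: y = r + gamma * max_a Q(s',a),
% with no bootstrapping from terminal transitions
target = reward + gamma * (~isDone) .* max(nextQ, [], 2);
```

If your hard constraints also apply at the next state, you may want to mask the disallowed columns of nextQ before taking the max. That is a design choice, not something the built-in agent or the answer above prescribes.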

For discrete actions, I would recommend the multi-output Q-value representation Q(o), which returns the values of all actions in one evaluation (better performance than Q(o,a)).

% Create the Q(o) critic, assuming NeuralNet, ObservationInfo, ActionInfo and ObsLayerName are already defined
Critic = rlQValueRepresentation(NeuralNet,ObservationInfo,ActionInfo,'Observation',ObsLayerName);
% Get the state-action values for an observation RandomObservation (one value per discrete action)
Q = getValue(Critic,RandomObservation)
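Once you have the Q values, the constrained policy asked about in the question could look roughly like the sketch below: disallowed actions are masked with -Inf before the argmax, and exploration samples only from allowed actions. This is an illustration in plain MATLAB, not toolbox code. The names qValues, isAllowed and epsilon are hypothetical, and the example numbers stand in for the output of getValue and for your own if-statement constraints.

```matlab
% Hypothetical inputs: replace qValues with the output of getValue(Critic, obs)
% and build isAllowed from your own hard constraints (if statements)
qValues   = [1.0 5.0 3.0];        % one Q value per discrete action
isAllowed = [true false true];    % action 2 violates a constraint
epsilon   = 0;                    % exploration probability (0 = purely greedy)

allowedIdx = find(isAllowed);
assert(~isempty(allowedIdx), 'No action satisfies the constraints.');

if rand < epsilon
    % Explore: sample uniformly among the allowed actions only
    actionIdx = allowedIdx(randi(numel(allowedIdx)));
else
    % Exploit: mask disallowed actions with -Inf, then take the argmax
    masked = qValues;
    masked(~isAllowed) = -Inf;
    [~, actionIdx] = max(masked);
end
% actionIdx is an index into the action set, e.g. ActionInfo.Elements(actionIdx),
% assuming the critic outputs follow the order of the elements in ActionInfo
```

With the values above the unconstrained argmax would be action 2, but the mask makes the greedy choice action 3.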
