Agent keeps selecting the same Action during reinforcement learning simulation

Asked: 1 Jul 2022


Updated: 11 Jul 2022


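I am currently doing reinforcement learning with MATLAB's Reinforcement Learning Toolbox. (I am using a DQN agent.)

During training, the agent takes a variety of actions from the configured Action set. In simulation, however, it keeps choosing whichever Action it selected first until the end of the simulation, so the training result is not reflected properly.

I also suspected that the training result was so skewed that the agent only ever selected one particular Action in simulation, so I ran training with MaxEpisodes set to 1. Even then, the simulation keeps choosing the first selected Action to the end.

How can I get the agent to choose among the Actions during simulation as well?

The settings in my program are shown below.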

    ActionInfo = rlFiniteSetSpec([3 4 5 6]);
    StepHandle = @(Action,LoggedSignals)RLStepFunction(Action,LoggedSignals,RL);
    agent = rlDQNAgent(ObservationInfo, ActionInfo);

    MaxEpisodes = 1;
    opt = rlTrainingOptions("MaxEpisodes",MaxEpisodes,...
        "MaxStepsPerEpisode",length(NumberOfSteps),...
        "StopTrainingCriteria","AverageReward",...
        "StopTrainingValue",100000,...
        "ScoreAveragingWindowLength",50,...
        'SaveAgentCriteria','EpisodeCount',...
        'SaveAgentValue',MaxEpisodes);

    % Note: in this excerpt, agentOpts is created after the agent and is
    % never passed to rlDQNAgent or assigned to the agent.
    agentOpts = rlDQNAgentOptions;
    agentOpts.SaveExperienceBufferWithAgent = true;
    agentOpts.ResetExperienceBufferBeforeTraining = false;
    agentOpts.NumStepsToLookAhead = 1;

    %doTraining = true;
    doTraining = 0;
    if doTraining == true
        % Train the agent.
        trainingStats = train(agent,env,opt);
    else
        % Note: this loads saved_agent, but sim below is called with agent.
        load('savedAgents/Agent1.mat','saved_agent')
        simOpts = rlSimulationOptions('MaxSteps',200);%,"NumSimulations",1);
        experience = sim(env,agent,simOpts);
    end

The part of my step function that handles Action is shown below.

    switch Action
        case 3
            p = 3;
        case 4
            p = 4;
        case 5
            p = 5;
        case 6
            p = 6;
    end
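For readers following along, here is a minimal sketch of how such a step function might look as a whole. It is not the asker's RLStepFunction, which is not shown in the thread. The name exampleStep, the State field, the dynamics, and the reward are all hypothetical. Only the four-output form with (Action, LoggedSignals) inputs follows the custom step function convention that the StepHandle above suggests. The point of the sketch is that the observation returned at each step is derived from the updated state, so the agent sees something new after every Action.

```matlab
% Script part: call the hypothetical step function twice.
LoggedSignals.State = 0;
[obs1,~,~,LoggedSignals] = exampleStep(4,LoggedSignals);
[obs2,~,~,LoggedSignals] = exampleStep(6,LoggedSignals);
disp([obs1 obs2])   % displays 4 10

function [NextObs,Reward,IsDone,LoggedSignals] = exampleStep(Action,LoggedSignals)
    % Reject anything outside the finite action set from the question.
    validActions = [3 4 5 6];
    assert(ismember(Action,validActions),'Unexpected action: %g',Action);
    p = Action;   % same effect as the switch in the question

    % Hypothetical dynamics: the state accumulates p.
    LoggedSignals.State = LoggedSignals.State + p;

    % Return the UPDATED state as the observation, so that it changes
    % from step to step.
    NextObs = LoggedSignals.State;
    Reward  = -abs(20 - NextObs);   % made-up reward, for illustration only
    IsDone  = NextObs >= 20;        % made-up termination condition
end
```

Because the switch in the question maps each Action to itself, `p = Action` with a validity check behaves the same for valid inputs and fails loudly for unexpected ones.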

4 Comments (the 2 older comments are not shown)

Toshinobu Shintai on 4 Jul 2022

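In reinforcement learning, an observation is a value that the agent can obtain from the environment.

I'm sorry, but I can't tell what the problem is from the information so far, so it would help if you could attach your model or some figures and explain.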

MF on 11 Jul 2022

Edited: MF on 11 Jul 2022

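Thank you for the advice.

When I rechecked the part you pointed out, it turned out that the observation was not being recognized correctly. There was a mistake in the part of my program related to recognizing the observation, which is why it did not work.

The training result is now reflected in the simulation.

Thank you very much.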
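To illustrate why an observation problem can produce exactly this symptom, here is a small sketch in plain MATLAB (no toolbox calls). It assumes, as is typical for DQN, that training uses epsilon-greedy exploration while simulation picks the greedy action. The Q-table, the observation sequences, and the epsilon value are made up for illustration and have nothing to do with the asker's actual model.

```matlab
% Hypothetical Q-table: rows are observations, columns are the four
% actions [3 4 5 6] from the question. All values are made up.
actions = [3 4 5 6];
Q = [0.1 0.9 0.3 0.2;    % observation 1 -> best action is 4
     0.7 0.2 0.1 0.4;    % observation 2 -> best action is 3
     0.2 0.1 0.3 0.8];   % observation 3 -> best action is 6

numSteps  = 6;
stuckObs  = ones(1,numSteps);   % observation never updates
movingObs = [1 2 3 1 2 3];      % observation changes every step

% Greedy selection, as assumed for simulation.
stuckActions  = zeros(1,numSteps);
movingActions = zeros(1,numSteps);
for k = 1:numSteps
    [~,i] = max(Q(stuckObs(k),:));
    stuckActions(k) = actions(i);
    [~,j] = max(Q(movingObs(k),:));
    movingActions(k) = actions(j);
end
disp(stuckActions)    % 4 4 4 4 4 4
disp(movingActions)   % 4 3 6 4 3 6

% Epsilon-greedy selection, as assumed for training: random actions are
% mixed in even though the observation is stuck.
epsilon = 0.5;
exploreActions = zeros(1,numSteps);
for k = 1:numSteps
    if rand < epsilon
        exploreActions(k) = actions(randi(numel(actions)));
    else
        [~,i] = max(Q(stuckObs(k),:));
        exploreActions(k) = actions(i);
    end
end
disp(exploreActions)  % varies from run to run
```

With a stuck observation the greedy choice is the same at every step, while the exploring policy still looks varied. That would match the behavior described in the question: varied actions during training, a single repeated Action in simulation.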


Products: Reinforcement Learning Toolbox

Release: R2022a
