Parallel training of rlDQNAgents with parfor fails for large numbers of agents

Dear Community,
I have a problem with the parallel training of RL agents.
Description:
I initialize, for example, a 1x100 array of rlDQNAgent objects named agenttrain, each with different parameter settings. They are all trained with the same training options in the same environment. A condensed version of the parallel training looks like this:
agentoutput = agenttrain;
parfor i = 1:100
out(i) = train(agenttrain(i),env,trainingOptions);
agentoutput(i) = agenttrain(i);
end
I assign agentoutput inside the parfor loop to capture the updated network of every rlDQNAgent. Running this on, for example, 60 parallel workers causes no problem. If I increase the number of agents (from 100 to 1000), I get the following error message:
During array expansion:
No default is defined for class 'rl.agent.rlDQNAgent'.
Method 'getDefaultScalarElement' in superclass rl.policy.AbstractPolicy is missing or
incorrectly defined.
Do you have any idea why this error only occurs when the number of agents is higher?

Accepted Answer

Florian Rosner on 6 Aug 2021
Following a support request, I was able to work around this issue.
Collecting the results in cell arrays makes the parfor loop work:
parfor i = 1:numag
c1{i} = train(agenttrain(i),env,trainingOptions);
c2{i} = agenttrain(i);
end
out = [c1{:}];
agentoutput = [c2{:}];
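For anyone who wants to try the collection pattern without Reinforcement Learning Toolbox, here is a minimal sketch. It is not the original training code: the struct built inside the loop is a hypothetical stand-in for the result of train(agenttrain(i),env,trainingOptions), and the field names AgentIndex and EpisodeReward are made up for illustration. My reading of the error message ("During array expansion") is that indexed assignment into an object array can require a default element of the class, whereas a cell array is simply padded with empty cells, so storing each result in its own cell avoids that step. I don't know why the error only appears at the larger agent count.

```matlab
numag = 8;              % number of agents; 1000 in the failing case above
c1 = cell(1, numag);    % one cell per iteration, filled by the workers

parfor i = 1:numag
    % Hypothetical stand-in for train(agenttrain(i), env, trainingOptions).
    % Each iteration writes only its own cell (a sliced output variable).
    c1{i} = struct('AgentIndex', i, 'EpisodeReward', 10 * i);
end

% Concatenate on the client after all iterations have finished.
out = [c1{:}];
```

Without Parallel Computing Toolbox the parfor loop simply runs serially, so the sketch can be checked on any installation. The same structure applies to the second cell array c2 that carries the trained agents back from the workers.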

Release

R2021a
