Why does TreeBagger in Matlab 2014a/b only use few workers from a parallel pool?

Question

Dylan Muir on 5 Dec 2014

https://it.mathworks.com/matlabcentral/answers/165573-why-does-treebagger-in-matlab-2014a-b-only-use-few-workers-from-a-parallel-pool

Commented: Ilya on 9 Dec 2014

I'm using the TreeBagger class provided by Matlab (R2014a and R2014b), in conjunction with the distributed computing toolbox. I have a local cluster running with 30 workers, on a Windows 7 machine with 40 cores.

I call the TreeBagger constructor to generate a regression forest (an ensemble containing 32 trees), passing an options structure with 'UseParallel' set to 'always'.
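For readers who want to reproduce the setup, the call described above might look roughly like the sketch below. The data (X, Y) are a made-up stand-in, not the asker's data, and the logical value true is used for 'UseParallel' on the assumption that the installed release accepts it; the question itself used the older string 'always'.

```matlab
% Hypothetical stand-in data; replace with your own predictors and response.
rng(1);                                   % reproducible example data
X = rand(2000, 10);                       % 2000 observations, 10 predictors
Y = sin(2*pi*X(:,1)) + 0.1*randn(2000, 1);

% Request parallel tree growing (the question passed 'always' instead of true).
opts = statset('UseParallel', true);

% Grow a regression forest of 32 trees, as in the question.
b = TreeBagger(32, X, Y, 'Method', 'regression', 'Options', opts);
```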

However, TreeBagger seems to only make use of 8 or so workers, out of the 30 available (judging by CPU usage per process, observed using the Task Manager). When I try to test the pool with a simple parfor loop:

 parfor i=1:30
    a = fft(rand(20000));
 end

Then all 30 workers are engaged.

My question is: (How) can I force TreeBagger to use all available resources?


Answer 1

Ilya on 5 Dec 2014

https://it.mathworks.com/matlabcentral/answers/165573-why-does-treebagger-in-matlab-2014a-b-only-use-few-workers-from-a-parallel-pool#answer_161408

TreeBagger does not limit the number of used cores in any way. Everything is set by your parpool configuration.
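One way to confirm what the pool configuration actually provides is to query it before training. This is only a sketch: it assumes the default cluster profile is the one in use, which may not match the asker's setup.

```matlab
% Get the current pool without starting one; open a pool on the
% default profile if none exists (assumption: default profile is intended).
pool = gcp('nocreate');
if isempty(pool)
    pool = parpool;
end
fprintf('Pool is running %d workers\n', pool.NumWorkers);

% The cluster profile caps how many workers a pool can have.
c = parcluster;
fprintf('Default profile allows up to %d workers\n', c.NumWorkers);
```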

The answer may be in the data you pass to TreeBagger. Make sure all trees in the returned TreeBagger object are deep (which means training did take place). If it takes little time to grow these 32 trees, increase the number of trees and see if the load changes.
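The two checks suggested here could be scripted along the following lines. The data are a hypothetical stand-in, the second ensemble size (256) is an arbitrary choice, and the NumNodes property name is an assumption about the tree objects stored in the Trees cell array; older releases may expose the node count under a different name.

```matlab
% Hypothetical stand-in data; replace with your own.
rng(1);
X = rand(2000, 10);
Y = sin(2*pi*X(:,1)) + 0.1*randn(2000, 1);
opts = statset('UseParallel', true);

% Train a small and a larger ensemble, timing each, and report tree sizes.
for nTrees = [32 256]
    tStart = tic;
    b = TreeBagger(nTrees, X, Y, 'Method', 'regression', 'Options', opts);
    elapsed = toc(tStart);

    % Node count per tree; deep trees have many nodes, a stump has one.
    nodesPerTree = cellfun(@(t) t.NumNodes, b.Trees);
    fprintf('%4d trees: %.1f s, median %g nodes per tree\n', ...
            nTrees, elapsed, median(nodesPerTree));
end
```

If the 32-tree run finishes in a second or two, the workers may simply not be busy long enough for the load to show up in Task Manager.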

Comments

Dylan Muir on 9 Dec 2014

As far as I can tell, training has taken place: each tree contains a long list of conditions.

There seems to be a performance issue with running TreeBagger on a parallel pool. TreeBagger internally uses "internal.stats.parallel.smartForSliceout" to automatically run a nested function "TreeBagger>localGrowTrees>loopbody". If I modify the TreeBagger code to call parfor directly, while incorporating the lines from "internal.stats.parallel.smartForSliceout" and from "TreeBagger>localGrowTrees>loopbody", then the speed of the training step doubles with the same parallel configuration.
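The modified TreeBagger code is not shown in this thread. For readers who want a direct parfor without editing toolbox internals, a different technique is sketched below: grow several small ensembles inside parfor and merge them with compact and combine. The data, chunk count, and chunk size are all hypothetical, and this is not the modification described above. By default each worker draws from its own random stream, so the chunks should differ, but the result is not reproducible from a single seed.

```matlab
% Hypothetical stand-in data; replace with your own.
rng(1);
X = rand(2000, 10);
Y = sin(2*pi*X(:,1)) + 0.1*randn(2000, 1);

nChunks       = 8;    % number of parfor iterations (hypothetical choice)
treesPerChunk = 4;    % 8 x 4 = 32 trees in total

% Each iteration grows a small serial ensemble and keeps its compact form.
parts = cell(nChunks, 1);
parfor k = 1:nChunks
    bk = TreeBagger(treesPerChunk, X, Y, 'Method', 'regression');
    parts{k} = compact(bk);
end

% Merge the compact ensembles into one 32-tree ensemble.
ens = parts{1};
for k = 2:nChunks
    ens = combine(ens, parts{k});
end

% Predict with the merged ensemble.
yHat = predict(ens, X(1:5, :));
```

The merged object is a CompactTreeBagger, so it can predict but does not carry out-of-bag information from training.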

Ilya on 9 Dec 2014

Any help I could provide from this point on would depend on various technical details, such as the size of your data relative to the memory on the head node, the size of the trees, and the exact parpool configuration, to name a few. If you are content with this solution, use it. Otherwise, please get in touch with MathWorks tech support and work with them to put together reproducible steps.

Keep in mind that a speed-up or slow-down you observe for one dataset does not necessarily hold for a different dataset. The data size and the average size of grown trees would be factors. It's possible your data are fairly small and so dispatching to smartForSliceout gives a noticeable overhead. But I don't want to hypothesize too much.
