Running parfor on SLURM limits cores to 1

Hello, I'm trying to run parallelized code (using parfor) on a university high-performance cluster. To check that parallelization works correctly, I requested a single node with 32 cores via "srun --pty -t 00:30:00 -n 32 -N 1 /bin/bash -l", which I verified does start an interactive session with 32 cores assigned. I can then start MATLAB from the command line as normal:
matlab -nodisplay -nosplash
But when I try to initialize the parallel pool, I see:
>>poolobj = parpool(32);
Starting parallel pool (parpool) using the 'local' profile ...
Error using parpool (line 151)
You requested a minimum of 32 workers, but the cluster "local" has the
NumWorkers property set to allow a maximum of 1 workers. To run a communicating
job on more workers than this (up to a maximum of 512 for the Local cluster), increase the value of
the NumWorkers property for the cluster. The default value of NumWorkers for a Local cluster is
the number of physical cores on the local machine.
Checking the number of cores shows only a single core has been assigned.
>>feature('numcores')
MATLAB detected: 32 physical cores.
MATLAB detected: 32 logical cores.
MATLAB was assigned: 1 logical cores by the OS.
MATLAB is using: 1 logical cores.
MATLAB is not using all logical cores because Operating System restricted the number of cores to: 1.
I reached out to the cluster administrator, who suggested the problem was on MATLAB's side, perhaps requiring a configuration change to allow more than 1 core to be used. I am not sure how to do that; for instance, I see no way to edit the parallel settings from the command line. I'm not familiar with running parallel MATLAB on non-local resources, so I would appreciate any insight into how to resolve this, or whether there is a better way to set up and submit such jobs.
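The gap between cores detected and cores assigned can also be checked from the shell before MATLAB is started. The following is a minimal sketch, assuming GNU coreutils' nproc is available on the compute node; cpu_report is a hypothetical helper name, not a Slurm or MATLAB command. nproc reports the processors available to the current process, while nproc --all reports the processors installed on the node.

```shell
#!/usr/bin/env bash
# cpu_report is a hypothetical helper (an assumption, not part of Slurm or MATLAB).
# It prints how many CPUs this shell may use next to how many the node has.
# SLURM_CPUS_PER_TASK is only set by Slurm when -c/--cpus-per-task was requested.
cpu_report() {
    echo "allowed=$(nproc) total=$(nproc --all) slurm_cpus_per_task=${SLURM_CPUS_PER_TASK:-unset}"
}

cpu_report
```

If the allowed count is far below the total inside the interactive session, the restriction is probably coming from how the resources were requested and not from MATLAB's own settings.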

Accepted Answer

Raymond Norris on 22 Aug 2022
Try changing
-n 32
for
-c 32
-n sets the number of tasks, each with 1 CPU by default. It's possible that cgroups is telling MATLAB it only has 1 CPU. -c sets the number of CPUs per task. Setting it to 32 might tell MATLAB it has been assigned 32 cores, allowing a local pool of 32 workers.
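As a rough sketch of that change, the session could be requested and the pool sized as below. pool_size and start_matlab_pool are hypothetical helper names introduced only for illustration, and the fallback to nproc is an assumption for sessions where Slurm did not set SLURM_CPUS_PER_TASK. Whether MATLAB then reports 32 assigned cores depends on how the cluster is configured.

```shell
#!/usr/bin/env bash
# Sketch only: pool_size and start_matlab_pool are hypothetical helpers.

# Step 1 (on the login node): the original request with -n 32 swapped for -c 32,
# i.e. one task that owns 32 CPUs instead of 32 tasks with 1 CPU each:
#   srun --pty -t 00:30:00 -c 32 -N 1 /bin/bash -l

# Step 2 (inside the session): choose the worker count from what was granted.
# SLURM_CPUS_PER_TASK is set when -c was used; otherwise fall back to nproc.
pool_size() {
    echo "${SLURM_CPUS_PER_TASK:-$(nproc)}"
}

# Step 3: start MATLAB as in the question and open a pool of that size.
start_matlab_pool() {
    matlab -nodisplay -nosplash -r "poolobj = parpool($(pool_size));"
}

echo "Pool size that would be requested: $(pool_size)"
```

Calling start_matlab_pool inside the new session should open MATLAB with a pool matching the allocation, and running feature('numcores') there shows whether the OS now assigns more than 1 logical core.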
