Which licenses get reserved, and when, with MATLAB Parallel Server?

I have the following scripts:
% TestScript.m
num_hellos = 140;
cluster = parcluster('CPUProfile');
job = createCommunicatingJob(cluster, 'Type', 'pool');
job.NumWorkersRange = [num_hellos num_hellos];
createTask(job, @Hello, 0, {}, 'CaptureDiary', true);
submit(job);

% Hello.m
function Hello()
    spmd
        fprintf('Hello from lab %d\n', labindex);
    end
end
Here "CPUProfile" is the profile for an HPC cluster with a PBS job scheduling system. The cluster profile is created with the Parallel Computing Toolbox plugin for MATLAB Parallel Server with PBS. The cluster itself has 160 MATLAB_Distrib_Comp_Engine seats and a limited number of MATLAB license seats. I have tested "TestScript.m" with the command 'matlab -batch "TestScript"' from a login node, and this code by itself schedules a job through PBS that runs 140 workers, which run my "Hello.m" function after PBS allocates me some nodes.
My question is: what is the exact sequence in which licenses are reserved, the job gets scheduled, and the job runs? From my testing, it SEEMS like the PBS job won't be scheduled until there are enough MATLAB_Distrib_Comp_Engine seats available; is that correct? While my matlab command is waiting for MATLAB_Distrib_Comp_Engine seats to become available, is it using up one of the main MATLAB licenses? What about while the job is scheduled through PBS, but before the nodes get allocated? What about after the nodes are allocated and the function Hello is running on them--the worker nodes "inherit" the client license when they're launched, so does that mean they start taking a main license seat at that point? If no seats are available when nodes are allocated, will the workers fail?

Accepted Answer

Raymond Norris on 17 Aug 2022
My question is: what is the exact sequence in which licenses are reserved, the job gets scheduled, and the job runs?
  • From my testing, it SEEMS like the PBS job won't be scheduled until there are enough MATLAB_Distrib_Comp_Engine seats available; is that correct?
We'd have to see your PBS job script, but most likely PBS doesn't know that you need any MATLAB_Distrib_Comp_Engine licenses and will run the job once there are enough resources (cores, memory, etc.).
  • While my matlab command is waiting for MATLAB_Distrib_Comp_Engine seats to become available, is it using up one of the main MATLAB licenses?
MATLAB submits the job to PBS via parpool (synchronous) or batch (asynchronous). If your code calls parpool, then yes, you'll continue to check out a MATLAB license. If you're using batch, MATLAB could finish, and therefore return its MATLAB license, before whatever batch submitted has finished.
  • What about while the job is scheduled through PBS, but before the nodes get allocated?
It's unclear which job you mean here: the one you submitted, or the one MATLAB submitted. When you submit the job, MATLAB will check out a license once PBS launches MATLAB. The MATLAB_Distrib_Comp_Engine license(s) will get checked out once the job that parpool/batch submits starts.
  • What about after the nodes are allocated and the function Hello is running on them--the worker nodes "inherit" the client license when they're launched, so does that mean they start taking a main license seat at that point?
The workers dynamically unlock the toolboxes that the user is entitled to (slightly different from checking out licenses in the traditional sense), based on the MATLAB license used to submit the job to begin with. The workers don't take any licenses (other than MATLAB_Distrib_Comp_Engine).
  • If no seats are available when nodes are allocated, will the workers fail?
Yes.
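To make the parpool/batch distinction above concrete, here is a minimal sketch. It assumes the 'CPUProfile' profile and the Hello function from the question; the worker counts are illustrative only, and this is not necessarily how the original poster's code should be written.

```matlab
% Assumption: 'CPUProfile' is the cluster profile from the question.
cluster = parcluster('CPUProfile');

% Synchronous: parpool blocks until the workers have started, and the
% client session has to stay open (holding its MATLAB license) for as
% long as the pool is in use.
pool = parpool(cluster, 140);
spmd
    fprintf('Hello from lab %d\n', labindex);
end
delete(pool);

% Asynchronous: batch returns as soon as the job has been handed to the
% scheduler. 'Pool', 139 requests 139 pool workers plus one more worker
% that runs Hello itself, so 140 workers in total.
job = batch(cluster, @Hello, 0, {}, 'Pool', 139, 'CaptureDiary', true);
```

With the batch form, the client session can end once the call returns, which is the case where the client's MATLAB license is handed back before the cluster job finishes.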
  2 Comments
Frank Moore-Clingenpeel on 17 Aug 2022
Edited: Frank Moore-Clingenpeel on 17 Aug 2022
There is no PBS job script. There are "Wrapper" shell scripts, provided with the plugin I linked in the OP, but I have not modified them for my installation. As far as I can tell, they all call mpiexec on <matlabroot>/bin/worker to launch workers. I don't think the main MATLAB executable is launched on any compute node, but I could be missing something in the scripts.
There are also no calls to batch or to parpool anywhere. All of the code I posted above will result in a PBS job being launched by MATLAB and run on the cluster. I will edit my post to make that clearer.
I never run PBS from the command line with this setup--as I understand from analyzing the code, MATLAB issues PBS commands so it can stand up a Parallel Server within a PBS job, via the wrapper shell scripts.
Raymond Norris on 17 Aug 2022
Got it. So you're running MATLAB on your desktop. Correct, in that case, there's no MATLAB process running on your HPC cluster, just worker processes. The way you've written your code (jobs/tasks) is akin to batch (actually, it's the other way around -- batch uses jobs/tasks).
  • From my testing, it SEEMS like the PBS job won't be scheduled until there are enough MATLAB_Distrib_Comp_Engine seats available; is that correct?
No. For all intents and purposes, PBS will run the job once there are enough resources (cores, memory, etc.).
  • While my matlab command is waiting for MATLAB_Distrib_Comp_Engine seats to become available, is it using up one of the main MATLAB licenses?
MATLAB isn't waiting for your job to run. After you call
submit(job)
you could quit MATLAB.
  • What about while the job is scheduled through PBS, but before the nodes get allocated?
The MATLAB license stays checked out for as long as MATLAB is running. But as stated, because you're using jobs/tasks, you can quit MATLAB once you've submitted your job.
  • What about after the nodes are allocated and the function Hello is running on them--the worker nodes "inherit" the client license when they're launched, so does that mean they start taking a main license seat at that point?
MATLAB_Distrib_Comp_Engine licenses will be checked out once your PBS job is running.
  • If no seats are available when nodes are allocated, will the workers fail?
Yes
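Since the client can quit after submit(job), one way to follow up later is to reconnect to the job from a new session. The sketch below assumes the 'CPUProfile' profile from the question; jobID is a hypothetical value that would have been noted from job.ID before quitting the first session.

```matlab
% Assumption: jobID is a hypothetical value recorded from job.ID in the
% session that called submit(job).
jobID = 1;

cluster = parcluster('CPUProfile');
job = findJob(cluster, 'ID', jobID);

% 'queued' suggests PBS has not started the workers yet; 'running'
% suggests the worker processes are up.
fprintf('Job %d is %s\n', job.ID, job.State);

% Block this client until the job finishes or fails.
wait(job);

% Text captured from the task, because CaptureDiary was set to true.
disp(job.Tasks(1).Diary);

% If the job failed (for example, the workers could not start), the
% scheduler output may show why.
if strcmp(job.State, 'failed')
    getDebugLog(cluster, job)
end
```

Note that this second session checks out a MATLAB license for as long as it stays open, so calling wait(job) trades a client seat for convenience; checking job.State and quitting again avoids that.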

Release

R2021a