Parallel coding does not work in LINUX (Red Hat)

Tzu-Chang Cheng on 22 Sep 2012
Hi,
I have a simple parallel computing script for a program that I want to run for a very large number of loops. I wrote the code on my laptop with MATLAB R2011b. Once it worked there, I moved it to a small lab cluster with 24 cores, where MATLAB R2012a is installed. It does not work there: no matter how I change the loop number, the program finishes within 9 seconds and returns an empty cell.
I am very new to the Linux cluster, which is run by tech staff who are not familiar with MATLAB. Could you point out possible reasons why the program fails on Linux?
Any help is appreciated.
My code is below:
sim=findResource();
% scheduler object; I use the local configuration here but am not sure this is the correct setup
tic
% create a job object
job=createJob(sim, 'FileDependencies', {'/home/electricity/data_analysis/matlab_code'});
cd('/home/electricity/data_analysis/matlab_code/Result'); % change to the directory where the result is saved
loop_num=5000; % parameter passed to the function simulation_mac
rep=1;
% create one task per repetition
for i=1:rep
    jobtask(i)=createTask(job, @simulation_mac, 1, {loop_num}); % simulation_mac has 1 output
end
submit(job);
waitForState(job, 'finished');
% Retrieve the results. result is an M-by-N cell array in which
% row m contains the output elements for task m and column n
% corresponds to output argument n requested from the task evaluation.
result = getAllOutputArguments(job);
toc;
mse=cell2mat(result);
save parallel_result mse;
destroy(job);

Answers (1)

Jason Ross on 24 Sep 2012
Edited: Walter Roberson on 24 Sep 2012
It sounds like there are a few things you need to do.
  1. The client and cluster versions of MATLAB must be the same. So if you are using R2011b on your laptop and R2012a on the cluster, the two cannot talk to one another. Note that the code itself is fine to move from R2011b to R2012a; it's just that if you want your laptop to talk to the cluster, you need to upgrade.
  2. If you have installed MATLAB and MDCS on the cluster, you need to either integrate it with the cluster's existing scheduler or set up the Job Manager. For R2012a, that process is documented here: http://www.mathworks.com/support/product/DM/installation/ver_current/instructions/mdce_install.pdf, and installation support is available free of charge.
  3. Are you intending to use Simulink? The "sim" command simulates a Simulink model. Your use of the name "sim" could be colliding with that command on the cluster, where Simulink may be installed even though it is not installed on your laptop.
  4. When you use the "local" cluster, you are using your local machine's resources as a "cluster". If you look at what is running, you'll see a number of MATLAB processes start as your workers. If you want to connect to the cluster (assuming MDCS is set up), you will use the same findResource call, but it will look like this:
scheduler=findResource('scheduler','configuration','jobmanager')
Here "jobmanager" refers to a configuration you have defined in the Parallel menu to connect to the job manager configured on the Linux cluster (or to another scheduler that might already be there: LSF, Torque, PBS, etc.). The remaining syntax is the same whether you are running on "local" or on the cluster.
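To pull those points together, here is a rough sketch of how the script from the question might look when pointed at the job manager. It is not a tested fix. It assumes that a configuration named "jobmanager" has already been defined in the Parallel menu and that simulation_mac is reachable through the path given in FileDependencies. The variable name sched is just an illustrative replacement for "sim". The loop over the tasks is an extra diagnostic step: a task that errors on a worker is one common reason for getAllOutputArguments to come back empty, so printing each task's ErrorMessage may show what went wrong.

```matlab
% Sketch only: 'jobmanager' is an assumed configuration name; replace it
% with the configuration you define in the Parallel menu.

% Print the release; run this on the laptop and on the cluster to
% confirm that both sides use the same version.
fprintf('MATLAB release: %s\n', version('-release'));

% Use a name other than "sim" so it cannot be confused with the
% Simulink sim command.
sched = findResource('scheduler', 'configuration', 'jobmanager');

% Build and submit the job as in the question.
job = createJob(sched, 'FileDependencies', ...
    {'/home/electricity/data_analysis/matlab_code'});
loop_num = 5000;
createTask(job, @simulation_mac, 1, {loop_num});
submit(job);
waitForState(job, 'finished');

% Diagnostic step: report any error raised inside a task on a worker.
tasks = get(job, 'Tasks');
for k = 1:numel(tasks)
    msg = get(tasks(k), 'ErrorMessage');
    if ~isempty(msg)
        fprintf('Task %d failed: %s\n', k, msg);
    end
end

result = getAllOutputArguments(job);
destroy(job);
```

If a task reports an error such as an undefined function, that would suggest the workers cannot see simulation_mac, which points back to the file dependency or path setup rather than to Linux itself.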
