Using the Parallel Computing Toolbox for simultaneous evaluation of multiple datasets
Michael Werthmann on 9 Jun 2021
I am very new to the Parallel Computing Toolbox and don't know how to run my code with it efficiently. I'll explain my situation.
I want to evaluate different results of image registration on different datasets, which are all stored in a common folder.
Right now I simply loop through the folder, and apply my function "Evaluation(x,y)" to every file in the folder serially, and save the result into a structure. In the end I add all these structures to one big structure since they have the same fields.
Inside my Evaluation function I call different functions that calculate different things, some of which are independent of one another.
See the pseudo code.
for i = 1:numel(listing)
    % loop through the folder and build the paths for the evaluation function
    path_fixed = fullfile(listing(i).folder, listing(i).name);
    res = Evaluation(path_fixed, path_moving, path_registered, path_displacement, LMs);
    res.name = listing(i).name;
    struc_name = listing(i).name;
    results.(struc_name) = res;
end
results_overview = [results.strucnameA, results.strucnameB, ...]
function res = Evaluation(a, b, c, d, LMs)
    res.Parameter_a = evaluate_metric1(a, b);
    res.Parameter_b = evaluate_metric2(b, c, d);
end
So as far as I have read about parallel computing, I have two options:
I could use a parfor loop for looping through the folder, meaning that 2-4 datasets are evaluated at the same time. However, I don't know if that works, since multiple workers would then write to the same variable 'res', right?
Alternatively, I could use spmd blocks inside the Evaluation function itself to calculate multiple parameters at the same time.
Which would be the correct way?
Does doubling the number of workers also mean double the RAM needed?
Thanks and regards
Thomas Falch on 2 Jul 2021
Using a parfor loop would be a good solution to this problem. Local variables like "res" and "path_fixed" inside the parfor loop are not a problem: each worker gets its own independent copy. The struct "results", on the other hand, is a problem. It would in effect be shared between the workers, and parfor loops cannot write their results into a shared struct via dynamic field names. They can, however, write into arrays indexed by the loop variable, and in most cases it is easy to rewrite the code to do so. In your case it would be something like:
parfor i = 1:100
    res = i; % Actual computation here
    name = sprintf("name_%d", i); % Actual name here
    s = struct("Name", name, "Value", res);
    a(i) = s;
end
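If you want to get back to the fieldname-indexed struct you use in your serial code, you can fold the struct array "a" into one struct after the parfor loop finishes. A minimal sketch, assuming the "Name"/"Value" fields from the example above (note that file names are often not valid struct field names, so they are sanitized first):

```matlab
% After the parfor loop: rebuild the struct indexed by name,
% as in the original serial code.
results = struct();
for i = 1:numel(a)
    % File names may contain dots, spaces, etc., which are not
    % allowed in field names, so sanitize them first.
    fname = matlab.lang.makeValidName(a(i).Name);
    results.(fname) = a(i).Value;
end
```

This keeps the parallel loop itself restricted to the array form that parfor accepts, while the cheap serial post-processing restores the structure the rest of your code expects.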
As for the memory, each worker is really just a regular MATLAB process running in the background, so doubling the number of workers will approximately double the total memory usage.