A parfor loop and random number generator

Alexey Pospelov on 22 Oct 2024
Edited: Matt J on 22 Oct 2024
Hello!
I have the following problem. I have two nested for loops that go through 2D matrices containing data structures. In principle, the "inner" for loop can be replaced with parfor for efficiency, because the elements of the matrix are processed independently of each other within the "inner" loop. However, the processing routine is stochastic: it draws from random distributions, randomly permutes vectors, and so on. All of this happens inside external, self-written functions that are called from within the "inner" loop. I want the whole thing to be reproducible, so I want to control the rng seeds. To do this, I pre-generate the seeds before entering the loop. The code is quite bulky and includes a small tree of downstream functions, so I'll give a pseudocode extract of it to illustrate the problem.
O_N = 100;  % Number of iterations of the "outer" loop
I_N = 10;   % Number of iterations of the "inner" loop
rng(999);   % A "master" seed that makes sure the table of seeds is always the same
Seeds = randi(1000000, O_N, I_N);  % Seeds for the RNG for the mutation
Results(1, 1:I_N) = XYZ;           % An "initial dataset", sort of irrelevant
for O = 1:O_N
    local_seeds = Seeds(O, :);      % A "slice" of the seeds table, for parfor efficiency
    Results_slice = Results(O, :);  % Even more important for efficiency, as the table grows
    for I = 1:I_N        % two alternative loops
    % parfor I = 1:I_N   % two alternative loops
        Running_results(I) = external_function(Results_slice(I), local_seeds(I));
    end
    Results(O+1, 1:I_N) = Running_results;
end
So, basically, in every iteration of the "inner" loop, the result of the previous "outer" iteration is used as an input to external_function(), together with the pregenerated seed. The rng(local_seed) call is pretty much the first line within external_function().
After some rearranging, it all boiled down to code in which I can comment/uncomment a single line: the "inner" loop is launched with either parfor I=1:I_N or for I=1:I_N.
And now the problem: when I run it with parfor I=1:I_N, I get a result that is perfectly reproducible. All the seed values, the output of external_function and, therefore, the final Results table are identical across repeated runs of the script with the parfor loop. The same is true for the version with the for loop. But the results with the for loop and with the parfor loop differ from each other, even though the same functions receive the same seeds as input.
What is worse: to my understanding, if I have the Results table and all the seeds, I should in theory be able to rerun external_function for any place in the middle of the tables just by providing the indices of the input. But when I do that, the result is not the same as the one obtained during the parfor loop. To me it looks like the rngs of the independent processes within the parfor interact somehow, although from the manuals I had the impression that this should not happen.
Could you please help me figure out what is going on and what I am doing wrong? Again, my idea (and I have high hopes for it) is that if I set the rngs correctly, then it should not matter where I call external_function from: within the for loop, within the parfor loop, or as a standalone command. But so far this does not seem to be the case. Can I control it somehow?
Thank you
  2 Comments
Alexey Pospelov on 22 Oct 2024
Edited: Alexey Pospelov on 22 Oct 2024
Okay, I also have a much better and simpler example that you can actually run:
function test_for_parfor
    rng(999)
    seeds=randi(1000000,10,1);
    %for I=1:10
    parfor I=1:10
        output(I,:)=external_generator(seeds(I));
    end
    output
end
function out=external_generator(seed)
    rng(seed)
    out=randi(1000,10,1);
end
This pair of functions makes a table of random integers. It is reproducible as long as you keep using the same kind of loop. But if you comment out parfor and uncomment for, the outputs are different, even though the indexing and the rng setting seem to be unambiguous. Why is that, and how do I fix it?


Answers (2)

Steven Lord on 22 Oct 2024
See this documentation page for an explanation and a suggested approach to do what you want.
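A quick way to check whether the mismatch comes from the generator type is to ask rng for its current settings on the client and on the workers. The sketch below is only a diagnostic, not part of the original answer. It assumes that rng(seed) reseeds whatever generator is currently active without changing its type, and that pool workers may start with a different default generator than the client. The names client_settings and worker_types are made up for this illustration.

```matlab
% Diagnostic sketch: compare the generator type on the client and on the workers.
client_settings = rng;   % struct with fields Type, Seed and State
fprintf('Client generator type: %s\n', client_settings.Type);

worker_types = cell(1, 4);   % one entry per loop iteration
parfor I = 1:4
    w = rng;                     % settings of the stream used by this iteration
    worker_types{I} = w.Type;    % sliced output variable
end
disp(unique(worker_types));      % generator type(s) seen inside parfor
```

If the two types differ, then the same seed passed to rng(seed) starts two different generators, which would explain why for and parfor each reproduce their own results but disagree with each other.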

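Matt J on 22 Oct 2024
Edited: Matt J on 22 Oct 2024
Use RandStream to give every worker the same type of random number stream as the client (here the Mersenne Twister, 'mt19937ar'). In the runs below, M=0 executes the parfor body on the client without workers, and M=inf lets the loop use the available workers; both give the same output.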
test_for_parfor(0)
output =
60 291 379 808 778 839 424 68 206 18
934 819 847 33 255 585 395 626 169 471
462 760 599 64 375 758 631 171 549 81
761 30 696 17 859 394 451 655 524 356
675 483 556 353 84 670 106 325 331 998
726 88 670 45 487 242 606 835 955 490
944 664 125 701 267 761 870 397 488 603
726 250 34 948 508 436 489 291 333 446
105 523 71 779 886 728 233 674 577 911
616 842 813 721 18 551 471 782 598 296
test_for_parfor(inf)
output =
60 291 379 808 778 839 424 68 206 18
934 819 847 33 255 585 395 626 169 471
462 760 599 64 375 758 631 171 549 81
761 30 696 17 859 394 451 655 524 356
675 483 556 353 84 670 106 325 331 998
726 88 670 45 487 242 606 835 955 490
944 664 125 701 267 761 870 397 488 603
726 250 34 948 508 436 489 291 333 446
105 523 71 779 886 728 233 674 577 911
616 842 813 721 18 551 471 782 598 296
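function test_for_parfor(M)
    rng(999)
    seeds=randi(1000000,10,1);
    parfor (I=1:10,M)   % M is the maximum number of workers; 0 runs on the client
        % Make the global stream a Mersenne Twister wherever this iteration runs
        RandStream.setGlobalStream( RandStream('mt19937ar') );
        output(I,:)=external_generator(seeds(I));
    end
    output
end
function out=external_generator(seed)
    rng(seed);              % reseeds the current (now Mersenne Twister) stream
    out=randi(1000,1,10);
end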
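If the goal is for the generating function to give the same numbers no matter where it is called from (for, parfor, or the command line), one option is to avoid the global stream altogether and draw from a private RandStream object built from the seed. The following is a minimal sketch under that assumption, not the code from the answer above. The names out_for, out_par and s are made up for this illustration, and the two lines inside each loop body stand in for what would go inside external_generator.

```matlab
% Sketch: per-iteration private streams, independent of the global generator.
rng(999, 'twister');                 % master seed, generator type stated explicitly
seeds = randi(1000000, 10, 1);

out_for = zeros(10, 10);
for I = 1:10
    s = RandStream('mt19937ar', 'Seed', seeds(I));   % private Mersenne Twister stream
    out_for(I, :) = randi(s, 1000, 1, 10);           % draw from s, not the global stream
end

out_par = zeros(10, 10);
parfor I = 1:10
    s = RandStream('mt19937ar', 'Seed', seeds(I));   % same construction on a worker
    out_par(I, :) = randi(s, 1000, 1, 10);
end

disp(isequal(out_for, out_par));     % compares the serial and parallel tables
```

A lighter alternative that keeps the original structure is to name the generator when seeding inside the function, that is rng(seed,'twister') instead of rng(seed), so the call no longer depends on which generator happens to be active. The private-stream version has the additional property that it leaves the global stream of the caller untouched.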

Release

R2020a
