
Parpool threads killed when scaling up

17 views (last 30 days)
In my script, a big custom function is called inside a loop with different parameters on each iteration. To speed up the total calculation, I use parpool('threads') and parfeval to compute each result ahead of time and then simply retrieve it inside the loop. This works well until I increase the size of the matrix used in my custom function.
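The prefetch pattern described above might look like the following minimal sketch. Here myBigFunction and the loop bound n are hypothetical stand-ins for the question's actual function and parameters:

```matlab
pool = parpool('threads');       % thread-based pool (8 workers on an 8-core machine)

n = 5;                           % hypothetical number of loop iterations
futures(1:n) = parallel.FevalFuture;   % preallocate the Future array
for k = 1:n
    % Queue the heavy work up front; parfeval returns immediately with a Future.
    % myBigFunction is a placeholder for the custom function in the question.
    futures(k) = parfeval(pool, @myBigFunction, 1, k);
end

for k = 1:n
    result = fetchOutputs(futures(k));  % blocks only if the k-th result isn't ready yet
    % ... use result in the main loop ...
end
```

Note that every queued Future holds its inputs and, once finished, its outputs in memory, so queuing many large-matrix calls at once multiplies the peak memory footprint.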
I run MATLAB with "matlab -nodisplay". As I gradually increase the matrix size, beyond a certain point I just see:
Killed
and MATLAB exits. How do I find out what went wrong? Is the matrix too big, so that I run out of memory? I verified that the custom function itself runs fine with the larger matrix size.
My computer has 8 cores, and parpool('threads') gives me a pool of 8 workers. If it's a memory issue, can I reduce the number of workers? I don't think I need that many workers to get the speedup, but I can't find a way to reduce the worker count when using parpool('threads'). Or should I move to process-based parallel computing instead of thread-based?
Thanks.

Accepted Answer

Steven Lord
Steven Lord on 29 Oct 2021
What operating system? If you're running on Linux, check if MATLAB got killed by the out of memory killer.
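A few ways to check for OOM-killer activity on Linux, sketched below; the exact log locations and read permissions vary by distribution, and some of these need root:

```shell
# Kernel ring buffer (often readable without sudo; old entries may have rotated away):
dmesg | grep -i "out of memory"

# systemd journal, kernel messages only:
journalctl -k --no-pager | grep -i "killed process"

# Traditional syslog locations (distro-dependent; may require root):
grep -i "killed process" /var/log/messages* /var/log/kern.log 2>/dev/null
```

If MATLAB was the victim, the log line typically names the process and its memory usage at the time it was killed.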
  1 Comment
Chern Hui Lee
Chern Hui Lee on 29 Oct 2021
I'm running on Linux. I followed your link and ran "grep -i kill /var/log/messages*", but there is no file matching /var/log/messages*. I also read that it can be in /var/log/kern.log, but I'm unable to view that file because I only have SSH access via key authentication and don't have the sudo password. The machine has only 32 GB of memory, so I think a memory issue is likely.
After your reply, I moved the code to a machine in the cluster with more memory, and I can now run the same calculation on bigger matrices. I will accept this as the answer; thanks for the tips.


More Answers (1)

Walter Roberson
Walter Roberson on 29 Oct 2021
Unfortunately, neither parpool('threads') (introduced in R2020a) nor backgroundPool() (introduced in R2021b) lets you configure the number of threads to use, so running out of memory is a possibility.
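By contrast, a process-based pool does take an explicit worker count, which is one way to cap parallelism (the trade-off being that each worker is a separate MATLAB process with its own memory overhead). A minimal sketch:

```matlab
% Process-based pool with an explicit worker count:
pool = parpool('local', 2);   % 2 worker processes instead of one per core
```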
  4 Comments
Raymond Norris
Raymond Norris on 30 Oct 2021
Notice the following:
>> nt = maxNumCompThreads(2)
nt =
4
>> parpool("threads");
Starting parallel pool (parpool) ...
Connected to the parallel pool (number of workers: 2).
This shows I have 4 cores and that, by calling maxNumCompThreads with 2, parpool starts a thread-based pool of 2 workers. The caveat: until you set maxNumCompThreads back to 4, all other code will also only use 2 threads. Then again, you could argue that's ideal (if not actually setting it to 1), since the parallel pool may now be running on 2 cores. I wonder if a better internal algorithm would be something like
>> new = 3;
>> old = maxNumCompThreads(new);
>> pool = parpool("threads");
Starting parallel pool (parpool) ...
Connected to the parallel pool (number of workers: 3).
>> maxNumCompThreads(max(old-new,1))
This way MATLAB has at least 1 comp thread, but otherwise the comp threads and the parpool threads together equal the total number of cores seen.
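One way to make the restore step in the sketch above harder to forget is onCleanup, which resets maxNumCompThreads even if pool creation errors. This is my own variant, not from the thread, and it restores the original value rather than old-new:

```matlab
old = maxNumCompThreads(2);                      % returns the previous limit
restore = onCleanup(@() maxNumCompThreads(old)); % runs when 'restore' is cleared
pool = parpool("threads");                       % pool of 2 thread workers
% ... use the pool ...
clear restore                                    % comp thread limit back to 'old'
```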
Going back to the OP, a local process-based pool won't perform better than a thread-based pool. I emphasize local because you could certainly spread a process-based pool across multiple nodes with MATLAB Parallel Server, gaining access to more memory, etc., whereas a thread-based pool is bound to a single node.
Secondly, I don't see how reducing the number of threads/processes will help. You're just giving more work/memory to each process. As was seen, the best option is to get onto a machine with more memory.
Chern Hui Lee
Chern Hui Lee on 31 Oct 2021
Thanks, Raymond! This is a cool tip, and it worked for my situation. It worked because I only need 1 or 2 parallel workers to run some calculations ahead of time in the background while I proceed with other steps; that is enough to completely eliminate the time those calculations take in my script's main loop. The more workers I have, the more memory is consumed, which is what caused the process to be killed. Now that I've reduced the number of workers, I can increase the matrix size and still have just enough memory to complete the calculation ahead of time, giving me the speedup I need. Thank you very much!


Release

R2021a
