Script hangs for a long time after sparse matrix allocation

I wrote a MATLAB script that solves the Laplace equation with oblique boundary conditions using the Boundary Element Method (BEM). Everything worked surprisingly smoothly until I decided to modify the code to use sparse matrices.
I decided not to initialize the matrix via triplets but instead to use spalloc. As far as I know, triplets have to be stored in RAM alongside a created sparse matrix, which was impossible due to the size of the problem.
My code for assembling this matrix is as follows:
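M = spalloc(sz,sz,elements_sum);   % reserve space for the expected non-zeros
mod_parfor_progress(sz);           % initialize the progress tracker
parfor m=1:sz
    % values (row) and column indices (ind) of the non-zeros in row m
    [row,ind] = whole_Mrow(mesh_generated.Points,mesh_generated.node_panels,m,far_zone_dist,sz);
    M(m,:) = sparse(1,ind,row,1,sz);
    row = [];
    ind = [];
    mod_parfor_progress;           % record that one more row is done
end
mod_parfor_progress(0); %<- If the problem occurs, this line is never executed
disp('M matrix done.')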
The size of M is 283 560 x 283 560.
In the best case, it has only 11882253152 non-zero elements (I can count them exactly before assembling).
The server used for the computation has 512 GB of RAM installed. I was able to run this script when M was dense and had a size of 150 000 x 150 000.
mod_parfor_progress is a function to track parfor progress (a modified version of parfor_progress written by Jeremy Scheff).
The code was working well as long as M was dense.
However, if M is sparse, the script hangs. I can see from the content of the file parafor_progres.txt that the loop filled all rows of M, but then everything gets stuck for a long time (more than twice as long as the loop execution itself, which took 76 hours!).
Could you explain to me what MATLAB is doing? My assumption is that there is some sort of memory defragmentation, which is inefficient in the case of such a big matrix.
I ran a RAM usage test using a small logger written in bash, but the results puzzle me even more. The memory usage tends to explode (+200 GB and more) after the assembly loop ends.
Summing up:
1) The parafor_progress output suggests that the matrix M is assembled; however, nothing after the loop is executed.
2) The htop results confirm that all workers have been terminated. Only one MATLAB process is running, at 100% CPU usage (a single CPU core).
3) The memory usage tends to fluctuate a lot, but this happens after M was technically assembled. In some cases the whole script is terminated for this reason.
I hope that I have provided enough details about the problem, but any suggestions for further debugging are welcome.

Answers (1)

arushi on 16 Jan 2024
Hi Krzysztof,
I understand that your script appears to hang once the sparse matrix has been filled. The behavior you describe suggests that MATLAB is performing some form of memory management after the parallel loop completes. With sparse matrices this large, MATLAB may need to reorganize the data in memory, which can be very resource-intensive. Here is a breakdown of what might be happening:
  1. Memory Defragmentation: After the parfor loop fills the sparse matrix M, MATLAB may be attempting to consolidate the memory used by the matrix. This process can be very time-consuming, especially for a matrix of this size.
  2. Garbage Collection: MATLAB automatically releases memory that is no longer in use. After the parfor loop, it might be cleaning up temporary variables and other data that were created during the loop's execution.
  3. Sparse Matrix Reorganization: MATLAB might be reorganizing the sparse matrix data to optimize it for future operations. This involves merging the individual sparse row vectors created in each iteration of the parfor loop into the larger matrix M. Because MATLAB stores sparse matrices in compressed column format, inserting whole rows is comparatively expensive.
  4. Single Thread Execution: After the parfor loop, code execution reverts to a single thread, which can make subsequent resource-intensive operations appear to hang.
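For a sense of scale, the figures from the question can be turned into a rough storage estimate. This is only a sketch: it assumes 64-bit MATLAB and real double values, where each stored non-zero takes 8 bytes for the value and 8 bytes for its row index, plus 8 bytes per column pointer. The variable names are illustrative.

```matlab
n  = 283560;         % matrix dimension, from the question
nz = 11882253152;    % non-zero count, from the question

% Assumption: 64-bit MATLAB, real double entries.
bytesSparse   = nz*(8 + 8) + (n + 1)*8;   % values + row indices + column pointers
bytesTriplets = 3*nz*8;                   % i, j and v held as double vectors

fprintf('sparse matrix : %.1f GiB\n', bytesSparse   / 2^30);
fprintf('triplet arrays: %.1f GiB\n', bytesTriplets / 2^30);
```

Under these assumptions the assembled matrix alone needs on the order of 177 GiB, which is the same order of magnitude as the memory growth reported after the loop, and any step that temporarily holds a second copy would approach the 512 GB limit.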
Here are some suggestions to potentially alleviate the issue:
  • Increase Verbosity: Add more logging inside and outside the parfor loop to pinpoint exactly where the code is hanging.
  • Memory Monitoring: Use MATLAB's built-in functions, such as memory or profile, to get more detailed information about memory usage and execution during your script. Note that memory is available only on Windows, so on a Linux server you will need to keep relying on external tools such as htop.
  • Optimize Sparse Matrix Creation: Instead of creating individual sparse row vectors and then assigning them to M, consider accumulating the row, column, and value triplets in separate arrays within the parfor loop and then creating the sparse matrix in one go after the loop. This can sometimes be more efficient than modifying the sparse matrix in each iteration.
  • Reduce Overhead: Minimize the number of operations inside the parfor loop to reduce overhead. For example, consider removing the mod_parfor_progress calls if they are not essential.
  • Batch Processing: If possible, split the matrix assembly into smaller batches that can be processed independently and then combined. This can help manage memory usage and make the process more manageable.
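The triplet suggestion above could look roughly like the following. It is a minimal sketch on a tiny matrix: the lines that compute ind and row are a made-up stand-in for whole_Mrow, and the cell arrays I, J and V are names introduced here for illustration. Without Parallel Computing Toolbox the parfor loop simply runs serially.

```matlab
sz = 200;                         % small illustrative size, not the real problem
I = cell(sz, 1);                  % row indices, one cell per matrix row
J = cell(sz, 1);                  % column indices
V = cell(sz, 1);                  % values

parfor m = 1:sz
    % Hypothetical stand-in for whole_Mrow: three non-zeros per row.
    ind = unique([m, mod(m, sz) + 1, mod(m + 6, sz) + 1]);
    row = m ./ (1:numel(ind));

    I{m} = repmat(m, 1, numel(ind));   % every entry belongs to row m
    J{m} = ind;
    V{m} = row;
end

% A single sparse call builds the matrix after the loop.
M = sparse([I{:}], [J{:}], [V{:}], sz, sz);
```

Keep in mind that this approach holds the triplets and the finished matrix in memory at the same time, which is exactly the constraint you mentioned, so at full size it may only be practical in combination with batching.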
Documentation of the memory function: https://www.mathworks.com/help/matlab/ref/memory.html
Documentation of the profile function: https://www.mathworks.com/help/matlab/ref/profile.html
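One possible way to batch the assembly is sketched below. It is an assumption-laden illustration, not a drop-in replacement: blockSize is a hypothetical tuning parameter, the lines computing ind and row again stand in for whole_Mrow, and each row of M is stored as a column of the transpose Mt, so that the blocks can be joined by horizontal concatenation, which suits MATLAB's column-oriented sparse storage.

```matlab
sz        = 200;                  % small illustrative size
blockSize = 50;                   % hypothetical batch size; tune to available memory
nBlocks   = ceil(sz / blockSize);
blocks    = cell(1, nBlocks);

for b = 1:nBlocks
    rows = (b-1)*blockSize + 1 : min(b*blockSize, sz);   % rows of M in this batch
    nr   = numel(rows);
    J = cell(1, nr);  C = cell(1, nr);  V = cell(1, nr);

    parfor k = 1:nr
        m = rows(k);
        % Hypothetical stand-in for whole_Mrow: two non-zeros per row.
        ind = unique([m, mod(m, sz) + 1]);
        row = m * ones(1, numel(ind));

        J{k} = ind;                        % column index in M = row index in Mt
        C{k} = repmat(k, 1, numel(ind));   % local column inside this block
        V{k} = row;
    end

    % Row rows(k) of M becomes column k of this block of the transpose.
    blocks{b} = sparse([J{:}], [C{:}], [V{:}], sz, nr);
end

Mt = [blocks{:}];                 % Mt equals M.' for this toy example
```

Only one batch of triplets exists at any time. Whether to transpose Mt at the end or to work with the transpose directly depends on the memory left over and on what the solver needs.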
Hope this helps.

Release

R2021a
