Matrix inversion doesn't fully utilize CPU
Hi all,
I have a 16-core CPU that I've been trying to run some FDM CFD code on. The algorithm involves three matrix inversions, which take most of the time, and I have already decomposed the matrices, since they only need to be built once. They are also sparse. I assumed the code would parallelize the inversion efficiently, but when I run it, the CPU is never fully utilized, and what's surprising is that if I increase the number of grid points (making the matrix larger), CPU utilization actually goes down. There seems to be a specific matrix size that is the most efficient to invert.
For example, here are some data points.
20000x20000 matrix - 30% CPU util
10000x10000 matrix - 37% CPU util
5000x5000 matrix - 45% CPU util
2211x2211 matrix - 67% CPU util
1250x1250 matrix - 17% CPU util
800x800 matrix - 22% CPU util
Is this normal, or is there something I should be doing differently? I want to note that although utilization is lower with smaller matrices, they are still faster overall than the larger ones. However, I would like my computer to use its full potential and run at full utilization for all matrix sizes. Thanks!
Best,
Brandon
Answers (1)
Walter Roberson
on 17 Oct 2022
They are also sparse matrices. I assume the code should parallelize the inversion efficiently
No, in general the inverse of a sparse matrix is a dense matrix, so processing a sparse matrix adds overhead and prevents passing regular subsections of the matrix to different cores for processing.
In most cases you should be using the \ operator instead of multiplying by a matrix inverse. You can create a decomposition object once and use it with the \ operator, without ever forming the inverse.
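The thread is about MATLAB's decomposition object and \, but the same idea (factor the sparse matrix once, reuse the factorization for each solve, never form inv(A)) can be sketched with SciPy's sparse LU. The matrix size and density below are arbitrary test values, not taken from the question.

```python
import numpy as np
from scipy.sparse import random as sprandom, eye
from scipy.sparse.linalg import splu

# Build a well-conditioned sparse test system (sizes are arbitrary).
n = 2000
A = (sprandom(n, n, density=1e-3, format="csc", random_state=0)
     + 10.0 * eye(n, format="csc"))
b = np.ones(n)

# Factor once (rough analogue of MATLAB's dA = decomposition(A)),
# then reuse the factorization for every right-hand side.
lu = splu(A)      # sparse LU; the inverse of A is never formed
x = lu.solve(b)   # rough analogue of dA \ b

print(np.allclose(A @ x, b))
```

The point is the same as in MATLAB: the expensive step (factorization) happens once, and each subsequent solve is a cheap triangular back-substitution, while inv(A) of a sparse matrix would generally be dense.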
4 Comments
Brandon Li
on 17 Oct 2022
Walter Roberson
on 17 Oct 2022
Are your sparse matrices band-limited, such as being tri-diagonal? There are possibly special routines for that. But in general, sparse matrices are only parallelizable for element-wise operations such as .* or .^ or + or exp(); in some cases they can be faster for *. But for more complicated operations such as \, using sparse cannot readily be done in parallel, since you cannot do efficient block operations when you have to decide whether each element is included in the block or not.
There might possibly be some ways of reducing overheads on recursive block operations... but in a lot of cases, if you can afford the memory, you can get significant speed-ups by working with full() of the matrices.
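If the matrices really are banded (FDM discretizations often produce tridiagonal or narrow-banded systems), a dedicated banded solver runs in O(n) rather than cubic time. A minimal sketch of this idea, using SciPy's solve_banded on a hypothetical tridiagonal system (the -1/2/-1 stencil here is just an illustrative choice):

```python
import numpy as np
from scipy.linalg import solve_banded

# Hypothetical tridiagonal system: 2 on the diagonal, -1 on the off-diagonals.
n = 1000
ab = np.zeros((3, n))   # banded storage: one row per diagonal
ab[0, 1:] = -1.0        # superdiagonal
ab[1, :] = 2.0          # main diagonal
ab[2, :-1] = -1.0       # subdiagonal
b = np.ones(n)

# (1, 1) = one subdiagonal, one superdiagonal; cost is O(n)
x = solve_banded((1, 1), ab, b)

# Verify against the equivalent dense system
A = (np.diag(np.full(n, 2.0))
     + np.diag(np.full(n - 1, -1.0), 1)
     + np.diag(np.full(n - 1, -1.0), -1))
print(np.allclose(A @ x, b))
```

For a matrix like this, the banded solve touches only the 3n stored entries, which is why exploiting the band structure can matter far more than thread count.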
Walter Roberson
on 18 Oct 2022
Edited: Walter Roberson
on 18 Oct 2022
It is sort of like the situation with

f = @(x) some long expression involving a division by x
for K = 1 : N
    if X(K) == 0
        Y(K) = 0;
    else
        Y(K) = f(X(K));
    end
end

compared to

f = @(x) some long expression involving a division by x
Y = f(X);
Y(X == 0) = 0;
in the sense that if you have to test each location to see whether it is eligible for a mathematically correct result, that can take a lot more time than simply going ahead and fixing up the results afterwards. Likewise, the process of deciding whether each element in a sparse matrix is present can end up being more costly than having used a full matrix in the first place (if you can afford the memory).
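The same compute-everywhere-then-patch pattern can be shown in vectorized NumPy form; the 1.0 / x below is just a stand-in for "some long expression involving a division by x":

```python
import numpy as np

x = np.array([2.0, 0.0, -4.0, 1.0, 0.0])

# Compute the expression for every element, then patch the ineligible
# entries, instead of branching on each element before computing.
with np.errstate(divide="ignore"):  # suppress the expected divide-by-zero warning
    y = 1.0 / x                     # zeros produce inf here...
y[x == 0] = 0.0                     # ...and are fixed up afterwards

print(y.tolist())
```

No per-element decision is made during the arithmetic, which is exactly what keeps the vectorized path fast.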
Brandon Li
on 18 Oct 2022