Why do NaN values greatly increase the cost of sparse multiplication?
92 views (last 30 days)
Jan Neggers
on 22 Oct 2025 at 6:16
Edited: Dyuman Joshi
on 22 Oct 2025 at 14:14
Consider this code:
clear; close all;
n = 1e6;
m = 1e3;
s = 1e-3;
p = round(s^2 * n * m);
I = randperm(n, p);
J = randperm(m, p);
phi = sparse(I, J, 1, n, m);
Gx = randn(n, 1);
for k = 1:3
tic
L = Gx .* phi;
toc
end
Gx(1) = NaN;
for k = 1:3
tic
L = Gx .* phi;
toc
end
Now consider the output on my MacBook Pro M1 running MATLAB R2024a (so through Rosetta).
Elapsed time is 0.001591 seconds.
Elapsed time is 0.000291 seconds.
Elapsed time is 0.000282 seconds.
Elapsed time is 0.573762 seconds.
Elapsed time is 0.548177 seconds.
Elapsed time is 0.560880 seconds.
Notice the very large difference in computation times between the no-NaN version of Gx and the version that contains a single NaN value.
The solution is quite obvious: just remove the NaN values with your favorite method (isnan, isfinite, etc.). So this is not my question.
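For reference, a minimal sketch of such a cleanup (assuming zeroing out the non-finite entries is acceptable for the application):
% Zero out non-finite entries of Gx before the multiply,
% which keeps the sparse product on the fast path.
Gx(~isfinite(Gx)) = 0; % or Gx(isnan(Gx)) = 0 to target NaN only
L = Gx .* phi;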
What intrigues me is:
why does the computation time change so drastically?
2 Comments
Dyuman Joshi
on 22 Oct 2025 at 7:05
Edited: Dyuman Joshi
on 22 Oct 2025 at 7:20
n = 1e6;
m = 1e3;
s = 1e-3;
p = round(s^2 * n * m);
I = randperm(n, p);
J = randperm(m, p);
phi = sparse(I, J, 1, n, m);
Gx = randn(n, 1);
timeit(@() Gx.*phi)
Gx(1) = NaN;
timeit(@() Gx.*phi)
% The version glitch is still here
version
The difference is of the same order, 1e3.
Accepted Answer
Matt J
on 22 Oct 2025 at 11:51
Edited: Matt J
on 22 Oct 2025 at 12:10
I think you should report it as a bug. If you need a workaround, though, the following equivalent operation, which avoids implicit expansion, seems to work fine:
n = 1e6;
m = 1e3;
s = 1e-3;
p = round(s^2 * n * m);
I = randperm(n, p);
J = randperm(m, p);
phi = sparse(I, J, 1, n, m);
Gx = spdiags(randn(n, 1),0,n,n);
timeit(@() Gx*phi)
Gx(1) = NaN;
timeit(@() Gx*phi)
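As a quick sanity check (a sketch, not part of the timing above), the two formulations agree wherever the scaling vector is finite:
g = randn(n, 1); % finite test vector
isequal(g .* phi, spdiags(g, 0, n, n) * phi) % true for finite g
Note that the results presumably differ where Gx is NaN: sparse-sparse multiplication only visits stored entries, so the NaN never gets combined with the structural zeros of phi, whereas the elementwise product stores NaNs there.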
More Answers (1)
Steven Lord
on 22 Oct 2025 at 13:05
Let's look at a sample sparse matrix.
S = sprand(1e4, 1e4, 0.1);
status = @(M) fprintf("The matrix has %d non-zero elements, a density of %g.\n", nnz(M)./[1, numel(M)]);
status(S)
Multiplying S by four different values
Finite non-zero value
If we multiply S by a finite value, the result has the same number of non-zero elements as S.
tic
A1 = S.*2;
toc
status(A1)
2*0 is a 0, and that doesn't need to get explicitly stored in A1.
anynan(A1) % false
Non-finite values
What happens if we multiply S by a non-finite value? Let's first look at multiplying by NaN.
tic
A2 = S.*NaN;
toc
status(A2)
NaN*0 is NaN, and that does get explicitly stored in A2. In fact, all the elements in A2 are NaN.
anynan(A2)
nnz(isnan(A2))
What if we multiply by Inf instead?
tic
A3 = S.*Inf;
toc
status(A3)
Inf*0 is also NaN, and that gets explicitly stored in A3. Not all the elements in A3 are NaN, just the ones that were 0 in S, since Inf times a positive finite value is Inf.
anynan(A3)
NaNInA3 = nnz(isnan(A3))
InfInA3 = nnz(isinf(A3))
NaNInA3 + InfInA3 == numel(A3)
We can see that all the locations where A3 is Inf rather than NaN are the locations corresponding to non-zero values in S.
WereAllInfInA3WhereSWasNonzero = all(isinf(A3) == (S~=0), "all")
Zero
And if we multiply S by 0:
tic
A4 = S.*0;
toc
status(A4)
The result has no non-zero elements.
0 times anything finite (0 or non-zero) is 0, and those don't need to get explicitly stored in A4.
anynan(A4) % false
Computing A4 is quicker than even computing A1.
Memory and performance comparison
You can see this by looking at how much memory each matrix consumes. The A2 and A3 sparse matrices are not sparsely populated, but they are stored using sparse storage. One term for this is Big Sparse. They'll actually consume more memory than they would if you'd stored them as full!
F2 = full(A2);
F3 = full(A3);
whos S A1 A2 A3 A4 F2 F3
Roughly, A2 and A3 take 10 times as much memory as either A1 or S. They also take roughly 10 times as long to compute as the computation of A1. And A4 has no non-zero elements so all the memory it consumes is from the stored index information.
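As a rough rule of thumb (my estimate, assuming a 64-bit build), a real double sparse matrix stores 8 bytes per value, 8 bytes per row index, and 8 bytes per column pointer (one per column, plus one):
% Rough storage estimate for a real sparse matrix; compare with whos.
estBytes = @(M) 16*nzmax(M) + 8*(size(M, 2) + 1);
estBytes(S) % should be close to the whos figure for S
estBytes(A2) % "big sparse": every element is stored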
2 Comments
Matt J
on 22 Oct 2025 at 14:02
They also take roughly 10 times as long to compute as the computation of A1
Right, but the OP is seeing a slowdown by a factor of ~1e3, not 10. Also, my equivalent implementation does not incur any noticeable speed penalty. So you can see there must be something more going on than just the memory footprint of the result, right?
Dyuman Joshi
on 22 Oct 2025 at 14:11
Edited: Dyuman Joshi
on 22 Oct 2025 at 14:14
There is still the question of why there is a difference of the order of 1e3.
The matrices generated in the OP's code are much sparser than the ones you have used in the code above. Also, the memory taken by L after the operation including NaN is still comparable to the previous result; it changes by a factor of about 1.
n = 1e6;
m = 1e3;
s = 1e-3;
p = round(s^2 * n * m);
I = randperm(n, p);
J = randperm(m, p);
phi = sparse(I, J, 1, n, m);
Gx = randn(n, 1);
for k = 1:3
tic
L = Gx .* phi;
toc
end
whos Gx L
Gx(1) = NaN;
for k = 1:3
tic
L = Gx .* phi;
toc
end
whos Gx L
I wonder how CPU/multithreading comes into the picture with sparse operations.
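One way to probe that (a sketch using maxNumCompThreads, which returns the previous setting when you change it): pin the computational thread count to 1, retime, then restore.
% Rerun the NaN-path timing single-threaded for comparison.
oldThreads = maxNumCompThreads(1); % returns the previous setting
timeit(@() Gx .* phi)
maxNumCompThreads(oldThreads); % restore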