Why are there floating point differences between sum and cumsum functions?

Why is there a floating point difference between the return from the "sum" function and the last value of the "cumsum" function when using the same data? This behavior was not seen pre-MATLAB R2020b.
Example (where 'data' is an N-by-1 array of doubles):
>> x = sum(data);
>> y = cumsum(data);
>> x == y(end)
% Expected result before MATLAB R2020b: 1
% Expected result in MATLAB R2020b and later: 0

Accepted Answer

MathWorks Support Team on 27 Jun 2025
Edited: MathWorks Support Team on 27 Jun 2025
This is the result of an update made to the "sum" function in MATLAB R2020b, which was introduced to improve the performance of "sum". The following excerpt from the release notes describes the change that is likely causing the behavior you are seeing:
        "The new algorithm reduces the amount of round-off error in calculations, which leads to a more accurate result in general. Therefore, the output of sum might change slightly in R2020b compared to R2020a when operating on numeric inputs, even though the results between the two versions are numerically equivalent."
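The change described in the excerpt above is the reason for the small floating-point difference between the "sum" result and the last "cumsum" value in R2020b and later. Floating-point addition is not associative, so two algorithms that add the same values in a different way can produce results that differ in the last few bits. Neither result is wrong; both are accurate to within the precision limits of floating-point arithmetic.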
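As a small illustration of why the order of additions matters, the sketch below adds the same three doubles with two different groupings. It is only a demonstration of floating-point non-associativity with values chosen for the example; it does not reproduce the internal algorithm of "sum" or "cumsum".

```matlab
% Same three values, two different groupings of the additions.
a = (0.1 + 0.2) + 0.3;   % add left to right
b = 0.1 + (0.2 + 0.3);   % add the last two values first

a == b                   % logical 0 (false): the results differ

% The two results are neighboring doubles, so the gap is tiny.
fprintf("a - b  = %g\n", a - b);
fprintf("eps(b) = %g\n", eps(b));
```

The same effect, accumulated over many elements, is enough to make x == y(end) return 0 even though both values are correct to within round-off.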
Some potential workarounds are listed below:
 
1. Round Answer:
        The "round" function lets you round both values to a fixed number of decimal places (e.g. 10), which retains accuracy while cutting out the floating-point error. This can be done as follows:
>> y = round(y(end), 10)
>> x = round(x, 10)
        See the documentation for the "round" function for additional details.
 
2. Compare Against a Tolerance:
        A second option is to compare the values against a chosen tolerance (e.g. 1e-10) to account for rounding errors caused by the limitations of floating-point arithmetic. This workflow typically works better when comparing two numbers that are slightly different due to a floating-point rounding error, and can be done as seen below:
>> abs(y(end) - x) < 1e-10
        This will return “1” if the values are equal within the tolerance of 1e-10 and “0” otherwise. Note that the tolerance can be adjusted to whatever best helps represent the accuracy of your data.
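A fixed tolerance such as 1e-10 may be too tight or too loose depending on the magnitude of the data. One possible alternative, sketched below, scales the tolerance with the size of the result by using "eps". The factor 2*numel(data) is a heuristic assumed here for nonnegative data, not a value taken from the MATLAB documentation, and the variable names tol and isClose are hypothetical.

```matlab
data = rand(100, 1);          % example data; substitute your own vector
x = sum(data);
y = cumsum(data);

% Heuristic tolerance (an assumption): allow up to about two units of
% round-off per element, measured at the magnitude of the total.
tol = 2 * numel(data) * eps(x);

% True when the two totals agree to within the scaled tolerance.
isClose = abs(x - y(end)) <= tol
```

Because eps(x) grows and shrinks with x, this comparison behaves consistently whether the totals are near 1e-6 or near 1e6, whereas an absolute tolerance has to be re-tuned for each case.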
4 Comments
Steven Lord on 27 Jun 2025
The difference in the round-off behavior of sum was documented in the Release Notes. It was not documented in the Mathematics section of the Release Notes for release R2020b (perhaps it should have been) but it was documented in the Performance section (since the main impact was reduced time for sum; the figure given in the Release Notes was 1.4x faster for a vector with 1e9 elements.) See page 10-37 of the PDF Release Notes document.
I don't remember off the top of my head if we discussed adding this to the Version History on the sum documentation page. I suspect that if we did discuss it when we made this change, we chose not to because the differences are likely to be small in magnitude.
for k = 1:10
    data = rand(100,1);
    x = sum(data);
    y = cumsum(data);
    if x ~= y(end)
        fprintf("At iteration %d, x is %g and y(end) is %g.\n", k, x, y(end))
        fprintf("\tThe difference is %+g.\n", x - y(end))
        fprintf("\t eps(x) is %+g.\n", eps(x))
    end
end
At iteration 1, x is 53.7079 and y(end) is 53.7079.
The difference is +2.84217e-14.
eps(x) is +7.10543e-15.
At iteration 2, x is 52.3057 and y(end) is 52.3057.
The difference is +7.10543e-15.
eps(x) is +7.10543e-15.
At iteration 5, x is 46.4608 and y(end) is 46.4608.
The difference is +7.10543e-15.
eps(x) is +7.10543e-15.
At iteration 6, x is 49.4412 and y(end) is 49.4412.
The difference is -7.10543e-15.
eps(x) is +7.10543e-15.
At iteration 7, x is 47.4656 and y(end) is 47.4656.
The difference is +1.42109e-14.
eps(x) is +7.10543e-15.
At iteration 8, x is 52.3048 and y(end) is 52.3048.
The difference is +7.10543e-15.
eps(x) is +7.10543e-15.
At iteration 10, x is 51.6658 and y(end) is 51.6658.
The difference is +1.42109e-14.
eps(x) is +7.10543e-15.
Walter Roberson on 27 Jun 2025
This was originally posted in 2022, so possibly it was brought up then. The current Answer might just have an updated URL for the double() function, perhaps.
It is frustrating that the online release notes do not go back as far as R2020b. We have to look on page 633 (at present) of https://www.mathworks.com/help/pdf_doc/matlab/rn.pdf where this change is described under
sum Function: Improved performance summing the first dimension of numeric arrays
I can imagine that there might hypothetically have been changes to split the data up between several cores.
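To illustrate how splitting the data could change the round-off, here is a purely hypothetical sketch that compares a single left-to-right running total with a total built from four partial sums. It is not MATLAB's actual "sum" implementation; the chunk count and the variable names (seqTotal, chunkTotal, edges) are assumptions made for the example.

```matlab
% Hypothetical illustration only: this is NOT MATLAB's actual sum algorithm.
rng(0);                      % fixed seed so the run is repeatable
data = rand(1e5, 1);

% Strategy 1: one running total, left to right.
seqTotal = 0;
for k = 1:numel(data)
    seqTotal = seqTotal + data(k);
end

% Strategy 2: split into chunks, sum each chunk, then add the partial sums.
nChunks = 4;                 % arbitrary choice for the illustration
edges = round(linspace(0, numel(data), nChunks + 1));
chunkTotal = 0;
for c = 1:nChunks
    partial = 0;
    for k = edges(c) + 1 : edges(c + 1)
        partial = partial + data(k);
    end
    chunkTotal = chunkTotal + partial;
end

fprintf("Sequential total: %.17g\n", seqTotal);
fprintf("Chunked total:    %.17g\n", chunkTotal);
fprintf("Difference:       %+g (eps of the total is %g)\n", ...
    seqTotal - chunkTotal, eps(seqTotal));
```

Both strategies add exactly the same numbers, so any difference printed comes only from the order in which the intermediate results were rounded.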

Release

R2020b
