precision of double variables

9 views (last 30 days)
Arda Yigit on 17 Jun 2022
Commented: Walter Roberson on 24 Jun 2022
The code below outputs 1 and 1.1 (and not 0.9):
for ii = -1.1:0.1:1.1
    if ii >= 0.9
        disp(ii)
    end
end
If the first line is replaced with the following:
for ii = -1.2:0.1:1.2
the output becomes 0.9, 1, 1.1 and 1.2.
The difference between the actual value of the variable ii and the expected value is epsilon = 2.2204e-16.
Does anyone have any idea why replacing 1.1 with 1.2 changes the behavior?

Accepted Answer

Walter Roberson on 18 Jun 2022
Suppose you were using decimal to 4 decimal places.
1/3 == 0.3333
1/3 + 1/3 = 0.3333 + 0.3333 = 0.6666
2/3 = 0.6667 because that is the closest representable number to 4 decimal places
Notice that taking the closest representation to 1/3 and adding it to itself does not get you the closest representation to 2/3
You could clearly expand this to any finite number of decimal places. 1/3 is 3 repeated any finite number of times, add it to itself to get 6 repeated that number of times, but 2/3 is 6 repeating ending in 7.
This is thus an inherent problem in using any fixed numeric base to a finite number of places: there will exist some number that rounds up when calculated as a fraction but rounds down through repeated addition.
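The same analogy can be sketched in MATLAB itself. This is only a toy illustration of the 4-decimal-place idea, using round to mimic a decimal system (doubles themselves are binary, of course):
third = round(1/3, 4)   % 0.3333, the closest 4-decimal value to 1/3
third + third           % 0.6666
round(2/3, 4)           % 0.6667, the closest 4-decimal value to 2/3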
It happens that MATLAB uses the industry-standard IEEE 754 binary floating point, which is what is designed into your computer CPU (if you have a GPU, then the GPU might be nearly the same, but differ in handling very small magnitude numbers near 1e-308).
IEEE 754 binary floating point (and every other binary floating point) has the property that 1/10 cannot be exactly represented. This is for the same mathematical reasons that decimal cannot exactly represent 1/3 or 1/7 or 1/11: every system that represents numbers with finite precision in a fixed numeric base has the same deficiency for some numbers.
So 0.1+0.1 in binary floating point is not guaranteed to be exactly the same as starting with 0.2 directly.
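For example, here is a minimal check you can run; the exact printed digits assume IEEE 754 doubles:
s = 0.1 + 0.1 + 0.1;      % repeated addition of the closest double to 0.1
fprintf('%.20f\n', s)     % 0.30000000000000004441
fprintf('%.20f\n', 0.3)   % 0.29999999999999998890
s == 0.3                  % logical 0 (false)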
The next thing you need to know is that "for" loops operate by cumulative addition. First value plus increment. Add the increment again. Add the increment again. And so on.
"for" loops do not work by taking the starting point and multiply the increment by the number of iterations and add to the base: that has slightly different effects on accumulated loss of precision.

More Answers (2)

David Goodmanson on 17 Jun 2022
Edited: David Goodmanson on 17 Jun 2022
Hi Arda,
you pretty much answered the question by mentioning precision. Most floating point numbers are not represented exactly in memory. That includes most rational numbers such as .9. So you can't always expect the floating point values in the for loop to exactly agree with a value established in some other way. In the following example, you would at least expect the two would-be values of .9 to agree with each other:
% save the values of ii using concatenation (not recommended but the list is small here)
iivals = [];
for ii = -1.1:0.1:1.1
    iivals = [iivals ii];
end
iivals(21) - .9   % 21st value should be .9
ans = -1.1102e-16

a = (-1.1:0.1:1.1);
a(21) - .9
ans = 1.1102e-16
but they don't, and neither equals Matlab's best approximation to .9. The for loop value of ii is too small, so .9 doesn't get displayed. (Doing the 1.2 case will show that the value of ii is just over the line so .9 does get displayed).
To see how these values are stored in memory (Matlab uses the IEEE754 standard) you can use format hex:
format hex
iivals(21)
a(21)
.9
ans = 3feccccccccccccc % too small
ans = 3fecccccccccccce % too large
ans = 3feccccccccccccd
The fix is pretty simple: don't use floating point values in an indexing situation. For example, something like
for ii = -11:11
    x = ii/10   % use this value to calculate stuff
    if ii >= 9  % use ii for equality-type checks
    end
end
avoids a lot of problems.
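Applied to the original question, that integer-counter pattern might look like this (a small sketch; the comparison now happens on exact integers):
for ii = -11:11
    x = ii/10;   % floating point value, used only for the calculation
    if ii >= 9   % exact integer comparison, no rounding surprises
        disp(x)  % prints 0.9000, 1, 1.1000
    end
end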
20 Comments
Stephen23 on 23 Jun 2022
Edited: Stephen23 on 23 Jun 2022
Hello Paul,
Good points, thank you for your comment.
I think what you term the "mental model" is highly relevant to this topic. Really, all code (and by extension, every explanation or mental model of code) is just an abstraction of what actually happens deep inside the computer. Thus we must consider this paradigm:
Levels of abstraction suit different purposes (and also incite endless flame wars about whose language rulz)... but nonetheless, every model must necessarily simplify, which means every model is wrong. But some models are useful, because they make it easier for us to understand or do something (example: any map).
MATLAB suits a particular type of work, it is not a low-level language. Not only that, TMW have made it quite clear that they will not give details of the low-level implementation or optimization, with (IMHO) good reason (for the kind of language it is):
All kinds of things can and regularly do change in the optimization. As you wrote, "...I don't care if it does, as long as the end result is the same..."
So... what is the meaning of "same" ?
Over the years I have seen minor tweaks to various functions which changed the output bits just slightly, i.e. the usual binary floating point noise. And then some users would complain, because algorithm B delivered slightly different bits than algorithm A, which they previously used and loved. Did TMW do something wrong? Are the outputs the "same"? Inasmuch as both algorithms are numerically equally good (or equally bad), returning the same values but slightly different, numerically meaningless floating point noise... are the values the "same"? From a scientific point of view they are indistinguishable (error propagation is not just related to numeric computing, but is a well-defined topic that is relevant to all measurements and calculations). Or, in George Box's terms, the difference is not "importantly wrong".
That is my mental model of MATLAB: as a user, my code should not rely on a specific algorithm delivering specific bits (because these are numerically meaningless noise, quadrillions of times smaller than the precision of my data), particularly when TMW have already explicitly stated that they will not document the implementation. I find that model useful.
"Are you aware of any cases besides for where [colon] (or (colon) for that matter) in an expression is not functionally equivalent to how [colon] is evaluated on the RHS of an assignment statement?"
They are already functionally equivalent.
I guess you are actually asking about the implementation: I can't think of any other example... but then, according to my mental model, there is also no reason to assume that RHS assignment necessarily requires an explicit array in memory either: perhaps this year's release does, but that is no reason to assume that next year's release will not do something else. As long as they are functionally the same, I don't care.
Walter Roberson on 24 Jun 2022
The third form of for is defined as doing indexing. It would be against the documentation to use any kind of incremental computation for the third form.



Jan on 17 Jun 2022
Edited: Jan on 17 Jun 2022
Welcome to the world of numerics with limited precision.
These are the expected effects: you are observing the deviation (on the order of eps = 2.2204e-16) discussed in the frequently asked questions about floating-point accuracy.
If you start at -1.1, you see the deviation at 1.0:
for ii = -1.1:0.1:1.1
    if ii >= 1.0
        disp(ii)
    end
end
1
1.1000
If you start at -1.2, you see it at 0.9, so the effect is simply shifted by 0.1.
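A small sketch of that shift, rebuilding both loops by plain repeated addition as described in the accepted answer (illustrative only; the exact bits assume IEEE 754 doubles):
x = -1.1;  for k = 1:20, x = x + 0.1; end   % 21st loop value, nominally 0.9
y = -1.2;  for k = 1:21, y = y + 0.1; end   % 22nd loop value, nominally 0.9
fprintf('%.20f\n', x)   % just below 0.9, so the "ii >= 0.9" test fails
fprintf('%.20f\n', y)   % not below 0.9, so the test succeeds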

Release

R2021a
