matlab strange result for mean of single vs double

Herve Hugonnet on 2 Nov 2020

Direct link to this question

https://it.mathworks.com/matlabcentral/answers/633539-matlab-strange-result-for-mean-of-single-vs-double

Edited: Matt J on 16 Jan 2024

Hello,

I get vastly different results when averaging an array in single vs double precision. Here is code to reproduce the issue:

a=single([99 100 65 101]'+zeros(4,512,512,128)); 
display(' single mean : '); 
display([num2str(mean(a,[2 3 4]))]);
display( ' double mean : ');
display(num2str(mean(double(a),[2 3 4])));

The output on MATLAB R2020a and R2020b is:

 single mean : 
64
64
64
64
 double mean : 
 99
100
 65
101

I would expect a small difference between single and double due to precision error, but not one that big.

I could also expect the value to go out of range when using the sum operator, but this should not occur with a well-implemented mean function.

This also only occurs if the first dimension is not one; otherwise I have no problem.

For example, if using

display([num2str(mean(a(1,:,:,:),[2 3 4]))]);

I get the output:

which is the expected output

PS: I have already read this post: https://www.mathworks.com/matlabcentral/answers/5401-matlab-function-mean-returns-the-exact-same-value-for-uint16-and-double-values-not-for-single but the problem seems different. In that post only a small difference is seen, which is explained by precision error, while in my case the results are not even similar.

Thank you

Herve

26 Comments

Paul on 12 Jan 2024

a=single([99 100 65 101]'+zeros(4,512,512,128));

mean is implemented in an m-file

which mean(a,[2 3 4])
/MATLAB/toolbox/matlab/datafun/mean.m

For this use case, I believe the observation arises from the call to sum on line 125 (which is line 127 in 2021b for whatever that's worth, though 2021b exhibits the same behavior) with flag being 'default'

dbtype /MATLAB/toolbox/matlab/datafun/mean.m 118:130
     if omitnan     
         % Compute sum and number of NaNs
         m = sum(x, dim, flag, 'omitnan');
         nr_nonnan = mysize(x, dim) - matlab.internal.math.countnan(x, dim);
         % Divide by the number of non-NaNs.
         y = m ./ nr_nonnan;
     else
         y = sum(x, dim, flag) ./ mysize(x,dim);
     end
 end
     
 end
 

All of the elements of a are positive

all(a(:)>0)
ans = logical
   1

and realmax for a single is quite large

format long e
realmax('single')
ans = single
   3.4028235e+38

Calling sum over the specified dimensions yields different results

[sum(a,[2 3 4],'default') sum(double(a),[2 3 4],'default')]
ans = 4×2
1.0e+00 *

   2.1474836e+09   3.3218888e+09
   2.1474836e+09   3.3554432e+09
   2.1474836e+09   2.1810381e+09
   2.1474836e+09   3.3889976e+09

but summing over all dimensions yields the same result as @Bruno Luong noted three years ago

[sum(a,[1 2 3 4],'default') sum(double(a),[1 2 3 4],'default')]
ans = 1×2
1.0e+00 *

   1.2247368e+10   1.2247368e+10
isequal(ans(1),ans(2))
ans = logical
   1

What could be "the operation being done" for which single does not have enough range or precision to sum the elements of a subset of 'a', all of which are positive, and where the sum of the whole 'a' matrix is less than realmax and gives the correct result for single or double?
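As a hedged illustration of the range-versus-precision distinction raised here: the short check below uses only `flintmax`, `realmax` and plain single arithmetic. The value `2^31` is taken from the stalled sums shown above (about 2.1474836e+09); that the internal accumulator stops exactly there is an inference from those outputs, not something documented.

```matlab
% Range is not the limiting factor; the spacing between representable singles is.
fmax    = flintmax('single');        % 2^24: above this, not every integer is representable
x       = single(2^31);              % value the single-precision row sums appear to stall at
stalled = (x + single(99)) == x;     % adding 99 leaves x unchanged
inRange = x < realmax('single');     % still far below the overflow threshold

fprintf('flintmax = %d, stalled = %d, inRange = %d\n', fmax, stalled, inRange);
```

So a running sum can stop growing long before realmax is reached, which is consistent with the row sums above.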

"Should power() be made to do overflow pre-checks and pre-convert its input to doubles?"

Not in my opinion.

Paul on 15 Jan 2024

it is easy to imagine that discontiguous data might not be partitioned in chunks.

I'd like to pull this thread further, but using sum because I think that's where the problem lies

B = single(99)*ones(2, 512*512*128,'single');

The "correct" answer for the sum of each row is

format long
dsum = 99*512*512*128
dsum = 
     3.321888768000000e+09

Summing both rows of discontiguous data

Bsum = sum(B,2)
Bsum = 2×1
1.0e+09 *

   2.1474836
   2.1474836

yields the exact same result as a naive, non-chunked loop

ssum = single(0); ninetynine = single(99);
for ii = 1:(512*512*128)
    ssum = ssum + ninetynine;
end
isequal(Bsum(1),ssum)
ans = logical
   1

But summing just one row of B, which is discontiguous, yields the correct result

Bsum = sum(B(1,:));
isequal(Bsum,dsum)
ans = logical
   1

Does this imply that B(1,:) is copied to contiguous memory and then summed?

My mental model of sum(B,2) is

Bsum = zeros(2,1,'single');
for ii = 1:2
    Bsum(ii) = sum(B(ii,:));
end
Bsum
Bsum = 2×1
1.0e+09 *

   3.3218887
   3.3218887

I guess my mental model needs to be adjusted.
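To make the effect of accumulation order concrete, here is a minimal sketch comparing a naive running sum with a chunked one, both in single precision. It is only an illustration of the arithmetic: the chunk size is a hypothetical value picked for the example, and nothing here claims to reproduce how the built-in sum actually partitions its work.

```matlab
n     = 2^25;     % number of ones to add; the exact total is 33554432
chunk = 2^10;     % hypothetical chunk size, chosen only for illustration

% Naive running sum: the accumulator stops growing once it reaches 2^24,
% because 2^24 + 1 is not representable in single and rounds back to 2^24.
naive = single(0);
for k = 1:n
    naive = naive + single(1);
end

% Chunked sum: each partial sum stays small (1024), and the partial sums
% are multiples of 1024, so every intermediate total is exactly representable.
chunked = single(0);
for c = 1:(n/chunk)
    partial = single(0);
    for k = 1:chunk
        partial = partial + single(1);
    end
    chunked = chunked + partial;
end

fprintf('naive = %d, chunked = %d\n', naive, chunked);
```

The naive loop ends at 16777216 while the chunked loop reaches 33554432, so whether a reduction is chunked can change the result by a large factor even though every operation is ordinary single-precision addition.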

Matt J on 15 Jan 2024

Edited: Matt J on 15 Jan 2024

"Does this imply that B(1,:) is copied to contiguous memory and then summed?"

Yes, extracting a subset of a matrix through indexing, e.g., B(i,:) or B(:,j), always creates a copy (except possibly when the subset is just a scalar). And, because in this case B(1,:) is just a vector, it will be contiguous.

Answers (3)

John D'Errico on 12 Jan 2024

You need to understand what you did, AND you need to understand floating point arithmetic.

a=single([99 100 65 101]'+zeros(4,512,512,128));
mean(a,[2 3 4])
ans = 4×1
    64
    64
    64
    64

So what happened there? What is the mean? A mean is just a sum of the list of numbers, divided by the number of elements in that list.

asum = sum(a,[2 3 4])
asum = 4×1
1.0e+09 *

    2.1475
    2.1475
    2.1475
    2.1475

You can see here that the sum is the SAME, for each of those cases. And that cannot be right, unless you understand what happened. When you compute those sums in single precision, at some point, we have a number where if you add a number on the order of 100 to it, nothing changes.

asum + 100 == asum
ans = 4×1 logical array
   1
   1
   1
   1

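Do you see that the accumulation in the sum no longer works? We cannot increase those values any further. It behaves much like an overflow condition, although the cause is the limited precision of single rather than its range: the stalled sums are still far below realmax('single'). And that means we get your result.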

asum/prod([512,512,128])
ans = 4×1
    64
    64
    64
    64
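The number 64 can be traced back with a few lines of single-precision arithmetic. This sketch assumes the stalled sum is exactly 2^31, which matches the displayed 2.1475e+09 and the resulting mean of 64; the variable names are made up for the example.

```matlab
n        = 512*512*128;               % elements averaged per row, equal to 2^25
stall    = single(2^31);              % assumed stalled sum (about 2.1475e+09)
spacing  = eps(stall);                % gap between 2^31 and the next larger single
vals     = single([99 100 65 101]);   % the four row values from the question
absorbed = (stall + vals) == stall;   % each value is below half the gap, so it is lost
m        = stall / n;                 % what mean() then reports

fprintf('spacing = %d, absorbed = [%s], mean = %d\n', ...
        spacing, num2str(absorbed), m);
```

The spacing at 2^31 is 256, so any addend smaller than 128 is rounded away, and 2^31 divided by 2^25 is 64 regardless of which row is being averaged.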

How do you fix this? First, I would ask why you are doing this in single precision. The simple solution is to do as @Matt J has suggested. Either compute the mean in multiple steps, or convert to double for the computation.

amean1 = mean(mean(mean(a,2),3),4)
amean1 = 4×1
    99
   100
    65
   101
amean2 = single(mean(double(a),[2 3 4]))
amean2 = 4×1
    99
   100
    65
   101
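A rough way to see why the stepwise version is safe for this particular data is to bound the partial sums at each stage. The sketch assumes each stage is a plain sum divided by the element count, as in the mean.m excerpt quoted earlier in the thread, and it uses the fact that the largest element of a is 101; the variable names are illustrative.

```matlab
vmax  = 101;                             % largest element in a
limit = double(flintmax('single'));      % 2^24, above which singles skip integers

stage2  = vmax * 512;                    % largest sum when averaging over dimension 2
stage3  = vmax * 512;                    % dimension 3: inputs are means, still <= 101
stage4  = vmax * 128;                    % dimension 4
oneShot = vmax * 512 * 512 * 128;        % largest sum when reducing [2 3 4] at once

fprintf('stage sums: %d %d %d (limit %d), one-shot sum: %d\n', ...
        stage2, stage3, stage4, limit, oneShot);
```

Every stage stays well under 2^24, while the one-shot sum is far above it, which is where single precision starts dropping small addends.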

7 Comments

Herve Hugonnet on 15 Jan 2024

@Paul I am sorry, but I did not report this to Tech Support; I only uploaded it here.

Herve Hugonnet on 15 Jan 2024

@Matt J thanks for the tip regarding 'double'

Matt J on 3 Nov 2020

This is fine as a workaround: mean(mean(mean(a,2),3),4)

Another workaround is to force the mean calculation in double type,

a=single([99 100 65 101]'+zeros(4,512,512,128)); 
mean(a,[2,3,4],'double')
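For readers who want to try the 'double' option without allocating the full array, here is a small stand-in. It only shows how the output type option changes the class of the result; the array is far too small to trigger the precision problem, and the variable names are invented for the example.

```matlab
x = single([99 100 65 101]') + zeros(4, 8, 'single');   % small stand-in for a

mDefault = mean(x, 2);              % single input gives a single result by default
mDouble  = mean(x, 2, 'double');    % compute and return the mean in double

disp(class(mDefault));
disp(class(mDouble));
disp(mDouble');
```

If a single result is needed afterwards, the double mean can be cast back with single(), as in the amean2 line of the other answer.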

Matt J on 16 Jan 2024

Edited: Matt J on 16 Jan 2024

I reported the alternative example below to Tech Support, and their reply was that they do in fact consider it a bug. I'm not sure from the reply whether the bug is the occurrence of precision underflow itself, or just the inconsistency in the result when taking means/sums across different dimensions.

B=99*ones(2, 512*512*128,'single');
mean(B,2) %wrong
ans = 2×1
    64
    64
mean(B,2,'double') %right
ans = 2×1
    99
    99
mean(B',1)' %right?
ans = 2×1
    99
    99