Comparison between elements of matrix of different data type

Question

Stewart Tan on 30 Aug 2019


https://it.mathworks.com/matlabcentral/answers/478194-comparison-between-elements-of-matrix-of-different-data-type

Commented: Guillaume on 4 Sep 2019

So I recently wrote a few lines of code to compare adjacent pairs in a matrix whose values are integers:

test_mat = [99 100 54 32 14; 89 4 41 2 3; 87 64 32 19 20];

The matrix I currently have is 200,000x5. When I pass the matrix for comparison, it takes roughly 2 minutes to complete. However, I have another matrix that contains values such as:

test_mat2 = [0.0482 0.0050 0.0516 0.0063 0.0058; 0.0847 0.0008 0.0071 0.0086 0.0502];

The one that I'm actually using is also a 200,000x5 matrix containing data like test_mat2 above. I notice that the comparison takes much longer than with the first matrix of integers. Is there a reason for this? Is comparison more expensive for numbers with decimals?

3 Comments

Guillaume on 4 Sep 2019

Edited: Guillaume on 4 Sep 2019

When you say that the first matrix is integer, what is its class? In your example it's still a matrix of class double, so floating-point values which just happen to be integers. There should be no difference in speed between your two example matrices unless your comparison algorithm does something very strange.
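As a minimal sketch of the distinction made above, the class of a matrix can be checked separately from whether its values happen to be whole numbers. The variable names on the left-hand side are illustrative only.

```matlab
% The literal from the question is stored as double, even though
% every value is a whole number.
test_mat = [99 100 54 32 14; 89 4 41 2 3; 87 64 32 19 20];

cls_original   = class(test_mat);                         % storage class of the literal
is_int_class   = isinteger(test_mat);                     % true only for int8..uint64 classes
has_int_values = all(test_mat(:) == floor(test_mat(:)));  % whole-number values?

% Explicit conversion is what actually changes the storage class.
test_mat_u8 = uint8(test_mat);
cls_u8      = class(test_mat_u8);
```

Here `isinteger` tests the class, not the values, so it returns false for the original matrix and true for the converted one.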

If your first test matrix is actually of an integer class, eg.:

test_mat = uint8([99 100 54 32 14; 89 4 41 2 3; 87 64 32 19 20])

then yes there could be a difference in speed as per Nikhil's answer due to difference in memory footprint. However, I would be surprised if that was noticeable.
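To put a number on the memory footprint mentioned above, the sketch below compares the storage of a 200,000x5 matrix held as uint8 and as double. The zero-filled data is an assumption made for illustration; only the size and class matter here.

```matlab
% Same shape as the matrix in the question, filled with zeros.
mint    = zeros(2e5, 5, 'uint8');   % 1 byte per element
mdouble = double(mint);             % 8 bytes per element

% whos returns a struct whose 'bytes' field holds the data size.
info_int    = whos('mint');
info_double = whos('mdouble');

bytes_int    = info_int.bytes;      % 1,000,000 bytes
bytes_double = info_double.bytes;   % 8,000,000 bytes
ratio        = bytes_double / bytes_int;
```

The double copy is eight times larger, which is the footprint difference that could affect speed once the data no longer fits in cache.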

In any case, 2 minutes sounds like a long time, so maybe there is something odd going on with your comparison algorithm. Can you share it?
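The original comparison code was not shared, so the following is only a sketch under an assumption: that "adjacent pairs" means each element compared with its right-hand neighbour in the same row, and that the test is "greater than". A vectorized form like this avoids an explicit loop over 200,000 rows.

```matlab
test_mat = [99 100 54 32 14; 89 4 41 2 3; 87 64 32 19 20];

% Assumed meaning of "adjacent pairs": columns k and k+1 of every row.
left  = test_mat(:, 1:end-1);
right = test_mat(:, 2:end);

% One logical result per pair; works identically for double or integer classes.
left_is_larger = left > right;   % 3x4 logical matrix
```

If the real algorithm loops element by element or grows an array inside the loop, that alone could plausibly explain a run time of minutes rather than milliseconds.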

edit: actually the difference is somewhat noticeable, but it still shouldn't take 2 minutes to compare pairs of numbers:

>> mint = randi([0 255], 2e5, 5, 'uint8');  %create a 200,000 x 5 matrix of integers (uint8)
>> mdouble = double(mint);    %store the same integers in a matrix of class double
>> mdouble2 = mdouble + rand(2e5, 5);  %add a fractional part to show that it doesn't matter if the numbers are integer in a double array
>> timeit(@() mint(1:2:end) == mint(2:2:end))   %compare pair of integers stored as integer
ans =
    0.0021422
    
>> timeit(@() mdouble(1:2:end) == mdouble(2:2:end))  %compare pairs of integers stored as double
ans =
    0.0038102
    
>> timeit(@() mdouble2(1:2:end) == mdouble2(2:2:end))  %compare pairs of non-integers
ans =
    0.0037652

As you can see, whether a double array contains integers or not doesn't matter. However, comparison for integer classes is faster (fewer bytes to compare).

Jan on 4 Sep 2019

Edited: Jan on 4 Sep 2019

@Guillaume: Your tests do not only compare the timing for the comparison, but also for the creation of the vectors. mdouble(1:2:end) needs more time than mint(1:2:end), because it has to allocate and write more bytes.

mint     = randi([0 255], 2e5, 5, 'uint8');
mdouble  = double(mint);
mdouble2 = mdouble + rand(2e5, 5);
timeit(@() mint == mint)          % 0.000305
timeit(@() mdouble == mdouble)    % 0.00031
timeit(@() mdouble2 == mdouble2)  % 0.00031

In theory the UINT8 comparison is cheaper, because for double the comparison NaN==NaN must be treated as an exception. It looks like this is already implemented in the CPU, so that both need the same time.

I'd expect a difference in the timings due to memory bandwidth if the data do not fit into the processor cache. I've tested this in MATLAB Online only, so please repeat the test on a real machine.
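One way to separate the two costs discussed in this thread is to extract the odd and even elements first and then time the comparison and the indexing independently. This is a sketch rather than a definitive benchmark; the variable names are illustrative, and the timings depend on the machine and on cache size, so no expected values are given.

```matlab
mint    = randi([0 255], 2e5, 5, 'uint8');
mdouble = double(mint);

% Extract the operands once, outside the timed function.
int_odd = mint(1:2:end);     int_even = mint(2:2:end);
dbl_odd = mdouble(1:2:end);  dbl_even = mdouble(2:2:end);

% Cost of the comparison alone.
t_cmp_int = timeit(@() int_odd == int_even);
t_cmp_dbl = timeit(@() dbl_odd == dbl_even);

% Cost of the strided indexing alone (allocation plus copy).
t_idx_int = timeit(@() mint(1:2:end));
t_idx_dbl = timeit(@() mdouble(1:2:end));
```

Comparing `t_idx_*` with `t_cmp_*` shows how much of the earlier measurement came from creating the temporary vectors rather than from the == operator itself.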

Guillaume on 4 Sep 2019

@Jan, indeed. However, there doesn't appear to be much difference in timing for allocating uint8 or double:

>> timeit(@() randi([0 255], 2e5, 5, 'uint8'))
ans =
     0.012459
>> timeit(@() randi([0 255], 2e5, 5, 'double'))
ans =
     0.01323


Answer 1

Nikhil Sonavane on 4 Sep 2019


https://it.mathworks.com/matlabcentral/answers/478194-comparison-between-elements-of-matrix-of-different-data-type#answer_390346

The way floating-point numbers are stored in memory is very different from that of integers. Hence, the algorithm used for comparing floating-point numbers is also different from that used for integers. I would suggest you go through the Floating-Point Representation documentation to understand this better. Also, floating-point numbers take more memory than integers. For more information, please refer to the documentation on integers and floating-point numbers.

2 Comments

Jan on 4 Sep 2019

For the == operator the floating-point representation matters only for NaNs, because NaN==NaN must return false even if the bit representation is equal. For everything but NaN, comparing a double or a vector of 8 UINT8 is equivalent.
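The NaN behaviour described here can be checked directly. The sketch below uses `typecast` to view the raw bits of a double and `isequaln` as the NaN-tolerant comparison. It also shows signed zero, a second IEEE 754 case where == and a bitwise comparison disagree.

```matlab
a = NaN;
b = a;   % exact copy, so the bit patterns are identical

same_bits   = isequal(typecast(a, 'uint64'), typecast(b, 'uint64'));  % bits match
equal_op    = (a == b);        % false: NaN never compares equal with ==
equal_nan   = isequaln(a, b);  % true: isequaln treats NaNs as equal

% Signed zero: the values compare equal, yet the sign bit differs.
zero_equal     = (0 == -0);
zero_same_bits = isequal(typecast(0, 'uint64'), typecast(-0, 'uint64'));
```

For ordinary finite values other than zero, equal doubles do have equal bit patterns, which is why the byte-wise picture is a fair approximation of the cost.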

Guillaume on 4 Sep 2019

And of course, if the original vector is a 64-bit integer type, then there's the same number of bytes to compare. I would still expect double comparison to be marginally slower due to the need to test for NaN. Plus, if I recall correctly, modern processors have different pipelines for FP and integer.
