Finding the max number of non-zero repeating elements for each row in a large matrix

Question

Marin Genov il 28 Lug 2022

2
Link

Link diretto a questa domanda

https://it.mathworks.com/matlabcentral/answers/1769980-finding-the-max-number-of-non-zero-repeating-elements-for-each-row-in-a-large-matrix

Commentato: Marin Genov il 29 Lug 2022

Risposta accettata: Matt J

I have a (large) matrix of type 'uint8' or 'uint16', for (a small) example

A = [0 0 2 2 3;

0 1 3 3 3;

0 0 0 0 0;

1 2 2 2 3];

One can make the following assumptions on A:

size(A,2) is relatively small, e.g.
size(A,1) is big, i.e. a few hundred thousands of rows, possibly even a few millions;
the rows of A are sorted in an ascending manner;
the elements of A are bounded above by a constant M that is sufficiently smaller than the maximum the type allows, e.g. for 'uint8' and for 'uint16'.

I would like to compute a vector

b = [2;

3;

0;

3];

assigning to each row of A the max number of repetitions of the non-zero elements contained in that row. Now, the way I do this currently is the following vectorized code running on my GPU:

a = permute(unique(A(A~=0)),[3 2 1]); % determine the unique non-zero values of A;

b = max(sum(uint8(bsxfun(@eq,A,a)),2,'native'),[],3); % compare A with a and count each value, then take max.

But if the matrix A is large and has a lot of variance (hence so does a), there appears to be a lot of unnecessary computation. So I was wondering if there is a faster way to do this since I have to do it for a lot of such large matrices.

0 Commenti
Mostra -2 commenti meno recentiNascondi -2 commenti meno recenti

Accedi per commentare.

Accedi per rispondere a questa domanda.

Answer 1

Matt J il 28 Lug 2022

2
Link

Link diretto a questa risposta

https://it.mathworks.com/matlabcentral/answers/1769980-finding-the-max-number-of-non-zero-repeating-elements-for-each-row-in-a-large-matrix#answer_1017095

Modificato: Matt J il 28 Lug 2022

Apri in MATLAB Online

size(A,2) is relatively small,

That being the case, I'd just use a loop:

A = [0 0 2 2 3;
     0 1 3 3 3;
     0 0 0 0 0;
     1 2 2 2 3];
[m,n]=size(A);
count=zeros(m,1);
b=-inf(m,1);
Alast=A(:,1);
for i=1:size(A,2)
    Ai=A(:,i);
    
   
      increm=(Ai==Alast & Alast~=0);
    count(increm)=count(increm)+1;
      reset=~increm;
    count(reset)=(Ai(reset)~=0);
    b=max(b,count);
    Alast=Ai;
end
b
b = 4×1
     2
     3
     0
     3

2 Commenti
Mostra NessunoNascondi Nessuno

Marin Genov il 29 Lug 2022

Thanks, this is a nice idea! I will make some performance comparisons in the coming days.

Marin Genov il 29 Lug 2022

BTW, a very nice bonus of your method is that I can stack multiple such matrices of the same size into a 3D array and process them simultaneously, thus making even better use of vectorization on my GPU for even faster performance.

Accedi per commentare.

Answer 2

Bruno Luong il 29 Lug 2022

2
Link

Link diretto a questa risposta

https://it.mathworks.com/matlabcentral/answers/1769980-finding-the-max-number-of-non-zero-repeating-elements-for-each-row-in-a-large-matrix#answer_1017300

Modificato: Bruno Luong il 29 Lug 2022

Apri in MATLAB Online

This code would not requires too much RAM beside a copy of A transposed.

It somesort of vectorized runlength encoding. A little bit cryptic code I must admit.

The for-loop posted by Matt is the best IMO

A = [0 0 2 2 3;
     0 1 3 3 3;
     0 0 0 0 0;
     1 2 2 2 3];
[m,n] = size(A);
B = [A.'; Inf(1,m)];
[c,r] = find([ones(1,m); diff(B)]);
Aij = B(c+(r-1)*(n+1));
runlength = diff(c).*(diff(r)==0);
runvalue = Aij(1:end-1);
lgtnz = runlength.*(runvalue>0); % does count the length of sequence of 0s
maxlgt = accumarray(r(1:end-1), lgtnz, [m,1], @max)
maxlgt = 4×1
     2
     3
     0
     3

0 Commenti
Mostra -2 commenti meno recentiNascondi -2 commenti meno recenti

Accedi per commentare.

Answer 3

Matt J il 28 Lug 2022

1
Link

Link diretto a questa risposta

https://it.mathworks.com/matlabcentral/answers/1769980-finding-the-max-number-of-non-zero-repeating-elements-for-each-row-in-a-large-matrix#answer_1017105

Modificato: Matt J il 28 Lug 2022

Apri in MATLAB Online

This should be fine for the uint8 case. For the uint16 case, it might strain RAM.

A = [0 0 2 2 3;
     0 1 3 3 3;
     0 0 0 0 0;
     1 2 2 2 3];
N=size(A,1);
M=max(A(:));
[I,J,S]=find(A);
b=max( accumarray([I,S],1,[N,M]) ,[],2)
b = 4×1
     2
     3
     0
     3

2 Commenti
Mostra NessunoNascondi Nessuno

Marin Genov il 29 Lug 2022

Modificato: Marin Genov il 29 Lug 2022

Apri in MATLAB Online

Thanks! This works nicely for uint8, but something goes awry for uint16 with large first dimension: the size of the output of

accumarray([I,S],1,[N,M])

appears to get truncated to 65535 instead of being equal to N. I suspect that for large uint8 matrices the size will be truncated to 255, but I haven't tested that yet.

Bruno Luong il 29 Lug 2022

Modificato: Bruno Luong il 29 Lug 2022

Apri in MATLAB Online

I think because

[I,S]

is converted to the lower class of I and S, which is uint8/uint16. Thus I is truncated to intmax of the class.

you can fix by replace with

[I,double(S)]

Accedi per commentare.

Finding the max number of non-zero repeating elements for each row in a large matrix

0 Commenti
Mostra -2 commenti meno recentiNascondi -2 commenti meno recenti

Risposta accettata

2 Commenti
Mostra NessunoNascondi Nessuno

Più risposte (2)

0 Commenti
Mostra -2 commenti meno recentiNascondi -2 commenti meno recenti

2 Commenti
Mostra NessunoNascondi Nessuno

Vedere anche

Categorie

Tag

Prodotti

Release

Community Treasure Hunt

Finding the max number of non-zero repeating elements for each row in a large matrix

0 Commenti Mostra -2 commenti meno recentiNascondi -2 commenti meno recenti

Risposta accettata

2 Commenti Mostra NessunoNascondi Nessuno

Più risposte (2)

0 Commenti Mostra -2 commenti meno recentiNascondi -2 commenti meno recenti

2 Commenti Mostra NessunoNascondi Nessuno

Vedere anche

Categorie

Tag

Prodotti

Release

Community Treasure Hunt

0 Commenti
Mostra -2 commenti meno recentiNascondi -2 commenti meno recenti

2 Commenti
Mostra NessunoNascondi Nessuno

0 Commenti
Mostra -2 commenti meno recentiNascondi -2 commenti meno recenti

2 Commenti
Mostra NessunoNascondi Nessuno