C = union( A,B ) is too slow. Is there any faster way given that A and B are ordered.
    9 visualizzazioni (ultimi 30 giorni)
  
       Mostra commenti meno recenti
    
    Cem Gormezano
 il 9 Ago 2020
  
    
    
    
    
    Commentato: John D'Errico
      
      
 il 9 Ago 2020
            I have two sorted arrays A and B. I want to find the elements that are both in A and B. I am using union function, yet this seems to be quite slow. Is there a faster way of doing this ?
2 Commenti
  Bruno Luong
      
      
 il 9 Ago 2020
				
      Modificato: Bruno Luong
      
      
 il 9 Ago 2020
  
			Well
C = [A(:); B(:)]
contains "the elements that are both in A and B".
However it's not sorted or uniquely represented. But you doesn't seem to mention those characteristics are required. 
Risposta accettata
  Bruno Luong
      
      
 il 9 Ago 2020
        
      Modificato: Bruno Luong
      
      
 il 9 Ago 2020
  
      If you have a decend C-compiler you might use my MEX MERGE SORTED ARRAY
c = mergesa(a,b); % or mergemex(a,b);
c = c([true; diff(c1)>0]);
According to my benchmark, it runs  2.5 time faster than UNION.
0 Commenti
Più risposte (3)
  Sulaymon Eshkabilov
      
 il 9 Ago 2020
        Hi,
This one could be faster:
ismember(A, B)
1 Commento
  John D'Errico
      
      
 il 9 Ago 2020
				Except ismember does not do the same thing as union. You might check to see what ismember does return.
  Walter Roberson
      
      
 il 9 Ago 2020
        The faster way would involve writing some mex code. There is an undocumented internal binary search routine, but calling it repeatedly would have too much overhead. The algorithm is easy enough to write in MATLAB but the performance would not be better than the current algorithm, which calls upon compiled routines.
0 Commenti
  John D'Errico
      
      
 il 9 Ago 2020
        
      Modificato: John D'Errico
      
      
 il 9 Ago 2020
  
      You could use unique. It would produce the same result, except that it will not be faster.
A = sort(randi(1e7,1,1e6));
B = sort(randi(1e7,1,1e6));
timeit(@() union(A,B))
ans =
            0.097985031132
timeit(@() unique([A,B]))
ans =
            0.097896961132
In fact, both codes would be faster if the arrays were not sorted.
A = randi(1e7,1,1e6);
B = randi(1e7,1,1e6);
timeit(@() unique([A,B]))
ans =
            0.044566429132
timeit(@() union(A,B))
ans =
            0.044880346132
Of course, randomizing the data will not help, as then the cost of randomization will enter into the problem.
If you truly need more speed than that, then you need to write compiled code. That is not to say you should compile MATLAB code. Since these tools will already have been compiled for speed, compiling them will not help. 
However, IF you could write EFFICIENT code that is based on the assumption that the arrays are already pre-sorted, then you could gain speed.
0 Commenti
Vedere anche
Categorie
				Scopri di più su Shifting and Sorting Matrices in Help Center e File Exchange
			
	Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!




