Comparing two cell arrays of strings of different sizes

34 visualizzazioni (ultimi 30 giorni)
Dear all,
I'm trying to compare two cell arrays of strings, a lot like the following:
A = [X_ABCDE;X_BCDEA;X_BCED]
B = [X_A;X_B]
and the resulting array should be:
index = [1;2;2]
as I have a matrix D = [B, C] with C containing the cells I want to end up with.
I wanted to do it using a for-loop and a nested for-loop, but, as A is 8000x1 and D is 2000x2, this will take forever. I tried using strcmp and ismember, but these only work when the cells (or strings) are identical. strfind doesn't work either, because when I use strfind I need to have a for-loop over array A, and strfind only works if the smallest of both is a string.
for iCell = 1:length(A)
index = strfind(A(iCell), cell2mat(D(iCell,1)));
end
A string in B is ALWAYS shorter than in A and a string in B can occur more than once in A. I hope you can help me out :)
  2 Commenti
Stephen23
Stephen23 il 27 Ott 2015
Modificato: Stephen23 il 27 Ott 2015
Your state that: "I have a matrix D = [B, C] with C containing the cells I want to end up with". but if the matrix D is an output (i.e. is defined by what you "end up with"), then how can D be accessed within the loop?: D(iCell,1). How does D get defined if C is an output?
Jasper Admiraal
Jasper Admiraal il 27 Ott 2015
Sorry, I got it wrong. I want to end up with a certain order of the strings from C, indicated by the array index. In my code, a cell in B contains a code which corresponds to a specification in C (in the same row).

Accedi per commentare.

Risposta accettata

Stephen23
Stephen23 il 27 Ott 2015
Modificato: Stephen23 il 27 Ott 2015
Use strncmp (not strcmp) to make the code simpler. Loop over the smaller cell array B, and pass the larger cell array A whole to strncmp:
A = {'X_ABCDE';'X_BCDEA';'X_BCED'}
B = {'X_A';'X_B'}
%
X = zeros(size(A));
for k = 1:numel(B)
X(strncmp(A,B{k},3)) = k;
end
produces:
>> X
X =
1
2
2
Do not use arrayfun or cellfun if you want fast code: they will be slower than a for-loop.

Più risposte (1)

Jos (10584)
Jos (10584) il 27 Ott 2015
assuming B is all unique
A = {'X_ABCDE' ; 'X_BCDEA' ; 'X_BCE'}
B = {'X_A' ; 'X_B'}
index = arrayfun(@(k) find(strncmp(A{k},B,3)), 1:number(A))
  2 Commenti
Stephen23
Stephen23 il 27 Ott 2015
Modificato: Stephen23 il 27 Ott 2015
My solution is much faster than this one (100 iterations)
Elapsed time is 0.0370002 seconds. % my answer
Elapsed time is 1.623 seconds. % this answer
Using arrayfun is slower than using a loop.
Jos (10584)
Jos (10584) il 28 Ott 2015
of course it is. I just wanted to show a different answer ... :-)

Accedi per commentare.

Categorie

Scopri di più su Loops and Conditional Statements in Help Center e File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by