find multiple words in a cell

Hi, is there a way to count how many times each word in one cell array occurs in another cell array of a different size?
A={'a','a','a','b','b','d'}
B={'a','b','c','d','e','f','g','h'}
should return C=3,2,0,1,0,0,0,0
Thank you

Accepted Answer

A one-liner:
cellfun(@(x)sum(ismember(A,x)), B)
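
As a quick check, here is the one-liner applied to the data from the question. This is a minimal sketch that assumes exact whole-word matching; the variable name C is taken from the question.

```matlab
A = {'a','a','a','b','b','d'};
B = {'a','b','c','d','e','f','g','h'};

% For each word x in B, ismember(A, x) marks the elements of A equal to x;
% summing that logical mask gives the number of occurrences of x in A.
C = cellfun(@(x) sum(ismember(A, x)), B);
disp(C)   % 3 2 0 1 0 0 0 0
```

The result has one count per element of B, in the order of B, regardless of how long A is.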

1 Comment

Vincent I on 9 Jan 2013
Fantastic, thank you. That's what I was looking for.


More Answers (3)

Matt J on 9 Jan 2013
If the "words" will really always be single letters, you could do it looplessly with
>> histc([A{:}], [B{:}])
ans =
3 2 0 1 0 0 0 0
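
The histc call relies on every word being a single character, because [A{:}] concatenates the cells into one char vector. For whole words of any length, one loop-free alternative (a sketch, not something proposed in this thread) is to let ismember return the position of each element of A within B and then tally those positions with accumarray. The sample words below are made up for illustration.

```matlab
A = {'aa','aa','bb2c','aa','zz'};   % hypothetical multi-character words
B = {'aa','bb2c','xy3c'};

% loc(k) is the index in B of A{k}, or 0 when A{k} does not occur in B.
[tf, loc] = ismember(A, B);

% Tally how often each index of B was hit; the size argument keeps
% zero counts for words of B that never occur in A.
C = accumarray(loc(tf).', 1, [numel(B) 1]).';
disp(C)   % 3 1 0
```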

6 Comments

Vincent I on 9 Jan 2013
OK, I missed something in my initial comment. What if
A={'a_1','ab','a','b','b','d'}
B={'a','b','c','d','e','f','g','h'}
return C=3,2,0,1,0,0,0,0
Matt J on 9 Jan 2013
Post this as a new question. When you do, explain why 'b' gets only a count of 2 even though it occurs 3 times in A. Also explain what would happen if a letter occurs twice in the same string, e.g.
A={'a_1ab','bb'}; B={'a','b'};
Vincent I on 9 Jan 2013
Edited: Vincent I on 9 Jan 2013
Ohh, I see... b should get a count of two...
A={'a_1','a_2','a','b','b','d'}
B={'a','b','c','d','e','f','g','h'}
return C=3,2,0,1,0,0,0,0
And although this is a new question, I feel it is not far from the initial question, so I don't see why it should require a new post.
Thank you
@Vincent So you are looking for non-exact matches? I don't see how your accepted answer works for that.
Matt J on 9 Jan 2013
Edited: Matt J on 9 Jan 2013
"And, although this is a new question, I feel that it is not far away for the initial question which I dont see why should require a new post"
It doesn't require it, but if you post it as a new question, people have the opportunity to gain points from answering you (now that you've already accepted Ryan's answer), and so will have more incentive to do so.
Jan on 10 Jan 2013
Edited: Jan on 10 Jan 2013
The decision is easy:
New question, new thread.
And:
Additional information to an existing question is added by editing the question and marking the changes by "[EDITED]". Then this is clarification and *not* a new question.
Hiding important information in deeply nested comments to already accepted questions is a bad idea.


Daniel Shub on 9 Jan 2013
I am sure that this is overthinking the solution, and I doubt that using regexp is optimal, but I was curious how bad it would be.
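pattern = cell2mat(cellfun(@(x)['(?<', x, '>', x, ')|'], B, 'UniformOutput', false));
tok = regexp(A, pattern, 'names');   % one struct of named tokens per element of A
y = [tok{:}];                        % struct array; this is the y used on the next line
cellfun(@(f)length([y.(f)]), fieldnames(y))'
Is there a better way to do this with regexp?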

1 Comment

Vincent I on 9 Jan 2013
Edited: Vincent I on 9 Jan 2013
Solved my problem by doing the following:
A=TF(:,1).';
A=regexprep(A, ' ','');
A=regexprep(A, '_(\w*)','');
B={'aa','bb2c','bb25c','xy3c','m56c','etc56c'};
C=cellfun(@(x)sum(strcmp(A, x)), B);
Thank you
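
To see what the two regexprep calls do before counting, here is a small sketch with a made-up A standing in for TF(:,1).' (TF is not shown in the thread, so its contents here are purely an assumption).

```matlab
% Hypothetical stand-in for TF(:,1).' from the comment above.
A = {'aa _1', 'aa_2', 'bb2c', 'bb2c_x', 'm56c'};

A = regexprep(A, ' ', '');        % remove spaces
A = regexprep(A, '_(\w*)', '');   % drop an underscore and the word characters after it

B = {'aa','bb2c','bb25c','xy3c','m56c','etc56c'};
C = cellfun(@(x) sum(strcmp(A, x)), B);   % exact matches after cleaning
disp(C)   % 2 2 0 0 1 0
```

Stripping the suffix first turns the non-exact matching problem back into the exact one, so strcmp (or the accepted ismember one-liner) can be used unchanged.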


Jan on 10 Jan 2013
[EDITED, Jan, moved from comments to the accepted question]
ISMEMBER sorts the inputs and performs a binary search. Depending on the data, this can be much faster or much slower than an unsorted comparison such as:
cellfun(@(x)sum(strcmp(A, x)), B)
I claim, without proof, that a loop is faster:
R = zeros(1, numel(B));
for iB = 1:numel(B)
R(iB) = sum(strcmp(A, B{iB}));
end
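
The speed claim is easy to check on your own data. Below is a rough sketch using tic/toc on an assumed workload of 1e5 random words drawn from B. Single tic/toc runs are noisy and the outcome depends on MATLAB version and data, so no timings are quoted here; timeit gives steadier numbers if the candidates are wrapped in function handles.

```matlab
B = {'a','b','c','d','e','f','g','h'};
A = B(randi(numel(B), 1, 1e5));   % assumed workload: 1e5 random words from B

tic; R1 = cellfun(@(x) sum(ismember(A, x)), B); tIsmember = toc;
tic; R2 = cellfun(@(x) sum(strcmp(A, x)), B);   tStrcmp   = toc;

tic;
R3 = zeros(1, numel(B));
for iB = 1:numel(B)
    R3(iB) = sum(strcmp(A, B{iB}));
end
tLoop = toc;

% All three approaches must agree before the timings mean anything.
assert(isequal(R1, R2) && isequal(R2, R3));
fprintf('ismember: %.4g s, strcmp: %.4g s, loop: %.4g s\n', tIsmember, tStrcmp, tLoop);
```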
[EDITED 2] And if you want to compare the leading character(s) only:
...
R(iB) = sum(strncmp(A, B{iB}, length(B{iB})));
...
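
Assuming the elided lines are the same loop as in the first part of this answer, a complete prefix-matching version might look like the sketch below. The data are the ones from the follow-up comment. Note that this counts every element of A that starts with the word, so an entry such as 'ab' would also count toward 'a'.

```matlab
A = {'a_1','a_2','a','b','b','d'};
B = {'a','b','c','d','e','f','g','h'};

R = zeros(1, numel(B));
for iB = 1:numel(B)
    % Compare only the first length(B{iB}) characters of each element of A.
    R(iB) = sum(strncmp(A, B{iB}, length(B{iB})));
end
disp(R)   % 3 2 0 1 0 0 0 0
```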
