Comparing common strings in two large string arrays imported from excel (using xlsread)

3 visualizzazioni (ultimi 30 giorni)
I have two very large string arrays (one of them 2000 rows and the other one is 7000 rows) that I import using xlsread and want to compare with each other to see if they have any common elements. The STRR part is always the same string and all I need to do is to compare the numerical part. The ebtries in rows of each column are not repeated.
As output it will be enough to have the numbers of the rows from left and right columns where a common phrase appears. In this case the output could look like this:
1 4 (orange)
17 1 (yellow)
9 10 (blue)
and so on.
Is it possible to do this without a loop?
  2 Commenti
madhan ravi
madhan ravi il 15 Feb 2019
See if the below satisfies your needs:
a=str2double(regexp(a,'\d*','match','once')); % first column
b=str2double(regexp(b,'\d*','match','once')); % second column
T=table;
T.a=find(ismember(a,b))
T.b=find(ismember(b,a))

Accedi per commentare.

Risposta accettata

dpb
dpb il 15 Feb 2019
[c,ia,ib]=intersect(categorical(s1),categorical(s2));

Più risposte (1)

OCDER
OCDER il 15 Feb 2019
%Generating a demo cell array
C = cell(10000, 2);
for j = 1:numel(C)
C{j, 1} = sprintf('STRR %d', j);
C{j, 2} = sprintf('STRR %d', j-2);
end
%Use intersect to determine location of matching entities between column 1 and 2
tic
[Matched, Col1, Col2] = intersect(C(:, 1), C(:, 2)); %NOTE: this doesn't work if there are duplicate entries in the colum
toc %0.063 sec
%If you plan to do numerical calculations, convert string to number via something like this
tic
Str = strrep(C, 'STRR ', '');%Deletes "STRR " from every cell
Num = cellfun(@(x) sscanf(x, '%f'), Str); %Convert remaining char to double format
[Matched, Col1, Col2] = intersect(Num(:, 1), Num(:, 2)); %NOTE: this doesn't work if there are duplicate entries in the colum
toc %0.359 sec

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by