Finding duplicate strings in a cell array and their index
I have a cell array with more than 100,000 elements that I need to convert to a structure array with four fields. Right now, I have something like:
% cell array = nameData
n = 1;
for j = 2:102
    for i = 2:length(nameData)
        S(n).name = nameData{i,j};
        S(n).frequency = 1;
        n = n+1;
    end
end
However, I need to find duplicate strings in this array and collect information about them. Basically, I am building a database of strings, and if I come across a duplicate, I want to increase the frequency of that string rather than add it to the structure again.
I had been using loops within the previous two loops to achieve this:
for k = 1:n
    if strcmpi(S(k).name, nameData{i,j})
        S(k).frequency = S(k).frequency + 1;
    end
end
However, I always just end up with all 100,000 structure elements. Every other solution I have gotten to work was far too slow, and this conversion from cell array to structure array must finish in less than 20 seconds.
Thanks!
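One loop-free way to approach the counting described above is to group the strings with unique and tally each group with accumarray. The following is only a minimal sketch, not the accepted answer from this thread: the small nameData below is hypothetical sample data, and the assumption that row 1 and column 1 hold headers is inferred from the loops starting at index 2.

```matlab
% Hypothetical sample data standing in for the real nameData.
% Assumption: row 1 and column 1 are headers, since the loops start at 2.
nameData = {'hdr', 'h1',    'h2';
            'r1',  'Alice', 'bob';
            'r2',  'BOB',   'alice';
            'r3',  'Carol', 'Bob'};

names = nameData(2:end, 2:end);   % drop the header row and column
names = names(:);                 % flatten column by column, like the loops

% Group case-insensitively (as strcmpi would) by comparing lowercased copies.
% firstIdx: position of the first occurrence of each distinct string.
% groupIdx: for every element of names, the group it belongs to.
[~, firstIdx, groupIdx] = unique(lower(names), 'stable');

% Count how many elements fall into each group.
counts = accumarray(groupIdx(:), 1);

% Build the struct array in one call; cell inputs yield one element each.
S = struct('name', names(firstIdx), 'frequency', num2cell(counts));
```

Here S has one element per distinct string, with name taken from the first occurrence. If the positions of the duplicates are also needed, find(groupIdx == k) returns the linear indices (into the flattened names) of every occurrence of the k-th distinct string.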
2 Comments
Paul Wintz
on 10 Sep 2021
The use of i and j as index variables is so ubiquitous in programming that I would say, instead, that you should avoid using i and j as the imaginary unit and use 1i or 1j, which cannot be overwritten.
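A short illustration of that point, as a sketch with arbitrary values: once i is assigned, it behaves as an ordinary variable, while a numeric literal with an i suffix is still parsed as an imaginary number.

```matlab
i = 5;             % i is now an ordinary variable, e.g. a loop index
z1 = 3 + 2*i;      % uses the variable: z1 is the real number 13
z2 = 3 + 2i;       % the literal 2i is imaginary regardless of the variable i
clear i            % removes the variable so i refers to the built-in again
```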
Accepted Answer