How to find an exact string match in a list of folder names
Richard Rees
on 15 Mar 2020
Commented: Image Analyst
on 16 Mar 2020
Hello everyone,
I have a problem extracting data from a sequence of files based on exact string names contained within the subfolder names. The data should be split between Y_NODE and XY_NODE, but contains() cannot differentiate between 'Y_High' and 'XY_High', so all the data is extracted into the Y_High variable. I have tried contains, matches, strcmp, strfind, etc., but I cannot get it to match correctly and assign the data to the correct cell array.
I cannot attach the raw data because it is too large, but the list of folder names is attached.
Could someone help, please?
pattern = ["No_High1_add_on","X_High1_add_on","Y_High1_add_on","XY_High1_add_on"];
for k = 1:numberOfFolders
    % Get this folder.
    thisFolder = listOfFolderNames{k};
    if contains(thisFolder,pattern(1))
        J = 1;
    elseif contains(thisFolder,pattern(2))
        J = 2;
    elseif contains(thisFolder,pattern(3))
        J = 3;
    elseif contains(thisFolder,pattern(4))
        J = 4;
    else
        continue
    end
    filePattern = sprintf('%s/*node.csv', thisFolder);
    baseFileNames = dir(filePattern);
    numberOfImageFiles = length(baseFileNames);
    if numberOfImageFiles >= 1
        % Go through all those files.
        for f = 1 : numberOfImageFiles
            fullFileName = fullfile(thisFolder, baseFileNames(f).name);
            if J == 1
                NO_NODE{k} = importdata(fullFileName);
            elseif J == 2
                X_NODE{k} = importdata(fullFileName);
            elseif J == 3
                Y_NODE{k} = importdata(fullFileName);
            elseif J == 4
                XY_NODE{k} = importdata(fullFileName);
            end
        end
    else
        % Report only the folders that contain no matching files.
        fprintf(' Folder %s has no files in it.\n', thisFolder);
    end
end
Accepted Answer
Guillaume
on 16 Mar 2020
Edited: Guillaume
on 16 Mar 2020
It's simple to solve: rather than testing first for 'X', then 'Y', then 'XY', test for 'XY' first and only then for 'X' and 'Y', in either order. If the first test passes, then the folder is guaranteed to be 'XY'.
Note that a chain of if...elseif... branches that all do the same thing is usually a bad design, because it is not easy to extend to many more patterns. If you had 30 different patterns, would you write 30 different tests? A loop would make the code much simpler:
pattern = ["No_High1_add_on", "XY_High1_add_on", "X_High1_add_on", "Y_High1_add_on"]; %XY pattern MUST precede X and Y pattern since it is a superset
for k = 1:numel(listOfFolderNames)
    % Get this folder.
    thisFolder = listOfFolderNames{k};
    matchedpattern = 0; %0 means no pattern matched yet
    for patternindex = 1:numel(pattern)
        if contains(thisFolder, pattern(patternindex))
            matchedpattern = patternindex;
            break %stop at the first pattern that matches
        end
    end
    if matchedpattern == 0, continue, end %no match found
    %...
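To see the ordering at work, here is a minimal, self-contained sketch of the same loop. The folder names are made up for illustration (the attached list is not reproduced here), and matched is a hypothetical variable that records the index of the first matching pattern for each folder.

```matlab
% Hypothetical folder names, for illustration only.
listOfFolderNames = {'data/No_High1_add_on', 'data/X_High1_add_on', ...
                     'data/Y_High1_add_on',  'data/XY_High1_add_on'};

% XY is listed before Y because 'Y_High1_add_on' is a substring
% of 'XY_High1_add_on'.
pattern = ["No_High1_add_on", "XY_High1_add_on", "X_High1_add_on", "Y_High1_add_on"];

matched = zeros(1, numel(listOfFolderNames)); % 0 means "no pattern matched"
for k = 1:numel(listOfFolderNames)
    for patternindex = 1:numel(pattern)
        if contains(listOfFolderNames{k}, pattern(patternindex))
            matched(k) = patternindex;
            break % keep the first match only
        end
    end
end
disp(matched)
```

With these made-up names, matched comes out as [1 3 4 2]: the XY folder is assigned to pattern 2 and never reaches the Y test.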
Similarly, later on I would not use differently named variables to store the data. That design is very likely to end up forcing you to copy a bunch of code each time you want to process each variable, when again a loop would avoid the repetition. I would store the imported files in a cell array of cell arrays:
pattern = ["No_High1_add_on", "XY_High1_add_on", "X_High1_add_on", "Y_High1_add_on"]; %XY pattern MUST precede X and Y pattern since it is a superset
patterndata = cell(size(pattern)); %cell array to store the imported files for each pattern
for k = 1:numel(listOfFolderNames)
    %...
    for f = 1 : numberOfImageFiles
        fullFileName = fullfile(thisFolder, baseFileNames(f).name);
        patterndata{matchedpattern}{end+1} = importdata(fullFileName); %#ok<AGROW> Number of files in each category is unknown so have no choice but to grow the array
    end
end
Note that, unlike your original code, the above does not leave empty cells in each cell array. (For a given k, your original code only filled one of the NO_NODE, X_NODE, etc. cell arrays, leaving the others with an empty cell at index k.)
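As a rough illustration of why the nested cell array helps, the sketch below fills patterndata with stand-in matrices instead of real importdata results (the values and the counts variable are hypothetical) and then processes every category with a single loop.

```matlab
pattern = ["No_High1_add_on", "XY_High1_add_on", "X_High1_add_on", "Y_High1_add_on"];
patterndata = cell(size(pattern));

% Stand-ins for importdata(fullFileName); hypothetical values.
patterndata{2}{end+1} = [1 2 3];
patterndata{2}{end+1} = [4 5 6];
patterndata{4}{end+1} = [7 8 9];

% One loop handles every category, however many patterns there are.
counts = zeros(size(pattern));
for p = 1:numel(pattern)
    counts(p) = numel(patterndata{p});
    fprintf('%s: %d file(s)\n', pattern(p), counts(p));
end
```

Categories that never matched simply stay empty, so there are no placeholder cells to skip over later.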
More Answers (1)
Image Analyst
on 15 Mar 2020
strcmp() should work. I'd like to see code where it doesn't. contains() won't work - it will operate as you said since 'Y_High' is contained inside 'XY_High'. But I really think strcmp() should.
At first I thought maybe it's because you're comparing strings to character arrays. Your pattern is a string array, not a cell array of character arrays like listOfFolderNames probably is. Strings and character arrays are now different types of variables in MATLAB, as of a few versions ago. But when I did a test, it shows this is not the case and they still match despite being of different variable types:
s1 = "abc" % A string
s2 = 'abc' % A character vector
e1 = isequal(s1, s2)
e2 = strcmp(s1, s2)
e3 = contains(s1, s2)
e1, e2, and e3 all show as true.
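One way strcmp() might be applied here is to compare only the last part of the path, not the whole folder string. This is a sketch under an assumption that cannot be checked from the post: that the final folder name is exactly one of the patterns. The path below is hypothetical, and fileparts is used to strip the parent folders.

```matlab
pattern = ["No_High1_add_on", "X_High1_add_on", "Y_High1_add_on", "XY_High1_add_on"];
thisFolder = 'data/XY_High1_add_on'; % hypothetical path

% Keep only the last folder name, dropping the parent path.
[~, folderName] = fileparts(thisFolder);

% strcmp compares whole strings, so 'Y_High1_add_on' does not match here.
isMatch = strcmp(folderName, pattern);
J = find(isMatch, 1); % index of the matching pattern, empty if none
```

If the folder names carry extra text before or after the pattern, an exact comparison like this would find nothing, and J would be empty.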
2 Comments
Image Analyst
on 16 Mar 2020
Does this work:
locations = strfind(A, Pattern)
It tells you what index Pattern starts at in A.
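For instance, the start index can be used to check what comes just before the match. The sketch below uses hypothetical paths and assumes '/' separates folders; startsName is an illustrative variable, not something from the original code.

```matlab
Pattern = 'Y_High1_add_on';
folders = {'data/Y_High1_add_on', 'data/XY_High1_add_on'}; % hypothetical paths

startsName = false(size(folders));
for k = 1:numel(folders)
    A = folders{k};
    loc = strfind(A, Pattern); % start index of each occurrence
    % Accept the match only if it begins the string or follows a separator.
    startsName(k) = ~isempty(loc) && (loc(1) == 1 || A(loc(1)-1) == '/');
end
disp(startsName)
```

Here strfind finds the pattern in both folders, but in the XY folder the character before the match is 'X', so only the first folder is accepted.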