Understanding the structfun() or cellfun() commands

Dear All,
I want to avoid the use of for loops by using the structfun() and cellfun() commands.
I have a folder with a bunch of "nnn_M.csv" files, where the "nnn" prefix is the file number and the "_M" suffix is the same for all files. My goal is to create a double array of the "nnn" values.
The code I currently use with a for loop: (Works)
Files = dir('*.csv'); % Create the structure of file descriptions (name,datenum,...)
N = length(Files); % Determine # of files for the for loop
for ii = 1:N
FileNames{ii} = Files(ii).name; % Create cell array of the file names
nnn(ii) = sscanf(FileNames{ii},'%d_M'); % Create double array of file prefixes
end
This is the code without a for loop: (Does not work)
Files = dir('*.csv'); % Create the structure of file descriptions (name,datenum,...)
FileNames = structfun(@(x) Files(x).name, Files); % Create cell array of the file names
nnn = cellfun(@(x) sscanf(FileNames{x},'%d_M'),FileNames); % Create double array of file prefixes
The 2nd and 3rd lines give me errors, respectively:
Error using structfun
Inputs to STRUCTFUN must be scalar structures.
% and
Index exceeds the number of array elements (14).
Error in @(x)sscanf(FileNames{x},'%dK')
% When using the correct 'FileNames' the for loop gives
% There are 14 *.csv files in the folder
I welcome all suggestions you may have. Thank you for helping me understand these functions!
Cheers,
-Jackson

Accepted Answer

dpb on 24 Apr 2021
As the error message and doc say, structfun applies a function to each field in a scalar struct; you have a struct array, which is not what you want.
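To make the scalar-struct point concrete, here is a minimal sketch. The struct s and the stand-in Files array are made up for illustration (they are not from the original post); Files only mimics the shape of what dir() returns.

```matlab
% Hypothetical scalar struct: structfun visits each FIELD of it
s = struct('a', [1 2 3], 'b', [10 20]);
fieldMeans = structfun(@mean, s);   % column vector, one value per field

% Hypothetical stand-in for dir('*.csv'): a struct ARRAY, one element per file
Files = struct('name', {'001_M.csv', '002_M.csv'});
isScalarStruct = isscalar(Files);   % false, so structfun(..., Files) errors
```

In other words, structfun iterates over fields, not over the elements of a struct array, which is why it rejects the output of dir().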
And for this specific task you need neither of those functions nor a loop construct; use MATLAB vectorized notation:
nums=str2double(extractBefore(string({Files.name}),'_'));
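The one-liner above can be unpacked step by step. This is only a sketch: the Files array below is a hypothetical stand-in for the dir() output, and the intermediate variable names are invented for readability.

```matlab
% Hypothetical stand-in for dir('*.csv')
Files = struct('name', {'001_M.csv', '002_M.csv', '010_M.csv'});

names    = {Files.name};                        % comma-separated list collected into a cell array
prefixes = extractBefore(string(names), '_');   % string array: "001" "002" "010"
nums     = str2double(prefixes);                % double array: 1 2 10
```

Note that string() and extractBefore() require a reasonably recent MATLAB release (they were introduced in R2016b).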
2 Comments
Jackson Kock on 24 Apr 2021
Thank you dpb for the answer, which does work as needed.
A couple follow up comments/ questions if I may:
1) I learned that the {} around Files.name creates the cell array of the file names. Such as,
A = {Files.name};
Is the reason I need the {} that it creates the array? Otherwise the command,
A = Files.name;
only knows to grab the first file's name?
2) I do not understand my initial error with the cellfun() command. Could you please enlighten me in this regard? It seems like the function was doing something but the indexing was incorrect?
3) Could I still use a cellfun() to get the numbers in front of the file name? For example:
A = {Files.name};
B = cellfun(??,A); % I do not know what to put here
I want to understand how to use an arbitrary function inside the cellfun() command to be able to use them in the future. I understand the case of simpler functions, such as:
C = {[1,2],[3,4]};
D = cellfun(@mean,C);
Most appreciated.
dpb on 24 Apr 2021
Edited: dpb on 24 Apr 2021
1) Try at the command line and see what Files.name returns. (Hint: search for "comma-separated list" in doc)
2) Since the first failed, I don't know what the content of the cell array was when you tried it so it's not possible to say just what, but probably you had a smaller array than thought and passed inconsistent ones
3a) Sure, but why? You want to avoid loops; cellfun is a loop in sheep's clothing; underneath it is the loop with more overhead than just the direct for...end construct. It has its place, certainly, but isn't always the better solution. But, for illustration
>> cellfun(@(s)sscanf(s,'%d_M'),{Files.name})
ans =
0 1.00
>>
3b) As the above illustrates, you write an anonymous function in place of the function handle; it can be any one-line expression. Or, if you can't manage it in one line, use a handle to an m-file function you write. NB: one feature of anonymous functions is that they embed any workspace variables not in their argument list in the function body itself, so the same result as above could be obtained by
>> arrayfun(@(i)sscanf(Files(i).name,'%d_M'),1:numel(Files))
ans =
0 1.00
>>
where the Files struct is embedded in the anonymous function. While not optimal here, this can be extremely useful when other parameters are needed to evaluate the function.
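Regarding question 2, one plausible reading of the "Index exceeds the number of array elements" error (an inference from the posted code, not something confirmed in the thread) is that cellfun passes the contents of each cell to the function, not its index. With x being a char vector such as '001_M.csv', the expression FileNames{x} indexes with character codes (48 for '0'), which are far larger than the number of files. The sketch below uses a hypothetical stand-in for the dir() output.

```matlab
% Hypothetical stand-in for dir('*.csv')
Files = struct('name', {'001_M.csv', '002_M.csv', '010_M.csv'});
FileNames = {Files.name};

% cellfun hands the function each cell's CONTENTS (a char vector here)
viaContents = cellfun(@(s) sscanf(s, '%d_M'), FileNames);

% To work with indices instead, iterate over 1:N with arrayfun
viaIndex = arrayfun(@(k) sscanf(FileNames{k}, '%d_M'), 1:numel(FileNames));

% The first character '0' has code 48, so FileNames{'001_M.csv'} asks for
% element 48 (and others), which does not exist
firstCode = double(FileNames{1}(1));

% Reproduce the failing pattern from the question and catch the error
indexingFailed = false;
try
    cellfun(@(x) sscanf(FileNames{x}, '%d_M'), FileNames);
catch
    indexingFailed = true;
end
```

If the function might return something other than a scalar for some cell, adding 'UniformOutput', false to the cellfun call returns a cell array instead of erroring.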


More Answers (1)

Stephen23 on 24 Apr 2021
Avoiding CELLFUN or STRUCTFUN is simpler and much more efficient:
S = dir('*_M.csv');
V = sscanf([S.name],'%f_M.csv')
V = 3×1
    1.2000
    3.4000
    5.6000
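A short sketch of why this works, using a hypothetical struct array in place of the dir() output: [S.name] concatenates all names into one char vector, and sscanf reapplies the format until the input is exhausted, skipping the literal "_M.csv" after each number.

```matlab
% Hypothetical stand-in for dir('*_M.csv')
S = struct('name', {'001_M.csv', '002_M.csv', '010_M.csv'});

joined = [S.name];                % one char vector: '001_M.csv002_M.csv010_M.csv'
V = sscanf(joined, '%f_M.csv');   % format is recycled; returns a column vector
```

This relies on every file name following the same pattern; a name that does not match the format would stop the scan at that point.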


Release

R2020a