Iterate over struct with length>1, with multiple fields

I have a struct array with more than one element and multiple (16) fields.
Each fieldname corresponds to some quantity or property, and the struct stores these properties for 47 different items.
I'm trying to iterate over the whole dataset. Preferably, I'd like to iterate by fieldname and retrieve an array for each fieldname, because within each field the variable type is uniform.
To illustrate:
K>> teststruct = struct('name', ['Alice', 'Bob', 'Eve'], 'age', {24, 45, 35})
teststruct =
1×3 struct array with fields:
name
age
This shows up nicely as a table in the workspace browser.
However, if I iterate by fieldname, this goes wrong (spurious empty lines removed for readability):
K>> fnames = fieldnames(teststruct);
K>> teststruct.(fnames{1})
ans =
'AliceBobEve'
ans =
'AliceBobEve'
ans =
'AliceBobEve'
What I wanted was an array of all the names, instead I get three answers, of compounded names.
If I do the same with the 'age' field, at least each answer contains only one age, but they're still not in any kind of structure I could use in a code which does not know the size or field names of a struct it needs to deal with. In fact, if I assign the above to a variable, this happens:
K>> names = teststruct.(fnames{1})
names =
'AliceBobEve'
Doing the same with the other field only gives me the first age. One correct piece of data at least, but still not at all what I wanted...
I tried applying the code I found here, which promises to print the contents of an entire struct, but the same happens: I only get the first value of everything.
I know that I could loop over the struct indices first, and access them like this:
value = teststruct(i).(fnames{j})
...but then I'd be getting them separately, one by one, instead of getting back the cell array (or any other kind of array) that was used to define the struct in the first place, which is way easier to deal with.
Is that possible somehow?

3 Comments

The names ('Alice', 'Bob', 'Eve') are combined before you ever create the struct, when you do this:
['Alice', 'Bob', 'Eve']
ans = 'AliceBobEve'
"Is that possible somehow?"
Of course. Read this if you want to understand comma-separated lists and how to use them:
Also note that square brackets are a concatenation operator (not a "list" operator as some beginners incorrectly think, MATLAB does not have a "list" type). As Voss correctly pointed out, when you concatenate character vectors together you get one character vector:
['Alice', 'Bob', 'Eve']
ans = 'AliceBobEve'
You probably want a cell array:
{'Alice', 'Bob', 'Eve'}
ans = 1x3 cell array
{'Alice'} {'Bob'} {'Eve'}
For example:
S = struct('name', {'Alice', 'Bob', 'Eve'}, 'age', {24, 45, 35})
S = 1x3 struct array with fields:
name age
C = {S.name} % comma-separated list
C = 1x3 cell array
{'Alice'} {'Bob'} {'Eve'}
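As a rough illustration of the comma-separated list behaviour described above, the following sketch shows what `S.name` expands to and why assigning it to a single variable keeps only the first value. The variable names (`n1`, `namesCell`, `firstOnly`, and so on) are made up for this example.

```matlab
% Sketch: what the comma-separated list S.name expands to.
S = struct('name', {'Alice', 'Bob', 'Eve'}, 'age', {24, 45, 35});

% S.name behaves like typing S(1).name, S(2).name, S(3).name,
% so it can fill several outputs at once ...
[n1, n2, n3] = S.name;

% ... or be captured whole by wrapping it in {} or [].
namesCell = {S.name};   % 1x3 cell array, one name per cell
agesVec   = [S.age];    % 1x3 numeric array

% With a single output variable only the first item of the list is kept.
firstOnly = S.name;
```

Wrapping the list in `{}` is the safe default; `[]` only makes sense when the values can be concatenated meaningfully.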
Yes, I've been mired in Python for over a decade, and it kind of shapes expectations, especially if the same syntax appears to do the same thing in some cases... thanks for the extra explanation. Definitely going to read up on this again.


Accepted Answer

Voss
Voss on 16 Jul 2024
Edited: Voss on 16 Jul 2024
teststruct = struct('name', {'Alice', 'Bob', 'Eve'}, 'age', {24, 45, 35})
teststruct = 1x3 struct array with fields:
name age
names = {teststruct.name}
names = 1x3 cell array
{'Alice'} {'Bob'} {'Eve'}
ages = [teststruct.age]
ages = 1x3
24 45 35
Or, without hard-coding the field name references:
fnames = fieldnames(teststruct);
for ii = 1:numel(fnames)
    values = {teststruct.(fnames{ii})};
    disp(sprintf('field #%d (%s):',ii,fnames{ii}));
    disp(values);
end
field #1 (name):
{'Alice'} {'Bob'} {'Eve'}
field #2 (age):
{[24]} {[45]} {[35]}
In this case each set of field values is stored in a cell array (previously the numeric ages were put into a numeric array) since a cell array can contain any class of variable.

6 Comments

Yes, that explains most of it, and definitely clears up some misconceptions I had, thank you!
...although this creates one issue: I know that the values in one field are of the same type, and if they're numbers, I'd like to concatenate them so I can do maths to them without running endless loops:
ages = [teststruct.age]
agediff = ages - refstruct.age %let's say refstruct is always 1x1
The same method would fail for strings, since it would just merge them into one, which means that I must use cell arrays for them. Cell arrays for numerical values, though, prevent me from running calculations on them.
I somehow expected that there was some way to deal with strings in a struct array in basically the same way as with numbers.
The way I solved it is by using a lot more ifs than I used to:
% `teststruct` is an n x m struct array (may be 1x1 but not empty)
length_s = numel(teststruct);
fnames = fieldnames(teststruct);
results_raw = struct();
for i = 1:numel(fnames)
    fname = fnames{i};
    if length_s == 1
        results_raw.(fname) = somecode(teststruct.(fname));
    elseif isnumeric([teststruct.(fname)])
        % uses [] to concatenate numbers to array and avoid looping over cell arrays of unknown types
        partialresult_raw = somecode([teststruct.(fname)]);
        % result goes into n x m cell array, for consistency with other results
        results_raw.(fname) = num2cell(partialresult_raw);
    else
        % produces cell array, and following code needs to figure out
        % the type of contents and treat them accordingly to produce n x m cell arrays
        results_raw.(fname) = someothercode({teststruct.(fname)});
    end
end
% ... and assemble the output into a structure that actually resembles the original
if length_s == 1
    results = results_raw;
else
    % results_raw is a 1x1 struct containing n x m cell arrays but I want the old size back
    % so we create a cell array of {fieldname , {values}} pairs...
    args_cells = cellfun(@(fname) {fname, results_raw.(fname)},...
        fnames, 'UniformOutput', false);
    % ... flatten that ... (to {name1, {values1}, name2, {values2},... }
    args = [args_cells{:}];
    % and pass it to struct, so we get an n x m struct instead
    results = struct(args{:});
end
The code seems to be working fine so far, though I keep expecting that there should be a more elegant way.
There might be some condition which would cause it to break, but I might be lucky not to have encountered it in my tests (yet... hints welcome).
Aside:
copy/paste directly from the MATLAB editor GUI into the message window here is partially broken. Whatever I try to paste into a block of code, it will land outside of the code block, creating holes in larger code blocks if necessary. Those holes cannot be deleted, so I need to copy/paste the code from one side of the hole into the other end whenever that happens. copy/paste from a plain text editor works fine, though.
Similarly, middle-click insert anywhere into the message box also fails (starts auto-scrolling instead, as it would outside of text input fields).
(I'm running Matlab 2023b on Kubuntu 22.04)
You're welcome!
"I somehow expected that there was some way to deal with strings in a struct array in basically the same way as with numbers."
You can, if you use strings rather than character vectors. That is:
['Alice', 'Bob', 'Eve'] % concatenating character vectors makes a character vector
ans = 'AliceBobEve'
vs
["Alice", "Bob", "Eve"] % concatenating strings makes a string array
ans = 1x3 string array
"Alice" "Bob" "Eve"
So if your struct array contains character vectors
teststruct = struct('name',{'alice','carol';'bob','dennis'},'age',{41,43;42,44})
teststruct = 2x2 struct array with fields:
name age
teststruct(1)
ans = struct with fields:
name: 'alice' age: 41
teststruct(2)
ans = struct with fields:
name: 'bob' age: 42
you can do some pre-processing to convert any character array in it into a string:
fnames = fieldnames(teststruct);
for jj = 1:numel(fnames)
    for ii = 1:numel(teststruct)
        if ischar(teststruct(ii).(fnames{jj}))
            teststruct(ii).(fnames{jj}) = string(teststruct(ii).(fnames{jj}));
        end
    end
end
teststruct(1)
ans = struct with fields:
name: "alice" age: 41
teststruct(2)
ans = struct with fields:
name: "bob" age: 42
And then concatenate the values using the same syntax, regardless of whether they are numerics or strings:
names = [teststruct.name]
names = 1x4 string array
"alice" "bob" "carol" "dennis"
ages = [teststruct.age]
ages = 1x4
41 42 43 44
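If modifying the struct in place is undesirable, a possible alternative (a sketch, not part of the answer above) is to collect the character vectors into a cell array first and convert that cell array with `string`. The variable names `S` and `names` here are only for illustration.

```matlab
% Sketch: build a string array from char-vector fields without changing the struct.
S = struct('name', {'alice','carol';'bob','dennis'}, 'age', {41,43;42,44});

% {S.name} gathers the char vectors in column-major order (1x4 cell);
% string() turns a cell array of char vectors into a string array.
names = string({S.name});
```

This leaves the original struct untouched, at the cost of doing the conversion each time the field is read.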
You can use reshape to get those results to be the same size as the original struct array:
names = reshape([teststruct.name],size(teststruct))
names = 2x2 string array
"alice" "carol"
"bob" "dennis"
ages = reshape([teststruct.age],size(teststruct))
ages = 2x2
41 43
42 44
So a general code might look like:
[n,m] = size(teststruct);
fnames = fieldnames(teststruct);
results = repmat(struct(),[n,m]); % n-by-m struct with no fields
for ii = 1:numel(fnames)
    fname = fnames{ii};
    % use num2cell to make a cell array C containing the values
    C = num2cell(reshape(somecode([teststruct.(fname)]),[n,m]));
    % so that here you can assign the contents of C to each corresponding struct element of results
    [results.(fname)] = C{:};
end
results
results = 2x2 struct array with fields:
name age
results(1)
ans = struct with fields:
name: "alice_processed" age: 82
results(2)
ans = struct with fields:
name: "bob_processed" age: 84
function out = somecode(in)
    if isnumeric(in)
        out = in * 2;
    else % string assumed
        out = in + "_processed";
    end
end
Nice! That's actually a good deal more compact/uniform. Is char the only data type (that I could encounter in a struct field) which I'd need to pre-filter for?
"Is char the only data type (that I could encounter in a struct field) which I'd need to pre-filter for?"
No. Function handles cannot be concatenated; they'd have to be stored in a cell array.
In general, you'd not necessarily be able to concatenate the values of a given field in each element of a structure array. For instance, you could have numeric arrays with incompatible sizes that cannot be concatenated, or tables that cannot be concatenated because the same variable appears in each.
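A small sketch of two of the failure cases just mentioned, assuming hypothetical structs `F` (function handles) and `M` (numeric arrays of incompatible sizes). The `try`/`catch` blocks are only there so the script keeps running after each error.

```matlab
% Sketch: field values that [S.field] cannot concatenate.
F = struct('f', {@sin, @cos});             % function handles
M = struct('x', {ones(2,2), ones(3,1)});   % numeric arrays of incompatible sizes

try
    tmp = [F.f];        % errors: function handles cannot form an array
    handlesOk = true;
catch
    handlesOk = false;
end

try
    tmp = [M.x];        % errors: 2 rows cannot be joined with 3 rows
    sizesOk = true;
catch
    sizesOk = false;
end

% The cell-array form works in both cases.
fCell = {F.f};
xCell = {M.x};
```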
You'd also not necessarily want to concatenate the values of a given field in each element of a structure array, because the result is misleading, e.g.:
S = struct('x',{1, [2 3], []});
S.x
ans = 1
ans = 1x2
2 3
ans = []
[S.x]
ans = 1x3
1 2 3
In that example, S has 3 elements, and [S.x] has 3 elements, but it's not true that each element of [S.x] corresponds to the value of field x in some element of S.
However, if the value of each field is a scalar, then those issues are not a concern, and that would work for all numeric classes, as well as strings, logicals, structures (if they have the same set of fields), and cell arrays.
If you have any non-scalar field values or need to support generic data types, then use the more general cell array approach.
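One way the more general cell array approach could look, as a sketch: each field value stays in its own cell, is processed with `cellfun(..., 'UniformOutput', false)`, and is then assigned back element by element. The doubling operation and the names `vals`, `doubled` and `R` are placeholders for this example.

```matlab
% Sketch: process non-scalar field values one cell at a time.
S = struct('x', {1, [2 3], []});

vals = {S.x};                                            % 1x3 cell, one cell per element of S
doubled = cellfun(@(v) v * 2, vals, 'UniformOutput', false);

% Write the results back so that R has the same size as S.
R = repmat(struct(), size(S));                           % struct array with no fields
[R.x] = doubled{:};
```

Because nothing is concatenated, each element of `R` still corresponds to exactly one element of `S`, including the empty one.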
Thanks! It seems that all such heterogeneous constructs are in cell arrays in the data that I'm dealing with, and it kind of made sense that they would always have to be... so the case you show here would actually also break the code you previously showed. I believe I would have noticed by now if something like this occurred in the data I'm dealing with, though, because isnumeric(S.x) already produces an exception.
Cell arrays should not be an issue because my somecode() routine is already catching them separately and iterating over them, per element.
Anyway, this means that if I want to be quite safe, I should iterate over each element of a struct that I encounter. Good to know... also somewhat unfortunate since Matlab has those nice vectors and matrices ready to use, but I can't make sure that I can treat them as such (unless I know where they come from). I guess I could check whether a non-scalar struct's fields are uniformly shaped, but at that point it would get tedious.
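The uniform-shape check mentioned above need not be very long. The following is a sketch under the assumption that "uniform" means every value of a field has the same size as the first one; `sameShape` is a hypothetical helper defined here, not a built-in function.

```matlab
% Sketch: test whether all values of a field share one size.
S = struct('x', {1, [2 3], []}, 'y', {[1 2], [3 4], [5 6]});

% Hypothetical helper: compares the size of every value of field f
% with the size of the value in the first element.
sameShape = @(T, f) all(cellfun(@(v) isequal(size(v), size(T(1).(f))), {T.(f)}));

xUniform = sameShape(S, 'x');
yUniform = sameShape(S, 'y');

if yUniform
    % Stack the 1x2 rows along dimension 1, one row per element of S.
    yStack = cat(1, S.y);
end
```

A type check such as `all(cellfun(@isnumeric, {S.y}))` would still be needed separately, since equal sizes say nothing about the class of the values.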
=> I think I'm good now, thanks for the help!
You're welcome!
By the way, isnumeric(S.x) will throw an error about too many input arguments if S has more than one element (or an error about not enough input arguments if S is empty), because S.x is a comma-separated list (which is used as a set of input arguments to isnumeric in this case).
A way to check whether all values of a given field in a structure array are numeric is:
all(cellfun(@isnumeric,{S.x}))
which works regardless of what each x is in S, because they are put in a cell array ({S.x}), then cellfun iterates over the cells, running isnumeric on each cell's contents, getting true or false for each and concatenating those logicals into a single vector, which is passed to all.


Release: R2023b

Asked: 16 Jul 2024

Edited: 22 Jul 2024
