How can I get unique entries and their counts and place them back into the table?
When running the code given below I get the error:
[uniqueEntries, ~, entryGroupIndices] = unique(x);
↑
Error: Unsupported use of the '=' operator. To compare values for equality, use '=='. To specify name-value arguments, check that name is a valid identifier with no surrounding quotes.
I think is due to (x) not being defined or non existing.
% Sample data: create a table
data = table({'apple'; 'banana'; 'apple'; 'orange'; 'banana'; 'kiwi'; 'apple'}, ...
{'yes'; 'no'; 'yes'; 'yes'; 'no'; 'yes'; 'yes'}, ...
'VariableNames', {'Fruits', 'Var2'});
% Group the data by 'Fruits' and collect Var2 entries
summaryTable = groupsummary(data, 'Fruits', @(x) {x.Var2}, 'IncludeEmptyGroups', true);
% Create a function to count unique entries and their occurrences
countUniqueEntries = @(x) {
% Get unique entries and their counts
[uniqueEntries, ~, entryGroupIndices] = unique(x);
entryCounts = histcounts(entryGroupIndices, 'BinMethod', 'integers');
% Create a table with unique entries and their counts
table(uniqueEntries, entryCounts', 'VariableNames', {'UniqueEntries', 'Counts'})
};
% Apply the function to each group using cellfun
countTables = cellfun(countUniqueEntries, summaryTable.GroupCount, 'UniformOutput', false);
% Create the final result table
resultTable = table(summaryTable.Fruits, countTables, 'VariableNames', {'Fruits', 'Counts'});
% Display the results
disp('Unique Fruits and Their Counts:');
disp(resultTable);
The output should look something like this:
Fruits Counts
_______ _______
'apple' [3x2 table]
'banana' [2x2 table]
'kiwi' [1x2 table]
'orange' [1x2 table]
I would love to get the results without having to loop.
It would also be helpful If I can sort the counts in the counts Table 'descending'. Thank you for the help.
Answers (2)
Stephen23
on 18 Apr 2025
Edited: Stephen23
on 18 Apr 2025
"I think is due to (x) not being defined or non existing. "
No, it is because you invented some syntax when defining the anonymous function here:
countUniqueEntries = @(x) {
% Get unique entries and their counts
[uniqueEntries, ~, entryGroupIndices] = unique(x);
entryCounts = histcounts(entryGroupIndices, 'BinMethod', 'integers');
% Create a table with unique entries and their counts
table(uniqueEntries, entryCounts', 'VariableNames', {'UniqueEntries', 'Counts'})
};
Curly braces define a cell array. Inside that cell array you called various functions (which is allowed inside curly braces) and attempted to assign their outputs to variables (which is definitely not allowed inside curly braces). It is not valid syntax to perform assignment inside the cell array operator (nor, for that matter, inside any other operators):
{x=sqrt(2)} % this is invalid syntax
Your attempt to use an anonymous function like that will not work. Write a normal function in an M-file; then you can make as many variable assignments as you wish.
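As a rough sketch of what that could look like (not necessarily the best design for your data): the function name countUniqueEntries is reused from the question, the use of findgroups and splitapply to avoid an explicit loop is my own assumption about what you want, and accumarray is swapped in for histcounts. The function is shown here as a local function at the end of a script, which requires R2016b or later; it could equally be saved in its own file countUniqueEntries.m.

```matlab
% Sample data from the question
data = table({'apple'; 'banana'; 'apple'; 'orange'; 'banana'; 'kiwi'; 'apple'}, ...
    {'yes'; 'no'; 'yes'; 'yes'; 'no'; 'yes'; 'yes'}, ...
    'VariableNames', {'Fruits', 'Var2'});

% Group by fruit, then build one small count table per group (no explicit loop)
[G, fruits] = findgroups(data.Fruits);
countTables = splitapply(@(v) {countUniqueEntries(v)}, data.Var2, G);
resultTable = table(fruits, countTables, 'VariableNames', {'Fruits', 'Counts'});
disp(resultTable)

function T = countUniqueEntries(x)
    % x is assumed to be a cell array of character vectors
    [uniqueEntries, ~, idx] = unique(x(:));   % idx maps each entry to its unique value
    counts = accumarray(idx, 1);              % occurrences of each unique value
    T = table(uniqueEntries, counts, 'VariableNames', {'UniqueEntries', 'Counts'});
    T = sortrows(T, 'Counts', 'descend');     % largest count first
end
```

Inside a regular function, the multi-output call to unique and the intermediate assignments are ordinary statements, which is exactly what the anonymous function could not express.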
I doubt that using nested tables like that will make processing your data easier: https://xyproblem.info/
2 Comments
Stephen23
on 18 Apr 2025
T = table({'apple'; 'banana'; 'apple'; 'orange'; 'banana'; 'kiwi'; 'apple'}, ...
{'yes'; 'no'; 'yes'; 'yes'; 'no'; 'yes'; 'yes'}, ...
'VariableNames', {'Fruits', 'Var2'})
U = groupsummary(T,'Fruits')
Walter Roberson
on 19 Apr 2025
To be more explicit:
@(x) { CODE } does not define a code block; it defines a cell array of expressions. The individual expressions must return (possibly empty) values, and must not be assignment statements or control statements.
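For illustration, a minimal sketch of the valid form, where every element of the cell array is an expression that returns a value (the handle name f and the sample input are made up for this example):

```matlab
% Valid: each cell element is an expression, not a statement
f = @(x) {numel(x), unique(x)};

c = f({'yes'; 'no'; 'yes'});
disp(c{1})   % number of entries passed in
disp(c{2})   % the unique entries, sorted

% Invalid (will not parse): an assignment inside the braces
% g = @(x) {u = unique(x)};
```

Note that only the first output of unique is available this way; getting its second or third output requires a statement with a multi-output assignment, which is why a regular function is needed.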
dpb
on 19 Apr 2025
Edited: dpb
on 19 Apr 2025
"...also be helpful If I can sort the counts in the counts Table 'descending'. "
T = table({'apple'; 'banana'; 'apple'; 'orange'; 'banana'; 'kiwi'; 'apple'}, ...
'VariableNames', {'Fruits'});
T=addvars(T,~matches(T.Fruits,'banana'),'NewVariableNames',{'Round'});
T=convertvars(T,{'Fruits'},'categorical');
U=sortrows(groupsummary(T,'Fruits',@(x)all(x),{'Round'}),'GroupCount','descend');
U=renamevars(U,{'fun1_Round'},{'Round'}) % fix up the annoying funN_ prefix, which can't be turned off
% alternative is to munge the variable names directly...
%U.Properties.VariableNames=strrep(U.Properties.VariableNames,'fun1_','');
% general alternative: use a pattern to handle more than one prefix at once
% (replace accepts a pattern; strrep does not)
%pat='fun'+digitsPattern+'_';
%U.Properties.VariableNames=replace(U.Properties.VariableNames,pat,'');
Although the actual logic for determining the logical state is unstated, I took a guess as to why 'banana' is different...
Note well that to bring along other variables in the summary, one has to be able to reduce each of them to one statistic per group, which all does above for the characteristic variable. As noted, it would be nice if groupsummary also had the option to set 'OutputVariableNames', as rowfun does.
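For comparison, a sketch of the same summary done with rowfun, which does let you name the output directly. The 'Round' logic is the same guess as above, and strcmp is used here in place of matches only so the snippet also runs on older releases:

```matlab
% Same sample data; 'Round' is a guessed logical flag, false only for 'banana'
T = table({'apple'; 'banana'; 'apple'; 'orange'; 'banana'; 'kiwi'; 'apple'}, ...
    'VariableNames', {'Fruits'});
T.Round = ~strcmp(T.Fruits, 'banana');

% With grouping variables, rowfun passes each group's rows to the function
% as a column vector and adds a GroupCount variable to the result
U = rowfun(@(r) all(r), T, ...
    'InputVariables', 'Round', ...
    'GroupingVariables', 'Fruits', ...
    'OutputVariableNames', 'Round');

% Sort so the most frequent fruit comes first
U = sortrows(U, 'GroupCount', 'descend')
```

The result carries Fruits, GroupCount and Round with no funN_ prefix to clean up afterwards.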