What is the best way to count the occurrences of two table columns together?
7 visualizzazioni (ultimi 30 giorni)
Mostra commenti meno recenti
Monika Jaskolka
il 17 Mar 2021
Commentato: Monika Jaskolka
il 17 Mar 2021
I have a table that looks like this:
>> A = {'A'; 'B'; 'C'; 'A'; 'C'; 'C'; 'B'};
>> B = {'x'; 'y'; 'z'; 'xx'; 'z'; 'z'; 'y'};
>> T = table(A,B)
T =
A B
___ ____
'A' 'x'
'B' 'y'
'C' 'z'
'A' 'xx'
'C' 'z'
'C' 'z'
'B' 'y'
What is the best way of getting a result like the one blelow? I essentially want to count the occurrances of column A and B., and have as output a table.
ans =
A B occ
____ ____ ____
'A' 'x' 1
'A' 'xx' 1
'B' 'y' 2
'C' 'z' 3
The immediate solution that comes to mind is to combine the table columns together, and then use histc, but is there a better approach that will leave me with a Table where A and B aren't combined? Or is usng a strsplit on T2.AB the way to go?
>> c = strcat(T.A, ',', T.B)
c =
7×1 cell array
'A,x'
'B,y'
'C,z'
'A,xx'
'C,z'
'C,z'
'B,y'
>> [AB, ~, J] = unique(c);
>> occ = histc(J, 1:numel(AB));
>> T2 = table(AB, occ)
T2 =
AB occ
______ ___
'A,x' 1
'A,xx' 1
'B,y' 2
'C,z' 3
0 Commenti
Risposta accettata
Star Strider
il 17 Mar 2021
Try this:
A = {'A'; 'B'; 'C'; 'A'; 'C'; 'C'; 'B'};
B = {'x'; 'y'; 'z'; 'xx'; 'z'; 'z'; 'y'};
T = table(A,B);
[Tu,~,ix] = unique(T, 'rows', 'stable');
Tally = accumarray(ix, 1);
TallyTable = [Tu, table(Tally)]
producing:
TallyTable =
4×3 table
A B Tally
_____ ______ _____
{'A'} {'x' } 1
{'B'} {'y' } 2
{'C'} {'z' } 3
{'A'} {'xx'} 1
0 Commenti
Più risposte (2)
Steve Eddins
il 17 Mar 2021
In my previous answer, I forgot that grpstats is in the Statistics and Machine Learning Toolbox. Here is an answer using groupsummary, which is in MATLAB.
>> groupsummary(T,["A" "B"])
ans =
4×3 table
A B GroupCount
___ ____ __________
"A" "x" 1
"A" "xx" 1
"B" "y" 2
"C" "z" 3
Steve Eddins
il 17 Mar 2021
I think grpstats will do what you want:
>> grpstats(T,["A" "B"])
ans =
4×3 table
A B GroupCount
_____ ______ __________
A_x {'A'} {'x' } 1
A_xx {'A'} {'xx'} 1
B_y {'B'} {'y' } 2
C_z {'C'} {'z' } 3
Also, tabular text analysis will generally be more efficient and easier to perform if you work with strings instead of cell arrays of char vectors:
>> A = ["A" ; "B" ; "C" ; "A" ; "C" ; "C" ; "B" ]
A =
7×1 string array
"A"
"B"
"C"
"A"
"C"
"C"
"B"
>> B = ["x" ; "y" ; "z" ; "xx" ; "z" ; "z" ; "y" ]
B =
7×1 string array
"x"
"y"
"z"
"xx"
"z"
"z"
"y"
>> T = table(A,B)
T =
7×2 table
A B
___ ____
"A" "x"
"B" "y"
"C" "z"
"A" "xx"
"C" "z"
"C" "z"
"B" "y"
>> grpstats(T,["A" "B"])
ans =
4×3 table
A B GroupCount
___ ____ __________
A_x "A" "x" 1
A_xx "A" "xx" 1
B_y "B" "y" 2
C_z "C" "z" 3
1 Commento
Steve Eddins
il 17 Mar 2021
Note that grpstats is in the Statistics and Machine Learning Toolbox. I posted a separate answer using groupsummary, which is a newer function that is in MATLAB.
Vedere anche
Categorie
Scopri di più su Logical in Help Center e File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!