most frequent word in cell array

Question

1 vote

Hi, I have a cell array "P" of size 2000 by 20. Each cell value is either "Yes" or "No". How can I make a new cell array "vote" of size 2000 by 1 in which each cell contains the most frequent word of the corresponding row of P?


Answer 1

Walter Roberson on 25 Oct 2017

1 vote

tf = ismember(lower(P), 'yes');   % true where the entry is "yes", ignoring case
votes = sum(tf, 2);               % number of "yes" entries in each row

4 Comments (2 older comments not shown)

dpb on 26 Oct 2017

Edited: dpb on 26 Oct 2017

I was just adding the categorical variable at the end, on top of your solution. Your solution gets the total by row using string matching on the cellstr variable, whereas my suggestion used categorical variables throughout (doing the sums via countcats). "Mixing the two" meant taking your counts and then using a categorical to display the name of the winner in English...

I wasn't implying that anything was wrong, just adding the final step, primarily to "show off" categorical to the OP as worth looking at.

Walter Roberson on 26 Oct 2017

Right, but I had overlooked that the question asked about the most common entry -- which can be found by testing the count against width/2
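The width/2 test mentioned above could be sketched as follows. This is only an illustration under stated assumptions: the small `P` is hypothetical sample data standing in for the 2000-by-20 array, the names `nYes` and `isYes` are made up for the example, and a row with exactly half "yes" entries falls through to 'No' because the comparison is strict.

```matlab
% Hypothetical sample data standing in for the 2000-by-20 cell array P
P = {'Yes' 'No'  'Yes' 'yes';
     'No'  'no'  'Yes' 'No'};

tf    = ismember(lower(P), 'yes');   % true where the entry is "yes", ignoring case
nYes  = sum(tf, 2);                  % number of "yes" entries in each row
isYes = nYes > size(P, 2) / 2;       % "yes" wins when it fills more than half the row

vote = repmat({'No'}, size(P, 1), 1);   % start with 'No' in every row
vote(isYes) = {'Yes'};                  % overwrite the rows where "yes" has the majority
```

With 20 columns a 10-10 split is possible, so a real solution would need to decide how such a row should be reported.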


Answer 2

dpb on 25 Oct 2017

Edited: dpb on 25 Oct 2017

2 votes

Good place to use categorical variables instead of the cellstr...

Example:

>> yn={'yes' 'no' 'Yes';'no' 'No', 'NO'};   % minimal dataset including capitalization differences
>> ync=categorical(lower(yn));              % convert to categorical and normalize spelling
>> cnts=countcats(ync,2)                    % count responses on 2nd dimension
cnts =
   1     2
   3     0
>> vote=cnts(:,2)>cnts(:,1);                  % see which is greater (Y>N --> True)
>> vote=categorical(vote,[true false],{'Yes','No'})  % convert to categorical to display
vote = 
   Yes 
   No 
>> yn      % original table to compare -- looks like right choice.
yn = 
  'yes'    'no'    'Yes'
  'no'     'No'    'NO' 
>>

NB: The above doesn't include the extra logic to check for a tie. If a tie is possible, you will also need to test for == and add a third category, TIE, as a possible output.

ADDENDUM

If TIE is possible, look at computing the difference between the counts; the SIGN function will then generate the tri-state variable needed.
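A minimal sketch of that SIGN idea, assuming the same categorical approach as the example above. The sample data (with a tied second row), the variable name `d`, and the label 'Tie' are all chosen for illustration. The category list is passed to `categorical` explicitly here, which is an addition to the original example: it fixes the column order of `countcats` and keeps both columns present even if one answer never occurs.

```matlab
% Hypothetical sample data; the second row is a 2-2 tie
yn = {'yes' 'no'  'Yes' 'yes';
      'no'  'Yes' 'NO'  'yes';
      'no'  'No'  'NO'  'yes'};

ync  = categorical(lower(yn), {'no', 'yes'});   % fixed category order: column 1 = no, column 2 = yes
cnts = countcats(ync, 2);                       % per-row counts, one column per category
d    = sign(cnts(:, 2) - cnts(:, 1));           % +1: yes wins, 0: tie, -1: no wins

vote = categorical(d, [1 0 -1], {'Yes', 'Tie', 'No'});   % map the three states to labels
```

If a cell array is needed rather than a categorical, `cellstr(vote)` converts the result.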


Answer 3

Sarah Palfreyman on 30 Apr 2018

0 votes

Try tokenizing with Text Analytics Toolbox and you can easily get a histogram count.
