Least Frequent Words in document
If I use topkwords to find the most frequent words, what code can I use to show the 10 least frequent words?
1 Comment
KSSV
on 28 Nov 2018
Read about strfind, strcmp.
Answers (1)
Hi,
I understand that you want to display the 10 least frequent words in a given text.
This can be achieved with the 'topkwords' function. Pass a bag-of-words model to 'topkwords' and set k to Inf (the numeric value, not the string 'inf') so that it returns every word together with its count. Then sort the resulting table in ascending order of count and take the first 10 rows.
Refer to the sample code below for better understanding:
% Sample text data
textData = "This is a sample text. This text is for testing if our approach can display the least frequent words correctly or not";
% topkwords needs a bag-of-words model, so tokenize the text first
documents = tokenizedDocument(textData);
% Remove punctuation so that tokens such as "." are not counted as words
documents = erasePunctuation(documents);
bag = bagOfWords(documents);
% Request every word (k = Inf); the result is a table with Word and Count columns.
% The variable is named tbl because "table" would shadow the built-in table function.
tbl = topkwords(bag, Inf);
% Re-sort in ascending order of count
sortedTable = sortrows(tbl, 'Count');
% Select the 10 least frequent words (fewer if the vocabulary is smaller)
numLeastFrequent = min(10, height(sortedTable));
leastFrequent = sortedTable(1:numLeastFrequent, :);
% Display the least frequent words and their counts
disp(leastFrequent);
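Note that many words usually share the lowest count, so which 10 rows come first after sorting is somewhat arbitrary. As a rough sketch only, the snippet below counts words with base MATLAB functions (no Text Analytics Toolbox) and keeps every word tied with the 10th-smallest count. The variable names, the simple punctuation list, and the lowercasing step are assumptions made for illustration, not part of the answer above.

```matlab
% Sample text (same as above)
txt = "This is a sample text. This text is for testing if our approach can display the least frequent words correctly or not";

% Assumption: a crude cleanup that lowercases and strips a few punctuation marks
cleaned = erase(lower(txt), ["." "," "!" "?" ";" ":"]);
words = split(cleaned);          % split at whitespace into a column string array
words(words == "") = [];         % drop empty entries, if any

% Count occurrences of each distinct word
[vocab, ~, idx] = unique(words);
counts = accumarray(idx, 1);
tbl = table(vocab, counts, 'VariableNames', {'Word', 'Count'});

% Sort ascending by count and keep all words tied at the cutoff
tbl = sortrows(tbl, 'Count');
k = min(10, height(tbl));
cutoff = tbl.Count(k);
leastFrequent = tbl(tbl.Count <= cutoff, :);
disp(leastFrequent);
```

With this approach the result can contain more than 10 rows when several words share the cutoff count, which avoids dropping tied words arbitrarily.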
Refer to the following documentation pages for more details:
- https://www.mathworks.com/help/textanalytics/ref/bagofwords.topkwords.html
- https://www.mathworks.com/help/textanalytics/ref/bagofwords.html
- https://www.mathworks.com/help/matlab/ref/double.sortrows.html
- https://www.mathworks.com/help/textanalytics/ref/tokenizeddocument.html
Hope this helps.