newBag = removeNgrams(bag,idx)
specifies n-grams by numeric or logical indices in bag.Ngrams.
This syntax is the same as newBag =
removeNgrams(bag,bag.Ngrams(idx,:)).
Load the example data. The file sonnetsPreprocessed.txt contains preprocessed versions of Shakespeare's sonnets. The file contains one sonnet per line, with words separated by a space. Extract the text from sonnetsPreprocessed.txt, split the text into documents at newline characters, and then tokenize the documents.
Load the example data. The file sonnetsPreprocessed.txt contains preprocessed versions of Shakespeare's sonnets. The file contains one sonnet per line, with words separated by a space. Extract the text from sonnetsPreprocessed.txt, split the text into documents at newline characters, and then tokenize the documents.
Input bag-of-n-grams model, specified as a bagOfNgrams object.
N-grams to remove, specified as a string array, character vector, or a
cell array of character vectors.
If ngrams is a string array or cell array, then it has size NumNgrams-by-maxN , where NumNgrams is the number of n-grams, and maxN is the length of the largest n-gram. If ngrams is a character vector, then it represents a single word (unigram).
The value of ngrams(i,j) is the jth word of the ith n-gram. If the number of words in the ith n-gram is less than maxN, then the remaining entries of the ith row of ngrams are empty.
Example: ["An" ""; "An example"; "example"
""]
Data Types: string | char | cell
Indices of n-grams to remove, specified as a vector of numeric indices or
a vector of logical indices. The indices in idx
correspond to the rows of the bag.Ngrams.
You clicked a link that corresponds to this MATLAB command:
Run the command by entering it in the MATLAB Command Window.
Web browsers do not support MATLAB commands.
Seleziona un sito web
Seleziona un sito web per visualizzare contenuto tradotto dove disponibile e vedere eventi e offerte locali. In base alla tua area geografica, ti consigliamo di selezionare: .
Puoi anche selezionare un sito web dal seguente elenco:
Come ottenere le migliori prestazioni del sito
Per ottenere le migliori prestazioni del sito, seleziona il sito cinese (in cinese o in inglese). I siti MathWorks per gli altri paesi non sono ottimizzati per essere visitati dalla tua area geografica.