encodeTokens
Syntax
Description
[tokenCodes,segments] = encodeTokens(tokenizer,tokens) encodes tokens using the specified tokenizer and returns the token codes and segments. This syntax automatically adds special tokens to the input.
[tokenCodes,segments] = encodeTokens(tokenizer,tokens1,tokens2) encodes the sentence pair tokens1,tokens2. This syntax automatically adds special tokens to the input.
[tokenCodes,segments,idx] = encodeTokens(___) also returns the mapping between the input and the encoded output.
___ = encodeTokens(___,AddSpecialTokens=tf) specifies whether to add special tokens to the input.
Examples
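The syntaxes described above can be sketched as follows. This is a minimal illustration, not a definitive example. It assumes that the BERT-Base support package required by the bert function is installed, and that the word tokens returned by wordTokenize are an acceptable input form for encodeTokens. The sample strings and the variable names pairCodes, pairSegments, and rawCodes are chosen here for illustration only.

```matlab
% Sketch only: assumes Text Analytics Toolbox (R2023b or later) and the
% support package needed by the bert function are installed.
[~,tokenizer] = bert;

% Split the text into word tokens (assumed input form for encodeTokens).
str = "Bidirectional Encoder Representations from Transformers";
tokens = wordTokenize(tokenizer,str);

% Encode a single sequence. Special tokens are added by default.
% idx is the mapping between the input tokens and the encoded output.
[tokenCodes,segments,idx] = encodeTokens(tokenizer,tokens);

% Encode a sentence pair. Special tokens are also added by default.
tokens1 = wordTokenize(tokenizer,"The weather is nice today.");
tokens2 = wordTokenize(tokenizer,"Let us go for a walk.");
[pairCodes,pairSegments] = encodeTokens(tokenizer,tokens1,tokens2);

% Encode the same tokens without adding special tokens.
rawCodes = encodeTokens(tokenizer,tokens,AddSpecialTokens=false);
```

Comparing tokenCodes with rawCodes is one way to see which codes the special tokens contribute.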
Input Arguments
Output Arguments
Algorithms
References
[1] Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. "BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding." Preprint, submitted May 24, 2019. https://doi.org/10.48550/arXiv.1810.04805.
[2] Wu, Yonghui, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun et al. "Google's Neural Machine Translation System: Bridging the Gap Between Human and Machine Translation." Preprint, submitted October 8, 2016. https://doi.org/10.48550/arXiv.1609.08144.
Version History
Introduced in R2023b