kiwi.data.vocabulary
Vocabulary
Define a vocabulary object that will be used to numericalize a field.
kiwi.data.vocabulary.logger
counter
A collections.Counter object holding the frequencies of tokens in the data used to build the Vocab.
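As a rough illustration of what such a frequency counter holds, the sketch below builds a collections.Counter from a toy tokenized corpus. The sentences and variable names are made up for the example and do not come from the kiwi code base.

```python
from collections import Counter

# Toy tokenized corpus (hypothetical data, for illustration only).
sentences = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "end"]]

# Count how often each token occurs across all sentences.
counter = Counter(token for sentence in sentences for token in sentence)

# The two most frequent tokens, as (token, count) pairs.
most_common = counter.most_common(2)
```

A Counter returns 0 for tokens it has never seen, so looking up an unknown token does not raise a KeyError.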
stoi
A dictionary mapping token strings to numerical identifiers; NOTE: use token_to_id() to do the conversion.
itos
A list of token strings indexed by their numerical identifiers; NOTE: use id_to_token() to do the conversion.
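The relationship between stoi and itos can be sketched as follows. This is a minimal stand-alone illustration, not the library's implementation: the special tokens, the helper build_mappings, and the fallback to an unknown-token id are all assumptions, and token_to_id / id_to_token are written here as plain functions that only mirror the method names listed on this page.

```python
from collections import Counter

# Hypothetical special tokens placed at the start of the vocabulary.
SPECIALS = ["<unk>", "<pad>"]

def build_mappings(counter, specials=SPECIALS):
    """Build (stoi, itos) from token frequencies, most frequent first."""
    # Sort by descending frequency; break ties alphabetically for determinism.
    ordered = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
    itos = list(specials) + [token for token, _ in ordered]
    stoi = {token: idx for idx, token in enumerate(itos)}
    return stoi, itos

def token_to_id(stoi, token, unk="<unk>"):
    """Map a token to its id; unseen tokens map to the unknown id (assumption)."""
    return stoi.get(token, stoi[unk])

def id_to_token(itos, idx):
    """Map an id back to its token string."""
    return itos[idx]

counter = Counter(["the", "cat", "the", "sat"])
stoi, itos = build_mappings(counter)
```

Going through accessor functions rather than indexing the dictionary directly gives a single place to handle out-of-vocabulary tokens, which is presumably why the notes above recommend token_to_id() and id_to_token().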
token_to_id
id_to_token
pad_id
bos_id
eos_id
__len__
net_length
max_size
Limit the vocabulary size.
The assumption here is that the vocabulary was created from a list of tokens sorted by descending frequency.
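The sorting assumption matters because limiting the size can then be done by keeping a prefix of the token list. The sketch below shows that idea with a hypothetical helper limit_vocab; whether the limit counts special tokens is not stated on this page, so the num_specials parameter is an assumption made for the example.

```python
def limit_vocab(itos, max_size, num_specials=0):
    """Keep the specials plus the first max_size regular tokens.

    Assumes itos lists the specials first, followed by regular tokens
    sorted by descending frequency, so a prefix slice keeps the most
    frequent tokens.
    """
    keep = num_specials + max_size
    new_itos = itos[:keep]
    # Rebuild the reverse mapping so ids stay consistent with positions.
    new_stoi = {token: idx for idx, token in enumerate(new_itos)}
    return new_stoi, new_itos

# Hypothetical vocabulary: two specials, then tokens by descending frequency.
itos = ["<unk>", "<pad>", "the", "sat", "cat", "dog"]
stoi, limited = limit_vocab(itos, max_size=2, num_specials=2)
```

If the list were not sorted by frequency, the same slice would drop arbitrary tokens instead of the rarest ones.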
__getstate__
__setstate__