kiwi.systems.encoders.xlmroberta

Module Contents

Classes

XLMRobertaTextEncoder

Encode a field, handling vocabulary, tokenization and embeddings.

XLMRobertaEncoder

XLM-RoBERTa model, using HuggingFace’s implementation.

kiwi.systems.encoders.xlmroberta.logger
class kiwi.systems.encoders.xlmroberta.XLMRobertaTextEncoder(tokenizer_name='xlm-roberta-base', is_source=False)

Bases: kiwi.data.encoders.field_encoders.TextEncoder

Encode a field, handling vocabulary, tokenization and embeddings.

Heavily inspired by torchtext and torchnlp.

fit_vocab(self, samples, vocab_size=None, vocab_min_freq=0, embeddings_name=None, keep_rare_words_with_embeddings=False, add_embeddings_vocab=False)
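
A minimal usage sketch; the sample sentences and the exact form of the samples argument passed to fit_vocab are illustrative assumptions, not part of the documented signature:

    from kiwi.systems.encoders.xlmroberta import XLMRobertaTextEncoder

    # Encode the target side of the sentence pair with the base XLM-R tokenizer.
    encoder = XLMRobertaTextEncoder(tokenizer_name='xlm-roberta-base', is_source=False)

    samples = [
        'The quick brown fox jumps over the lazy dog .',
        'Der schnelle braune Fuchs springt über den faulen Hund .',
    ]

    # Build the field vocabulary from the samples (keyword values shown are the defaults).
    encoder.fit_vocab(samples, vocab_size=None, vocab_min_freq=0)
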
class kiwi.systems.encoders.xlmroberta.XLMRobertaEncoder(vocabs: Dict[str, Vocabulary], config: Config, pre_load_model: bool = True)

Bases: kiwi.systems._meta_module.MetaModule

XLM-RoBERTa model, using HuggingFace’s implementation.

class Config

Bases: kiwi.utils.io.BaseConfig

Base class for all pydantic configs. Used to configure base behaviour of configs.

model_name :Union[str, Path] = xlm-roberta-base

Pre-trained XLMRoberta model to use.

interleave_input :bool = False

Concatenate SOURCE and TARGET without internal padding (111222000 instead of 111002220).
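
An illustrative sketch of the two packing layouts behind the 111222000 / 111002220 notation, using plain Python lists with assumed lengths; this is not the library's internal code:

    # Illustrative only: 3 target tokens (1s), 3 source tokens (2s),
    # an assumed per-segment target length of 5 and a total length of 9.
    PAD = 0
    target, source = [1, 1, 1], [2, 2, 2]
    max_target_len, max_len = 5, 9

    # interleave_input=False: each segment is padded on its own, so padding
    # appears between the segments.
    concat = target + [PAD] * (max_target_len - len(target)) + source
    concat += [PAD] * (max_len - len(concat))
    print(concat)       # [1, 1, 1, 0, 0, 2, 2, 2, 0]  -> 111002220

    # interleave_input=True: real tokens are packed back to back and padding
    # only goes at the end.
    interleaved = target + source
    interleaved += [PAD] * (max_len - len(interleaved))
    print(interleaved)  # [1, 1, 1, 2, 2, 2, 0, 0, 0]  -> 111222000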

use_mlp :bool = True

Apply a linear layer on top of XLMRoberta.

hidden_size :int = 100

Size of the linear layer on top of XLMRoberta.

pooling :Literal['first_token', 'mean', 'll_mean', 'mixed'] = mixed

Type of pooling used to extract features from the encoder. Options are:

  • first_token: the CLS token is used as the sentence representation

  • mean: average pooling over the scalar-mixed layer representations is used as the sentence representation

  • ll_mean: mean pooling over only the last layer's embeddings

  • mixed: concatenation of the CLS token with the mean pool
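
A hedged sketch of how each pooling option could derive a sentence vector, assuming a hypothetical last_layer tensor (final encoder layer) and mix tensor (scalar mix over all layers), both of shape (bs, seq_len, hidden); names and shapes are illustrative assumptions:

    # Hedged sketch of the four pooling options; tensors are random stand-ins.
    import torch

    bs, seq_len, hidden = 2, 7, 768
    last_layer = torch.randn(bs, seq_len, hidden)   # final encoder layer
    mix = torch.randn(bs, seq_len, hidden)          # scalar mix of all layers

    # first_token: the CLS-like <s> token representation
    first_token = mix[:, 0]                         # (bs, hidden)

    # mean: average pooling over the scalar-mixed representations
    mean = mix.mean(dim=1)                          # (bs, hidden)

    # ll_mean: mean pooling over the last layer only
    ll_mean = last_layer.mean(dim=1)                # (bs, hidden)

    # mixed: concatenation of the CLS-like token with the mean pool
    mixed = torch.cat([first_token, mean], dim=-1)  # (bs, 2 * hidden)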

scalar_mix_dropout :confloat(ge=0.0, le=1.0) = 0.1
scalar_mix_layer_norm :bool = True
freeze :bool = False

Freeze XLMRoberta during training.

freeze_for_number_of_steps :int = 0

Freeze XLMRoberta during training for this number of steps.
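
A hedged configuration sketch using the fields documented above; all values are illustrative examples rather than recommendations:

    from kiwi.systems.encoders.xlmroberta import XLMRobertaEncoder

    config = XLMRobertaEncoder.Config(
        model_name='xlm-roberta-base',
        interleave_input=False,
        use_mlp=True,
        hidden_size=100,
        pooling='mixed',
        scalar_mix_dropout=0.1,
        scalar_mix_layer_norm=True,
        freeze=False,
        freeze_for_number_of_steps=1000,  # keep the encoder frozen for the first steps
    )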

fix_relative_path(cls, v)
load_state_dict(self, state_dict: Dict[str, Tensor], strict: bool = True)

Copies parameters and buffers from state_dict into this module and its descendants. If strict is True, then the keys of state_dict must exactly match the keys returned by this module’s state_dict() function.

Parameters
  • state_dict (dict) – a dict containing parameters and persistent buffers.

  • strict (bool, optional) – whether to strictly enforce that the keys in state_dict match the keys returned by this module’s state_dict() function. Default: True

Returns

  • missing_keys is a list of str containing the missing keys

  • unexpected_keys is a list of str containing the unexpected keys

Return type

NamedTuple with missing_keys and unexpected_keys fields
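
The return value is the standard PyTorch load_state_dict result; a self-contained illustration with a plain torch module (the XLM-RoBERTa encoder itself is not needed to show the behaviour):

    import torch.nn as nn

    module = nn.Linear(4, 2)
    state_dict = {
        'weight': module.weight.data.clone(),   # matches an existing parameter
        'extra_key': module.bias.data.clone(),  # key the module does not have
    }

    # With strict=False the mismatches are reported instead of raising an error.
    result = module.load_state_dict(state_dict, strict=False)
    print(result.missing_keys)     # ['bias']
    print(result.unexpected_keys)  # ['extra_key']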

classmethod input_data_encoders(cls, config: Config)
size(self, field=None)
_check_freezing(self)
forward(self, batch_inputs, *args, include_logits=False)
static concat_input(source_batch, target_batch, pad_id)

Concatenate tensors of two batches into one tensor.

Returns

the concatenation of the two batches, a mask of token types (the first batch as zeroes and the second as ones), and the concatenation of the attention masks.
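
A toy sketch of the kind of triple described above, with assumed token ids; the attention mask is recomputed from pad_id here for brevity, whereas the method is described as concatenating the per-batch attention masks. Not the library's implementation:

    import torch

    pad_id = 1                                  # assumed pad token id
    target = torch.tensor([[0, 55, 66, 2, 1]])  # <s> t1 t2 </s> <pad>
    source = torch.tensor([[0, 77, 2, 1, 1]])   # <s> s1 </s> <pad> <pad>

    input_ids = torch.cat([target, source], dim=1)
    token_types = torch.cat(
        [torch.zeros_like(target), torch.ones_like(source)], dim=1
    )                                           # first batch as 0s, second as 1s
    attention_mask = (input_ids != pad_id).long()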

static split_outputs(features, batch_inputs, interleaved=False)

Split contexts to get tag_side outputs.

Parameters
  • features (tensor) – XLM-RoBERTa output for the concatenated input <s> target </s> </s> source </s>, with shape (bs, 1 + target_len + 2 + source_len + 1, 2)

  • batch_inputs

  • interleaved (bool) – whether the concat strategy was ‘interleaved’.

Returns

dict of tensors, one per tag side.
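
A hedged sketch of how the concatenated features could be split back into per-side outputs, with assumed lengths and slicing boundaries:

    import torch

    bs, target_len, source_len, hidden = 2, 4, 3, 8
    # Features laid out as <s> target </s> </s> source </s>.
    features = torch.randn(bs, 1 + target_len + 2 + source_len + 1, hidden)

    target_features = features[:, 1:1 + target_len]            # drop the leading <s>
    source_start = 1 + target_len + 2                          # skip </s> </s>
    source_features = features[:, source_start:source_start + source_len]

    outputs = {'target': target_features, 'source': source_features}
    print(outputs['target'].shape)  # (bs, target_len, hidden)
    print(outputs['source'].shape)  # (bs, source_len, hidden)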

static interleave_input(source_batch, target_batch, pad_id)

Interleave the source + target embeddings into one tensor.

This means building the input as [batch, target [SEP] source].

Returns

the interleaved embeddings, a mask of target (as zeroes) and source (as ones), and the concatenation of the attention masks.

static get_mismatch_features(logits, target, pred)