kiwi.systems.encoders.predictor

Module Contents

Classes

DualSequencesEncoder

Base class for all neural network modules.

PredictorEncoder

Bidirectional Conditional Language Model

kiwi.systems.encoders.predictor.logger
class kiwi.systems.encoders.predictor.DualSequencesEncoder(input_size_a, input_size_b, hidden_size, output_size, num_layers, dropout, _use_v0_buggy_strategy=False)

Bases: torch.nn.Module

Base class for all neural network modules.

Your models should also subclass this class.

Modules can also contain other Modules, allowing you to nest them in a tree structure. You can assign the submodules as regular attributes:

import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

Submodules assigned in this way will be registered, and will have their parameters converted too when you call to(), etc.

forward(self, embeddings_a, lengths_a, mask_a, embeddings_b, lengths_b)
contextualize_b(self, embeddings, lengths, hidden)
encode_b(self, embeddings, forward_contexts, backward_contexts, contexts_a, attention_mask)

Encode sequence B.

Build a feature vector for each position i using the left context at position i-1 and the right context at position i+1. In the original implementation, this resulted in a returned tensor with two fewer timesteps than the input (along dim=1). It has now been changed to return the same number of timesteps as the input. The consequence is that callers now have to handle BOS and EOS differently, but hopefully this new behaviour is more consistent and less surprising. The old behaviour can be forced by setting self._use_v0_buggy_strategy to True.
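The sketch below illustrates the position-wise combination described above, assuming forward and backward contexts of shape (batch, time, hidden). The helper name combine_contexts and the zero-padding of the first and last positions are illustrative assumptions, not Kiwi's actual implementation:

import torch

def combine_contexts(forward_contexts, backward_contexts, use_v0_buggy_strategy=False):
    """Build a feature for position i from left context i-1 and right context i+1.

    Both context tensors are assumed to be (batch, time, hidden). Illustrative only.
    """
    # Left context for position i is the forward state at i-1;
    # right context for position i is the backward state at i+1.
    left = forward_contexts[:, :-2]    # states 0 .. T-3 -> positions 1 .. T-2
    right = backward_contexts[:, 2:]   # states 2 .. T-1 -> positions 1 .. T-2
    features = torch.cat([left, right], dim=-1)  # (batch, T-2, 2*hidden)

    if use_v0_buggy_strategy:
        # Old behaviour: two fewer timesteps than the input.
        return features

    # New behaviour: pad the first and last positions so the output has the
    # same number of timesteps as the input (callers handle BOS/EOS).
    pad = features.new_zeros(features.size(0), 1, features.size(2))
    return torch.cat([pad, features, pad], dim=1)  # (batch, T, 2*hidden)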

static _reverse_padded_seq(lengths, sequence)

Reverse a batch of padded sequences of different lengths.
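The following sketch shows an equivalent way to perform this operation, assuming a (batch, time, hidden) tensor and a 1-D tensor of lengths; the function name reverse_padded and the per-example loop are illustrative, not necessarily the strategy used internally:

def reverse_padded(sequence, lengths):
    """Reverse each sequence in the batch up to its true length.

    sequence: tensor of shape (batch, time, hidden); lengths: tensor of shape (batch,).
    Padding positions are left untouched.
    """
    reversed_seq = sequence.clone()
    for i, length in enumerate(lengths.tolist()):
        # Flip only the non-padded prefix of example i along the time dimension.
        reversed_seq[i, :length] = sequence[i, :length].flip(0)
    return reversed_seq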

static _split_hidden(hidden)

Split hidden state into forward/backward parts.
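A minimal sketch of such a split, assuming PyTorch's layout for bidirectional LSTM states, where forward and backward directions alternate along the first dimension; the function name split_hidden is illustrative:

def split_hidden(hidden):
    """Split a bidirectional LSTM state into forward and backward parts.

    hidden is an (h, c) pair of tensors shaped (num_layers * 2, batch, hidden_size),
    where even indices along dim 0 are the forward direction and odd indices are the
    backward direction (PyTorch convention).
    """
    h, c = hidden
    forward_state = (h[0::2], c[0::2])
    backward_state = (h[1::2], c[1::2])
    return forward_state, backward_state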

class kiwi.systems.encoders.predictor.PredictorEncoder(vocabs: Dict[str, Vocabulary], config: Config, pretraining: bool = False, pre_load_model: bool = True)

Bases: kiwi.systems._meta_module.MetaModule

Bidirectional Conditional Language Model

Implemented following Kim et al. (2017); see: http://www.statmt.org/wmt17/pdf/WMT63.pdf

class Config

Bases: kiwi.utils.io.BaseConfig

Base class for all pydantic configs. Used to configure base behaviour of configs.

hidden_size :int = 400

Size of hidden layers in LSTM.

rnn_layers :int = 3

Number of RNN layers in the Predictor.

dropout :float = 0.0
share_embeddings :bool = False

Tie input and output embeddings for target.

out_embeddings_dim :Optional[int]

Size of the word embeddings in the output layer.

use_mismatch_features :bool = False

Whether to use Alibaba’s mismatch features.

embeddings :InputEmbeddingsConfig
use_v0_buggy_strategy :bool = False

The Predictor implementation in Kiwi <= 0.3.4 had a bug when applying the LSTM to encode the source (it used lengths that were too short by 2) and when reversing the target embeddings for the backward LSTM (also short by 2). This flag is set to True when loading a saved model from those versions.

v0_start_stop :bool = False

Whether pre_qe_f_v is padded on both ends or post_qe_f_v is stripped on both ends.
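As an illustration of how these options fit together, the dictionary below mirrors the documented fields with their defaults (out_embeddings_dim is given an example value); the embeddings entry is left as a placeholder because the fields of InputEmbeddingsConfig are not documented in this section:

# Illustrative only: values mirror the defaults documented above.
encoder_options = {
    "hidden_size": 400,            # size of hidden layers in the LSTM
    "rnn_layers": 3,               # number of RNN layers in the Predictor
    "dropout": 0.0,
    "share_embeddings": False,     # tie input and output embeddings for target
    "out_embeddings_dim": 200,     # output-layer embedding size (example value)
    "use_mismatch_features": False,
    "use_v0_buggy_strategy": False,  # only for models saved with Kiwi <= 0.3.4
    "v0_start_stop": False,
    # "embeddings": {...},         # required InputEmbeddingsConfig fields go here
}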

dropout_on_rnns(cls, v, values)
no_implementation(cls, v)
classmethod input_data_encoders(cls, config: Config)
size(self, field=None)
forward(self, batch_inputs, include_target_logits=False)