kiwi.models package
Subpackages
- kiwi.models.linear package
  - Submodules
    - kiwi.models.linear.label_dictionary module
    - kiwi.models.linear.linear_model module
    - kiwi.models.linear.linear_trainer module
    - kiwi.models.linear.linear_word_qe_decoder module
    - kiwi.models.linear.linear_word_qe_features module
    - kiwi.models.linear.linear_word_qe_sentence module
    - kiwi.models.linear.sequence_parts module
    - kiwi.models.linear.sparse_feature_vector module
    - kiwi.models.linear.sparse_vector module
    - kiwi.models.linear.structured_classifier module
    - kiwi.models.linear.structured_decoder module
    - kiwi.models.linear.utils module
  - Module contents
- kiwi.models.modules package
Submodules
kiwi.models.linear_word_qe_classifier module
This is the main script for the linear sequential word-based quality estimator.
class kiwi.models.linear_word_qe_classifier.LinearWordQEClassifier(use_basic_features_only=True, use_bigrams=True, use_simple_bigram_features=True, use_parse_features=False, use_stacked_features=False, evaluation_metric='f1_bad', cost_false_positives=0.5, cost_false_negatives=0.5)
Bases: kiwi.models.linear.structured_classifier.StructuredClassifier
Main class for the word-level quality estimator. Inherits from a general structured classifier.
create_instances(dataset)
Preprocesses the dataset, if needed, to create instances. The default is to return the dataset itself; override if needed.
create_prediction(instance, parts, predicted_output)
Creates a list of word-level predictions for a sentence. For consistency with probabilities, returns 1 if the label is BAD and 0 if it is OK.
evaluate(instances, predictions, print_scores=True)
Evaluates the model's accuracy and F1-BAD score.
get_coarse_label(label)
Gets the coarse part of a fine-grained label. The coarse label is the prefix before the underscore (if any); for example, the coarse part of BAD_SUB, BAD_DEL, and BAD is BAD.
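For illustration, the behaviour described above amounts to taking the prefix before the first underscore; a minimal sketch (not the library's exact code):

    def coarse_label(label):
        """Illustrative only: 'BAD_SUB' -> 'BAD', 'OK' -> 'OK'."""
        return label.split('_', 1)[0]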
label_instance(instance, parts, predicted_output)
Returns a labeled instance by adding the predicted output information.
title = 'Linear Model'
kiwi.models.model module
class kiwi.models.model.Model(vocabs, ConfigCls=<class 'kiwi.models.model.ModelConfig'>, config=None, **kwargs)
Bases: torch.nn.modules.module.Module
forward(*args, **kwargs)
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
get_mask(batch, output)
Computes the mask of tokens for a given side.
Parameters:
- batch – Namespace of tensors
- side – String identifier.
preprocess(examples)
Preprocesses raw data.
Parameters: examples (list of dict) – List of examples. Each example is a dict with field strings as keys and unnumericalized, tokenized data as values.
Returns: A batch object.
subclasses = {'Estimator': <class 'kiwi.models.predictor_estimator.Estimator'>, 'NuQE': <class 'kiwi.models.nuqe.NuQE'>, 'Predictor': <class 'kiwi.models.predictor.Predictor'>, 'QUETCH': <class 'kiwi.models.quetch.QUETCH'>}
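Since subclasses maps model names to classes, a registered model can be looked up by name; a minimal sketch (assuming vocabs has already been built):

    ModelClass = Model.subclasses['NuQE']   # -> kiwi.models.nuqe.NuQE
    model = ModelClass(vocabs)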
class kiwi.models.model.ModelConfig(vocabs)
Bases: object
kiwi.models.nuqe module
class kiwi.models.nuqe.NuQE(vocabs, **kwargs)
Bases: kiwi.models.quetch.QUETCH
Neural Quality Estimation (NuQE) model for word-level quality estimation.
forward(batch)
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
title = 'NuQE'
kiwi.models.predictor module
class kiwi.models.predictor.Predictor(vocabs, **kwargs)
Bases: kiwi.models.model.Model
Bidirectional Conditional Language Model. Implemented after Kim et al. (2017); see: http://www.statmt.org/wmt17/pdf/WMT63.pdf
forward(batch, source_side=None, target_side=None)
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
title = 'PredEst Predictor model (an embedder model)'
kiwi.models.predictor_estimator module
class kiwi.models.predictor_estimator.Estimator(vocabs, predictor_tgt=None, predictor_src=None, **kwargs)
Bases: kiwi.models.model.Model
forward(batch)
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
static from_options(vocabs, opts)
Parameters:
- vocabs –
- opts –
  - predict_target (bool): Predict target tags
  - predict_source (bool): Predict source tags
  - predict_gaps (bool): Predict gap tags
  - token_level (bool): Train predictor using PE field
  - sentence_level (bool): Predict sentence scores
  - sentence_ll (bool): Use likelihood loss for sentence scores (instead of squared error)
  - binary_level: Predict binary sentence labels
  - target_bad_weight: Weight for target tags bad class. Default 3.0
  - source_bad_weight: Weight for source tags bad class. Default 3.0
  - gaps_bad_weight: Weight for gap tags bad class. Default 3.0
Returns: Compute Tag Predictions.
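A hedged sketch of building such an options object with a plain namespace; the flags below are only the ones documented above, and the real option set handled by the kiwi configuration system is likely larger:

    from argparse import Namespace

    # Hypothetical options object; additional fields may be required in practice.
    opts = Namespace(
        predict_target=True, predict_source=False, predict_gaps=False,
        token_level=True, sentence_level=True, sentence_ll=True, binary_level=True,
        target_bad_weight=3.0, source_bad_weight=3.0, gaps_bad_weight=3.0,
    )
    estimator = Estimator.from_options(vocabs, opts)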
title = 'PredEst (Predictor-Estimator)'
class kiwi.models.predictor_estimator.EstimatorConfig(vocabs, hidden_est=100, rnn_layers_est=1, mlp_est=True, dropout_est=0.0, start_stop=False, predict_target=True, predict_gaps=False, predict_source=False, token_level=True, sentence_level=True, sentence_ll=True, binary_level=True, target_bad_weight=2.0, source_bad_weight=2.0, gaps_bad_weight=2.0, **kwargs)
kiwi.models.quetch module
class kiwi.models.quetch.QUETCH(vocabs, **kwargs)
Bases: kiwi.models.model.Model
QUality Estimation from scraTCH (QUETCH) model.
TODO: add references.
forward(batch)
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
title = 'QUETCH'
kiwi.models.utils module
class kiwi.models.utils.GradientMul
Bases: torch.autograd.function.Function
static backward(ctx, grad)
Defines a formula for differentiating the operation. This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by as many outputs as forward() returned, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.
The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad as a tuple of booleans representing whether each input needs gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs gradient computed w.r.t. the output.
static forward(ctx, x, constant=0)
Performs the operation. This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).
The context can be used to store tensors that can then be retrieved during the backward pass.
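GradientMul's exact behaviour is not spelled out here; as an illustration of the forward/backward contract above, a typical gradient-scaling Function could look like this (an assumption, not the library's implementation):

    import torch

    class GradScaleSketch(torch.autograd.Function):
        """Illustrative only: identity in the forward pass, gradient scaled
        by a constant in the backward pass."""

        @staticmethod
        def forward(ctx, x, constant=0):
            ctx.constant = constant    # saved for the backward pass
            return x

        @staticmethod
        def backward(ctx, grad):
            # One return value per forward input; the constant gets no gradient.
            return grad * ctx.constant, None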
kiwi.models.utils.align_source(source, trg2src_alignments, max_aligned, unaligned_idx, padding_idx, pad_size)
kiwi.models.utils.align_tensor(tensor, alignments, max_aligned, unaligned_idx, padding_idx, pad_size, target_length=None)
kiwi.models.utils.apply_packed_sequence(rnn, embedding, lengths)
Runs a forward pass of embeddings through an RNN using a packed sequence.
Parameters:
- rnn – The RNN that we want to compute a forward pass with.
- embedding (FloatTensor b x seq x dim) – A batch of sequence embeddings.
- lengths (LongTensor batch) – The length of each sequence in the batch.
Returns: The output of the RNN rnn with input embedding
Return type: output
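As an illustration of the packed-sequence pattern this helper wraps, here is the standard PyTorch recipe (a sketch, not necessarily the library's exact implementation):

    import torch
    from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

    rnn = torch.nn.GRU(input_size=50, hidden_size=64, batch_first=True)
    embedding = torch.randn(8, 12, 50)                   # b x seq x dim
    lengths = torch.tensor([12, 9, 7, 12, 3, 5, 10, 1])  # length of each sequence

    # Sort by length, pack, run the RNN, unpack, and restore the original order.
    sorted_lengths, sort_idx = lengths.sort(descending=True)
    packed = pack_padded_sequence(embedding[sort_idx], sorted_lengths, batch_first=True)
    packed_out, _ = rnn(packed)
    output, _ = pad_packed_sequence(packed_out, batch_first=True)
    output = output[sort_idx.argsort()]                  # back to original batch order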
kiwi.models.utils.convolve_tensor(sequences, window_size, pad_value=0)
Convolve a sequence and apply padding.
Parameters:
- sequences – 2D tensor
- window_size – filter length
- pad_value – int value used as padding
Returns: 3D tensor, where the last dimension has size window_size
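For illustration, a windowed view with padding can be built as follows; centring the window on each token is an assumption about the exact alignment the library uses:

    import torch
    import torch.nn.functional as F

    sequences = torch.tensor([[3, 7, 9, 2]])       # 2D tensor: batch x seq
    window_size, pad_value = 3, 0

    # Pad both ends so every token gets a centred window, then slide the window.
    pad = window_size // 2
    padded = F.pad(sequences, (pad, pad), value=pad_value)
    windows = padded.unfold(1, window_size, 1)     # 3D: batch x seq x window_size
    # windows[0] == [[0, 3, 7], [3, 7, 9], [7, 9, 2], [9, 2, 0]]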
kiwi.models.utils.gradient_mul()
kiwi.models.utils.make_loss_weights(nb_classes, target_idx, weight)
Creates a loss weight vector for nn.CrossEntropyLoss.
Parameters:
- nb_classes – Number of classes
- target_idx – ID of the target (reweighted) class
- weight – Weight of the target class
Returns: Weight tensor of shape nb_classes such that weights[target_idx] = weight and weights[other_idx] = 1.0
Return type: weights (FloatTensor)
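A minimal sketch of the behaviour described above (illustrative, not the library's exact code):

    import torch

    def make_loss_weights_sketch(nb_classes, target_idx, weight):
        # All classes weighted 1.0 except the reweighted target class.
        weights = torch.ones(nb_classes)
        weights[target_idx] = weight
        return weights

    criterion = torch.nn.CrossEntropyLoss(weight=make_loss_weights_sketch(3, 1, 3.0))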
kiwi.models.utils.map_alignments_to_target(src2tgt_alignments, target_length=None)
Maps a target index to a list of source indexes.
Parameters:
- src2tgt_alignments (list) – list of tuples with source, target indexes.
- target_length – size of the target side; if None, the highest index in the alignments is used.
Returns: A list of size target_length where position i refers to the i-th target token and contains a list of source indexes aligned to it.
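A worked example of the documented mapping with illustrative values (the ordering inside each inner list is assumed):

    src2tgt_alignments = [(0, 0), (1, 0), (2, 2)]   # (source_idx, target_idx) pairs
    result = map_alignments_to_target(src2tgt_alignments, target_length=3)
    # expected: [[0, 1], [], [2]]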