kiwi.models.linear package
Submodules
kiwi.models.linear.label_dictionary module
This implements a dictionary of labels.
kiwi.models.linear.linear_model module
This implements a linear model.
class kiwi.models.linear.linear_model.LinearModel
Bases: object
An abstract linear model.
compute_score_binary_features(binary_features)
Compute a score by taking the inner product with a binary feature vector.
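Because every entry of a binary feature vector is 0 or 1, the inner product reduces to summing the weights of the active features. A minimal sketch, assuming the weights are stored in a dictionary keyed by feature index and the binary vector is a list of active indices (both layouts, and the function name, are hypothetical and not taken from the library):

```python
def score_binary_features(weights, active_indices):
    """Inner product of a weight vector with a binary feature vector.

    `weights` maps feature index -> weight (assumed layout).
    `active_indices` lists the indices whose binary feature equals 1.
    Indices without a stored weight contribute 0.
    """
    return sum(weights.get(index, 0.0) for index in active_indices)


weights = {0: 0.5, 3: -1.25, 7: 2.0}
score = score_binary_features(weights, [0, 7, 9])  # index 9 has no weight
```

Only the active features are visited, so the cost depends on the number of nonzeros rather than on the size of the feature space.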
kiwi.models.linear.linear_trainer module
A generic implementation of a basic trainer.
kiwi.models.linear.linear_word_qe_decoder module
Decoder for word-level quality estimation.
class kiwi.models.linear.linear_word_qe_decoder.LinearWordQEDecoder(estimator, cost_false_positives=0.5, cost_false_negatives=0.5)
Bases: kiwi.models.linear.structured_decoder.StructuredDecoder
A decoder for word-level quality estimation.
decode_mira(instance, parts, scores, gold_outputs, old_mira=False)
Cost-augmented decoder. Allows a compromise between precision and recall. In general:
p = a - (a+b)*z0
q = b*sum(z0)
p'*z + q = a*sum(z) - (a+b)*z0'*z + b*sum(z0)
a => penalty for predicting 1 when it is 0 (FP)
b => penalty for predicting 0 when it is 1 (FN)
F1: a = 0.5, b = 0.5
recall: a = 0, b = 1
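The identity above can be checked numerically. The sketch below assumes that z0 is the gold 0/1 vector and z the predicted 0/1 vector (an interpretation, not stated in the docstring); under that reading, p'*z + q equals a times the number of false positives plus b times the number of false negatives. The function names are hypothetical.

```python
def mira_cost(z, z0, a=0.5, b=0.5):
    """Cost a * (#false positives) + b * (#false negatives)."""
    false_positives = sum(1 for zi, gi in zip(z, z0) if zi == 1 and gi == 0)
    false_negatives = sum(1 for zi, gi in zip(z, z0) if zi == 0 and gi == 1)
    return a * false_positives + b * false_negatives


def linear_form(z, z0, a=0.5, b=0.5):
    """Same cost written as p'*z + q, with p = a - (a+b)*z0 and q = b*sum(z0)."""
    p = [a - (a + b) * gi for gi in z0]
    q = b * sum(z0)
    return sum(pi * zi for pi, zi in zip(p, z)) + q


z0 = [1, 0, 1, 1, 0]  # assumed gold labels
z = [1, 1, 0, 1, 0]   # assumed predictions: one FP, one FN
balanced = linear_form(z, z0, a=0.5, b=0.5)
recall_only = linear_form(z, z0, a=0.0, b=1.0)
```

Writing the cost as a linear function of z is what lets it be added to the part scores before decoding.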
decode_with_bigrams(instance, parts, scores)
Decoder for a sequential model (with bigrams).
decode_with_unigrams(instance, parts, scores)
Decoder for a non-sequential model (unigrams only).
run_viterbi(initial_scores, transition_scores, final_scores, emission_scores)
Computes the Viterbi trellis for a given sequence. Receives:
- Initial scores: (num_states) array
- Transition scores: (length-1, num_states, num_states) array
- Final scores: (num_states) array
- Emission scores: (length, num_states) array
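A generic Viterbi recursion over arrays of these shapes might look like the sketch below. It uses plain nested lists and is not the library's implementation; in particular, the (previous state, current state) axis order of the transition array and the (path, score) return value are assumptions.

```python
def viterbi(initial_scores, transition_scores, final_scores, emission_scores):
    """Return (best_path, best_score) for a first-order sequence model.

    initial_scores[s], final_scores[s]: num_states entries
    transition_scores[i][prev][cur]: (length - 1, num_states, num_states)
    emission_scores[i][s]: (length, num_states)
    """
    length = len(emission_scores)
    num_states = len(initial_scores)
    states = range(num_states)
    # trellis[i][s]: best score of any path ending in state s at position i.
    trellis = [[initial_scores[s] + emission_scores[0][s] for s in states]]
    backpointers = []
    for i in range(1, length):
        row, pointers = [], []
        for cur in states:
            candidates = [
                trellis[i - 1][prev] + transition_scores[i - 1][prev][cur]
                for prev in states
            ]
            best_prev = max(states, key=candidates.__getitem__)
            row.append(candidates[best_prev] + emission_scores[i][cur])
            pointers.append(best_prev)
        trellis.append(row)
        backpointers.append(pointers)
    # Add the final scores, then follow the backpointers from the best end state.
    last = [trellis[-1][s] + final_scores[s] for s in states]
    state = max(states, key=last.__getitem__)
    best_score = last[state]
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path, best_score


# Two states, three positions; switching state costs 2.
emissions = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
switch_penalty = [[[0.0, -2.0], [-2.0, 0.0]]] * 2
path, score = viterbi([0.0, 0.0], switch_penalty, [0.0, 0.0], emissions)
```

With the switching penalty, staying in state 0 beats following the per-position best emission; with all-zero transitions the path follows the emissions instead.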
kiwi.models.linear.linear_word_qe_features module
A class for handling features for word-level quality estimation.
class kiwi.models.linear.linear_word_qe_features.LinearWordQEFeatures(use_basic_features_only=True, use_simple_bigram_features=True, use_parse_features=False, use_stacked_features=False, save_to_cache=False, load_from_cache=False, cached_features_file=None)
Bases: kiwi.models.linear.sparse_feature_vector.SparseFeatureVector
This class implements a feature vector for word-level quality estimation.
compute_bigram_features(sentence_word_features, part)
Compute bigram features (that depend on consecutive labels).
kiwi.models.linear.linear_word_qe_sentence module
class kiwi.models.linear.linear_word_qe_sentence.LinearWordQESentence
Bases: object
Represents a sentence (word features and their labels).
create_from_sentence_pair(source_words, target_words, alignments, source_pos_tags=None, target_pos_tags=None, target_parse_heads=None, target_parse_relations=None, target_ngram_left=None, target_ngram_right=None, target_stacked_features=None, labels=None)
Creates an instance from source/target token and alignment information.
class kiwi.models.linear.linear_word_qe_sentence.LinearWordQETokenFeatures(stacked_features=None, source_token_count=-1, target_token_count=-1, source_target_token_count_ratio=0.0, token='', left_context='', right_context='', first_aligned_token='', left_alignment='', right_alignment='', is_stopword=False, is_punctuation=False, is_proper_noun=False, is_digit=False, highest_order_ngram_left=-1, highest_order_ngram_right=-1, backoff_behavior_left=0.0, backoff_behavior_middle=0.0, backoff_behavior_right=0.0, source_highest_order_ngram_left=-1, source_highest_order_ngram_right=-1, pseudo_reference=False, target_pos='', target_morph='', target_head=-1, target_deprel='', aligned_source_pos_list='', polysemy_count_source=0, polysemy_count_target=0)
Bases: object
kiwi.models.linear.sequence_parts module
kiwi.models.linear.sparse_feature_vector module
This defines the classes for sparse feature vectors in linear models.
class kiwi.models.linear.sparse_feature_vector.SparseBinaryFeatureVector(feature_indices=None, save_to_cache=False, load_from_cache=False, cached_features_file=None)
Bases: list
A generic class for a sparse binary feature vector.
class kiwi.models.linear.sparse_feature_vector.SparseFeatureVector(save_to_cache=False, load_from_cache=False, cached_features_file=None)
Bases: kiwi.models.linear.sparse_vector.SparseVector
A generic class for a sparse feature vector.
kiwi.models.linear.sparse_vector module
This defines a generic class for sparse vectors.
class kiwi.models.linear.sparse_vector.SparseVector
Bases: dict
Implementation of a sparse vector using a dictionary.
dot_product(vector)
Computes the dot product with a given vector. Note: this iterates through the entries of self, so it may be inefficient if self has many more nonzeros than vector. In that case the function falls back to vector.dot_product(self).
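The swap described in the note can be illustrated with a small dictionary-backed class. This is a sketch of the idea only: `SparseDict` is a hypothetical stand-in rather than the library class, and the rule of swapping whenever self has more entries than the argument is an assumption.

```python
class SparseDict(dict):
    """Hypothetical dictionary-backed sparse vector (key -> nonzero value)."""

    def dot_product(self, vector):
        # Iterate over whichever operand has fewer stored entries.
        if len(self) > len(vector):
            return vector.dot_product(self)
        return sum(
            value * vector[key] for key, value in self.items() if key in vector
        )


small = SparseDict({"a": 2.0, "c": 1.0})
large = SparseDict({"a": 0.5, "b": 3.0, "c": 4.0, "d": 1.0})
result = large.dot_product(small)  # internally evaluated as small.dot_product(large)
```

Since the dot product is symmetric, swapping the operands changes only the number of dictionary lookups, not the result.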
kiwi.models.linear.structured_classifier module
A generic implementation of an abstract structured linear classifier.
class kiwi.models.linear.structured_classifier.StructuredClassifier
Bases: object
An abstract structured classifier.
compute_scores(instance, parts, features)
Compute a score for every part in the instance using the current model and the part-specific features.
create_instances(dataset)
Preprocess the dataset if needed to create instances. The default returns the dataset itself. Override if needed.
evaluate(instances, predictions, print_scores=True)
Evaluate the structured classifier, computing a task-dependent evaluation metric.
kiwi.models.linear.structured_decoder module
class kiwi.models.linear.structured_decoder.StructuredDecoder
Bases: object
An abstract decoder for structured prediction.
decode(instance, parts, scores)
Decode, computing the highest-scoring output. Must return a vector of 0/1 predicted_outputs of the same size as parts.
kiwi.models.linear.utils module
Several utility functions.
kiwi.models.linear.utils.nearly_binary_tol(a, tol)
Checks if a number is binary up to a tolerance.
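One plausible reading of "binary up to a tolerance" is that the number lies within tol of either 0 or 1. The sketch below implements that reading; the function name, the inclusive comparison, and the exact semantics are assumptions rather than the library's definition.

```python
def nearly_binary(a, tol):
    """Return True if `a` is within `tol` of 0 or of 1 (assumed semantics)."""
    return abs(a) <= tol or abs(a - 1.0) <= tol


checks = [nearly_binary(1e-9, 1e-6), nearly_binary(0.999999999, 1e-6),
          nearly_binary(0.5, 1e-6)]
```

A check of this kind is useful when a relaxed or numerically computed solution is expected to be integral and small floating-point deviations should be ignored.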