Predictor training
The options below are used to pre-train the predictor side of the predictor-estimator model.
usage: kiwi train [-h] --train-source TRAIN_SOURCE
[--train-target TRAIN_TARGET]
[--train-source-tags TRAIN_SOURCE_TAGS]
[--train-target-tags TRAIN_TARGET_TAGS]
[--train-pe TRAIN_PE]
[--train-sentence-scores TRAIN_SENTENCE_SCORES]
[--split SPLIT] [--valid-source VALID_SOURCE]
[--valid-target VALID_TARGET]
[--valid-alignments VALID_ALIGNMENTS]
[--valid-source-tags VALID_SOURCE_TAGS]
[--valid-target-tags VALID_TARGET_TAGS]
[--valid-pe VALID_PE]
[--valid-sentence-scores VALID_SENTENCE_SCORES]
[--predict-side {tags,source_tags,gap_tags}]
[--wmt18-format [WMT18_FORMAT]]
[--source-max-length SOURCE_MAX_LENGTH]
[--source-min-length SOURCE_MIN_LENGTH]
[--target-max-length TARGET_MAX_LENGTH]
[--target-min-length TARGET_MIN_LENGTH]
[--source-vocab-size SOURCE_VOCAB_SIZE]
[--target-vocab-size TARGET_VOCAB_SIZE]
[--source-vocab-min-frequency SOURCE_VOCAB_MIN_FREQUENCY]
[--target-vocab-min-frequency TARGET_VOCAB_MIN_FREQUENCY]
[--extend-source-vocab EXTEND_SOURCE_VOCAB]
[--extend-target-vocab EXTEND_TARGET_VOCAB]
[--warmup WARMUP] [--rnn-layers-pred RNN_LAYERS_PRED]
[--dropout-pred DROPOUT_PRED] [--hidden-pred HIDDEN_PRED]
[--out-embeddings-size OUT_EMBEDDINGS_SIZE]
[--embedding-sizes EMBEDDING_SIZES]
[--share-embeddings [SHARE_EMBEDDINGS]]
[--predict-inverse [PREDICT_INVERSE]]
[--source-embeddings-size SOURCE_EMBEDDINGS_SIZE]
[--target-embeddings-size TARGET_EMBEDDINGS_SIZE]
data
--train-source | Path to training source file. |
--train-target | Path to training target file. |
--train-source-tags | Path to training label file for source (WMT18 format). |
--train-target-tags | Path to training label file for target. |
--train-pe | Path to file containing post-edited target. |
--train-sentence-scores | Path to file containing sentence-level scores. |
validation data
--split | Split the training dataset when no validation set is given. |
--valid-source | Path to validation source file. |
--valid-target | Path to validation target file. |
--valid-alignments | Path to validation alignments between source and target. |
--valid-source-tags | Path to validation label file for source (WMT18 format). |
--valid-target-tags | Path to validation label file for target. |
--valid-pe | Path to file containing post-edited target. |
--valid-sentence-scores | Path to file containing sentence-level scores. |
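To illustrate what --split implies, the sketch below holds out part of the training data for validation. It assumes that SPLIT is the fraction of examples kept for training and that examples are shuffled first; both are assumptions, not confirmed behaviour of kiwi. The function split_train_valid, its seed parameter, and the sample pairs are hypothetical names used only for illustration.

```python
import random


def split_train_valid(examples, split, seed=42):
    """Shuffle examples and keep the first `split` fraction for training.

    Hypothetical helper: the meaning of `split` as a training fraction
    is an assumption, not taken from the kiwi documentation.
    """
    if not 0.0 < split < 1.0:
        raise ValueError("split must be strictly between 0 and 1")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # deterministic shuffle
    cut = int(len(shuffled) * split)
    return shuffled[:cut], shuffled[cut:]


# Ten made-up (source, target) pairs.
pairs = [(f"src {i}", f"tgt {i}") for i in range(10)]
train, valid = split_train_valid(pairs, split=0.8)
print(len(train), len(valid))  # 8 2
```

When explicit --valid-* files are given, no such split should be needed.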
data processing options
--predict-side | Possible choices: tags, source_tags, gap_tags. Tagset to predict. Leave unchanged for WMT17 format. Default: “tags” |
--wmt18-format | Read target tags in WMT18 format. Default: False |
--source-max-length | Maximum source sequence length. Default: inf |
--source-min-length | Truncate source sequence length. Default: 0 |
--target-max-length | Maximum target sequence length to keep. Default: inf |
--target-min-length | Truncate target sequence length. Default: 0 |
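One plausible reading of the length options is that they act as bounds on which sentence pairs are kept. The sketch below follows that reading; it is an assumption, since the help text for the minimum lengths says "Truncate" and the actual behaviour may differ. The function keep_pair and whitespace tokenization are hypothetical choices made for illustration.

```python
import math


def keep_pair(source, target,
              source_min=0, source_max=math.inf,
              target_min=0, target_max=math.inf):
    """Return True if both sides fall inside their length bounds.

    Lengths are counted in whitespace-separated tokens (an assumption).
    """
    src_len = len(source.split())
    tgt_len = len(target.split())
    return (source_min <= src_len <= source_max
            and target_min <= tgt_len <= target_max)


# Made-up sentence pairs.
pairs = [
    ("a b c", "x y z"),      # kept: 3 source tokens, 3 target tokens
    ("a", "x y"),            # dropped: source shorter than source_min
    ("a b c d e f", "x"),    # dropped: source longer than source_max
]
kept = [p for p in pairs
        if keep_pair(*p, source_min=2, source_max=5, target_max=3)]
print(kept)  # [('a b c', 'x y z')]
```

With the defaults (0 and inf), every pair passes the check.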
vocabulary options
Options for loading vocabulary from a previous run. This is used, for example, when training a source predictor with predict-inverse: True. If set, other vocab options are ignored.
--source-vocab-size | Size of the source vocabulary. |
--target-vocab-size | Size of the target vocabulary. |
--source-vocab-min-frequency | Minimum word frequency for source vocabulary. Default: 1 |
--target-vocab-min-frequency | Minimum word frequency for target vocabulary. Default: 1 |
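The interaction between a vocabulary size limit and a minimum frequency can be sketched as follows. This is a generic illustration, not kiwi's implementation: build_vocab, the special tokens, and the choice not to count specials towards the size limit are all assumptions.

```python
from collections import Counter


def build_vocab(sentences, max_size=None, min_frequency=1,
                specials=("<unk>", "<pad>")):
    """Map tokens to indices, keeping only frequent enough tokens.

    Tokens are ranked by descending frequency (ties broken
    alphabetically); specials are added on top of max_size.
    """
    counts = Counter(tok for s in sentences for tok in s.split())
    candidates = sorted(
        (tok for tok, c in counts.items() if c >= min_frequency),
        key=lambda tok: (-counts[tok], tok),
    )
    if max_size is not None:
        candidates = candidates[:max_size]
    return {tok: i for i, tok in enumerate(list(specials) + candidates)}


sentences = ["the cat sat", "the dog sat", "the bird flew"]
print(build_vocab(sentences, min_frequency=2))
# {'<unk>': 0, '<pad>': 1, 'the': 2, 'sat': 3}
print(build_vocab(sentences, max_size=1))
# {'<unk>': 0, '<pad>': 1, 'the': 2}
```

Tokens that fall outside the vocabulary would typically be mapped to the unknown token at training time.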
PredEst data
Predictor-Estimator-specific data options (POSTECH).
--extend-source-vocab | Path to additional data (Predictor) that is loaded only for vocabulary creation. This is useful to reduce OOV words if the parallel data and QE data are from different domains. |
--extend-target-vocab | Path to additional data (Predictor) that is loaded only for vocabulary creation. |
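The effect of extending the vocabulary with extra data can be illustrated by measuring the out-of-vocabulary (OOV) rate before and after. The sketch below uses made-up sentences and a hypothetical oov_rate helper; it shows the general idea only and does not reproduce how kiwi builds its vocabularies.

```python
def oov_rate(vocab, sentences):
    """Fraction of whitespace tokens in `sentences` missing from `vocab`."""
    tokens = [tok for s in sentences for tok in s.split()]
    if not tokens:
        return 0.0
    missing = sum(1 for tok in tokens if tok not in vocab)
    return missing / len(tokens)


# Made-up data: parallel data and QE data come from different domains.
parallel = ["the cat sat", "the dog ran"]
qe = ["the server ran", "restart the server"]
extra = ["restart the server now"]  # extra in-domain text, vocabulary only

base_vocab = {tok for s in parallel for tok in s.split()}
extended_vocab = base_vocab | {tok for s in extra for tok in s.split()}

print(oov_rate(base_vocab, qe))      # 0.5
print(oov_rate(extended_vocab, qe))  # 0.0
```

The extra data contributes words to the vocabulary but is not itself used as training examples, which matches the "only for vocabulary creation" wording above.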
predictor training
Predictor Estimator (POSTECH)
--warmup | Pretrain Predictor for this number of steps. Default: 0 |
--rnn-layers-pred | Number of layers in the predictor RNN. Default: 2 |
--dropout-pred | Dropout in the predictor. Default: 0.0 |
--hidden-pred | Size of hidden layers in the LSTM. Default: 100 |
--out-embeddings-size | Word embedding size in the output layer. Default: 200 |
--embedding-sizes | If set, takes precedence over other embedding params. Default: 0 |
--share-embeddings | Tie input and output embeddings for target. Default: False |
--predict-inverse | Predict target -> source instead of source -> target. Default: False |
model-embeddings
Embedding layer sizes used when pre-trained embeddings are not provided.
--source-embeddings-size | Word embedding size for source. Default: 50 |
--target-embeddings-size | Word embedding size for target. Default: 50 |
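The precedence of --embedding-sizes over the individual size options can be sketched as a small resolution step. This is a minimal sketch under the assumption that the "other embedding params" are the source, target, and output embedding sizes; resolve_embedding_sizes is a hypothetical helper, and the defaults are copied from the option tables above.

```python
def resolve_embedding_sizes(embedding_sizes=0,
                            source_embeddings_size=50,
                            target_embeddings_size=50,
                            out_embeddings_size=200):
    """Return the effective embedding sizes.

    A non-zero `embedding_sizes` overrides all three individual sizes
    (assumed reading of "takes precedence over other embedding params").
    """
    if embedding_sizes:
        return {
            "source": embedding_sizes,
            "target": embedding_sizes,
            "out": embedding_sizes,
        }
    return {
        "source": source_embeddings_size,
        "target": target_embeddings_size,
        "out": out_embeddings_size,
    }


print(resolve_embedding_sizes())
# {'source': 50, 'target': 50, 'out': 200}
print(resolve_embedding_sizes(embedding_sizes=128, source_embeddings_size=64))
# {'source': 128, 'target': 128, 'out': 128}
```

Leaving --embedding-sizes at its default of 0 keeps the individual settings in effect.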