Predictor-Estimator training¶
usage: kiwi train [-h] --train-source TRAIN_SOURCE
[--train-target TRAIN_TARGET]
[--train-source-tags TRAIN_SOURCE_TAGS]
[--train-target-tags TRAIN_TARGET_TAGS]
[--train-pe TRAIN_PE]
[--train-sentence-scores TRAIN_SENTENCE_SCORES]
[--split SPLIT] [--valid-source VALID_SOURCE]
[--valid-target VALID_TARGET]
[--valid-alignments VALID_ALIGNMENTS]
[--valid-source-tags VALID_SOURCE_TAGS]
[--valid-target-tags VALID_TARGET_TAGS]
[--valid-pe VALID_PE]
[--valid-sentence-scores VALID_SENTENCE_SCORES]
[--predict-side {tags,source_tags,gap_tags}]
[--wmt18-format [WMT18_FORMAT]]
[--source-max-length SOURCE_MAX_LENGTH]
[--source-min-length SOURCE_MIN_LENGTH]
[--target-max-length TARGET_MAX_LENGTH]
[--target-min-length TARGET_MIN_LENGTH]
[--source-vocab-size SOURCE_VOCAB_SIZE]
[--target-vocab-size TARGET_VOCAB_SIZE]
[--source-vocab-min-frequency SOURCE_VOCAB_MIN_FREQUENCY]
[--target-vocab-min-frequency TARGET_VOCAB_MIN_FREQUENCY]
[--extend-source-vocab EXTEND_SOURCE_VOCAB]
[--extend-target-vocab EXTEND_TARGET_VOCAB]
[--warmup WARMUP] [--rnn-layers-pred RNN_LAYERS_PRED]
[--dropout-pred DROPOUT_PRED] [--hidden-pred HIDDEN_PRED]
[--out-embeddings-size OUT_EMBEDDINGS_SIZE]
[--embedding-sizes EMBEDDING_SIZES]
[--share-embeddings [SHARE_EMBEDDINGS]]
[--predict-inverse [PREDICT_INVERSE]]
[--source-embeddings-size SOURCE_EMBEDDINGS_SIZE]
[--target-embeddings-size TARGET_EMBEDDINGS_SIZE]
[--start-stop [START_STOP]] [--predict-gaps [PREDICT_GAPS]]
[--predict-target [PREDICT_TARGET]]
[--predict-source [PREDICT_SOURCE]]
[--load-pred-source LOAD_PRED_SOURCE]
[--load-pred-target LOAD_PRED_TARGET]
[--rnn-layers-est RNN_LAYERS_EST]
[--dropout-est DROPOUT_EST] [--hidden-est HIDDEN_EST]
[--mlp-est [MLP_EST]] [--sentence-level [SENTENCE_LEVEL]]
[--sentence-ll [SENTENCE_LL]]
[--sentence-ll-predict-mean [SENTENCE_LL_PREDICT_MEAN]]
[--use-probs [USE_PROBS]] [--binary-level [BINARY_LEVEL]]
[--token-level [TOKEN_LEVEL]]
[--target-bad-weight TARGET_BAD_WEIGHT]
[--gaps-bad-weight GAPS_BAD_WEIGHT]
[--source-bad-weight SOURCE_BAD_WEIGHT]
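The usage string above can be hard to scan. As a minimal sketch, the snippet below assembles a `kiwi train` command line from a Python dictionary, using only flags listed in the usage text. The file paths are hypothetical placeholders and `build_command` is an illustrative helper, not part of kiwi. A real run may need further options that this section does not document.

```python
import shlex

# Hypothetical file paths; replace them with your own data.
options = {
    "--train-source": "data/train.src",
    "--train-target": "data/train.mt",
    "--train-target-tags": "data/train.tags",
    "--valid-source": "data/dev.src",
    "--valid-target": "data/dev.mt",
    "--valid-target-tags": "data/dev.tags",
    "--hidden-pred": 100,
    "--target-bad-weight": 3.0,
}


def build_command(opts):
    """Flatten an option dict into an argument list for `kiwi train`."""
    argv = ["kiwi", "train"]
    for flag, value in opts.items():
        argv.extend([flag, str(value)])
    return argv


command = build_command(options)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(command))
```

Keeping the options in a dictionary makes it easy to vary a single setting, such as `--hidden-pred`, between runs.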
data¶
--train-source | Path to training source file |
--train-target | Path to training target file |
--train-source-tags | Path to training label file for source (WMT18 format) |
--train-target-tags | Path to training label file for target |
--train-pe | Path to file containing post-edited target. |
--train-sentence-scores | Path to file containing sentence-level scores. |
validation data¶
--split | Split the training dataset if no validation set is given. |
--valid-source | Path to validation source file |
--valid-target | Path to validation target file |
--valid-alignments | Path to validation alignments between source and target. |
--valid-source-tags | Path to validation label file for source (WMT18 format) |
--valid-target-tags | Path to validation label file for target |
--valid-pe | Path to file containing post-edited target. |
--valid-sentence-scores | Path to file containing sentence-level scores. |
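To illustrate what `--split` might do when no validation files are given, here is a minimal sketch. It assumes that the value is the fraction of examples kept for training and that the remainder becomes the validation set. This section does not confirm either assumption, and `split_dataset` is a hypothetical helper rather than kiwi's implementation.

```python
import random


def split_dataset(examples, split=0.9, seed=42):
    """Shuffle a copy of the examples and cut it into (train, valid) parts.

    Assumption: `split` is the fraction of examples kept for training.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * split)
    return shuffled[:cut], shuffled[cut:]


train, valid = split_dataset(range(10), split=0.8)
print(len(train), len(valid))  # 8 2
```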
data processing options¶
--predict-side | Tagset to predict. Possible choices: tags, source_tags, gap_tags. Leave unchanged for WMT17 format. Default: “tags” |
--wmt18-format | Read target tags in WMT18 format. Default: False |
--source-max-length | Maximum source sequence length to keep. Default: inf |
--source-min-length | Minimum source sequence length to keep. Default: 0 |
--target-max-length | Maximum target sequence length to keep. Default: inf |
--target-min-length | Minimum target sequence length to keep. Default: 0 |
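As an illustration of how the four length options could interact, the sketch below keeps only sentence pairs whose source and target lengths fall within the configured bounds. It assumes that out-of-range pairs are dropped rather than truncated, which this section does not confirm. `keep_pair` is a hypothetical helper, not part of kiwi.

```python
import math


def keep_pair(source, target,
              source_min_length=0, source_max_length=math.inf,
              target_min_length=0, target_max_length=math.inf):
    """Return True if both token lists are within their length bounds."""
    return (source_min_length <= len(source) <= source_max_length
            and target_min_length <= len(target) <= target_max_length)


pairs = [
    ("das ist gut".split(), "this is good".split()),
    ([], "empty source".split()),
]
# With a minimum source length of 1, the pair with an empty source is dropped.
kept = [pair for pair in pairs if keep_pair(*pair, source_min_length=1)]
print(len(kept))  # 1
```

With the defaults shown in the table (0 and inf), every pair passes the filter.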
vocabulary options¶
Options for loading vocabulary from a previous run. This is used, for example, when training a source predictor via predict-inverse: True. If set, other vocab options are ignored.
--source-vocab-size | Size of the source vocabulary. |
--target-vocab-size | Size of the target vocabulary. |
--source-vocab-min-frequency | Minimum word frequency for source vocabulary. Default: 1 |
--target-vocab-min-frequency | Minimum word frequency for target vocabulary. Default: 1 |
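The size and minimum-frequency options can be pictured with a small sketch. It assumes the vocabulary keeps the most frequent tokens up to the size cap and discards tokens seen fewer times than the minimum frequency. `build_vocab` is a hypothetical helper, and special tokens such as padding or unknown-word symbols are left out for brevity.

```python
from collections import Counter


def build_vocab(sentences, vocab_size=None, min_frequency=1):
    """Keep the most frequent tokens, subject to a size cap and a frequency floor."""
    counts = Counter(token for sentence in sentences for token in sentence)
    # most_common(None) returns every token, ordered by descending count.
    return [token for token, n in counts.most_common(vocab_size)
            if n >= min_frequency]


corpus = [["the", "cat", "sat"], ["the", "cat"], ["the", "dog"]]
print(build_vocab(corpus, vocab_size=2))     # ['the', 'cat']
print(build_vocab(corpus, min_frequency=2))  # ['the', 'cat']
```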
PredEst data¶
Predictor Estimator specific data options. (POSTECH)
--extend-source-vocab | Path to additional data (Predictor). Optionally load more data, which is used only for vocabulary creation. |
--extend-target-vocab | Path to additional data (Predictor). Optionally load more data, which is used only for vocabulary creation. |
predictor training¶
Predictor Estimator (POSTECH)
--warmup | Pretrain Predictor for this number of steps. Default: 0 |
--rnn-layers-pred | Layers in Predictor RNN. Default: 2 |
--dropout-pred | Dropout in predictor. Default: 0.0 |
--hidden-pred | Size of hidden layers in LSTM. Default: 100 |
--out-embeddings-size | Word embedding size in output layer. Default: 200 |
--embedding-sizes | If set, takes precedence over other embedding params. Default: 0 |
--share-embeddings | Tie input and output embeddings for target. Default: False |
--predict-inverse | Predict target -> source instead of source -> target. Default: False |
model-embeddings¶
Embedding layers size in case pre-trained embeddings are not used.
--source-embeddings-size | Word embedding size for source. Default: 50 |
--target-embeddings-size | Word embedding size for target. Default: 50 |
predictor-estimator training¶
Predictor Estimator (POSTECH). These settings are used to train the Predictor. They will be ignored if training a Predictor-Estimator and the load-model flag is set.
--start-stop | Append start and stop symbols to estimator feature sequence. Default: False |
--predict-gaps | Predict Gap Tags. Requires train-gap-tags, valid-gap-tags to be set. Default: False |
--predict-target | Predict Target Tags. Requires train-target-tags, valid-target-tags to be set. Default: True |
--predict-source | Predict Source Tags. Requires train-source-tags, valid-source-tags to be set. Default: False |
--load-pred-source | Load pretrained predictor tgt->src. If set, model architecture and vocabulary parameters are ignored. |
--load-pred-target | Load pretrained predictor src->tgt. If set, model architecture and vocabulary parameters are ignored. |
--rnn-layers-est | Layers in Estimator RNN. Default: 2 |
--dropout-est | Dropout in estimator. Default: 0.0 |
--hidden-est | Size of hidden layers in LSTM. Default: 100 |
--mlp-est | Default: False |
--sentence-level | Default: False |
--sentence-ll | Default: False |
--sentence-ll-predict-mean | Default: False |
--use-probs | Predict scores as product/sum of word-level probs. Default: False |
--binary-level | Default: False |
--token-level | Default: False |
--target-bad-weight | Relative weight for target bad labels. Default: 3.0 |
--gaps-bad-weight | Relative weight for gaps bad labels. Default: 3.0 |
--source-bad-weight | Relative weight for source bad labels. Default: 3.0 |
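The three bad-weight options suggest a class-weighted loss in which errors on BAD labels count more than errors on OK labels. The sketch below shows one common form, a weighted negative log-likelihood normalized by the sum of the weights. This is an assumption about how such a weight is typically applied, not a description of kiwi's actual loss, and `weighted_nll` is a hypothetical name.

```python
import math


def weighted_nll(probs_bad, labels, bad_weight=3.0):
    """Weighted negative log-likelihood over OK (0) / BAD (1) token labels.

    probs_bad[i] is the predicted probability that token i is BAD.
    OK tokens get weight 1.0; BAD tokens get `bad_weight`.
    """
    total, norm = 0.0, 0.0
    for p_bad, label in zip(probs_bad, labels):
        weight = bad_weight if label == 1 else 1.0
        p_correct = p_bad if label == 1 else 1.0 - p_bad
        total += -weight * math.log(p_correct)
        norm += weight
    return total / norm


# The model predicts "OK" with high confidence for both tokens,
# but the second token is actually BAD.
probs_bad = [0.1, 0.1]
labels = [0, 1]
print(weighted_nll(probs_bad, labels, bad_weight=3.0))
print(weighted_nll(probs_bad, labels, bad_weight=1.0))
```

With the higher weight, the missed BAD token dominates the loss, which pushes training toward detecting the usually rarer BAD class.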