Predictor-Estimator training
usage: kiwi train [-h] --train-source TRAIN_SOURCE
                  [--train-target TRAIN_TARGET]
                  [--train-source-tags TRAIN_SOURCE_TAGS]
                  [--train-target-tags TRAIN_TARGET_TAGS]
                  [--train-pe TRAIN_PE]
                  [--train-sentence-scores TRAIN_SENTENCE_SCORES]
                  [--split SPLIT] [--valid-source VALID_SOURCE]
                  [--valid-target VALID_TARGET]
                  [--valid-alignments VALID_ALIGNMENTS]
                  [--valid-source-tags VALID_SOURCE_TAGS]
                  [--valid-target-tags VALID_TARGET_TAGS]
                  [--valid-pe VALID_PE]
                  [--valid-sentence-scores VALID_SENTENCE_SCORES]
                  [--predict-side {tags,source_tags,gap_tags}]
                  [--wmt18-format [WMT18_FORMAT]]
                  [--source-max-length SOURCE_MAX_LENGTH]
                  [--source-min-length SOURCE_MIN_LENGTH]
                  [--target-max-length TARGET_MAX_LENGTH]
                  [--target-min-length TARGET_MIN_LENGTH]
                  [--source-vocab-size SOURCE_VOCAB_SIZE]
                  [--target-vocab-size TARGET_VOCAB_SIZE]
                  [--source-vocab-min-frequency SOURCE_VOCAB_MIN_FREQUENCY]
                  [--target-vocab-min-frequency TARGET_VOCAB_MIN_FREQUENCY]
                  [--extend-source-vocab EXTEND_SOURCE_VOCAB]
                  [--extend-target-vocab EXTEND_TARGET_VOCAB]
                  [--warmup WARMUP] [--rnn-layers-pred RNN_LAYERS_PRED]
                  [--dropout-pred DROPOUT_PRED] [--hidden-pred HIDDEN_PRED]
                  [--out-embeddings-size OUT_EMBEDDINGS_SIZE]
                  [--embedding-sizes EMBEDDING_SIZES]
                  [--share-embeddings [SHARE_EMBEDDINGS]]
                  [--predict-inverse [PREDICT_INVERSE]]
                  [--source-embeddings-size SOURCE_EMBEDDINGS_SIZE]
                  [--target-embeddings-size TARGET_EMBEDDINGS_SIZE]
                  [--start-stop [START_STOP]] [--predict-gaps [PREDICT_GAPS]]
                  [--predict-target [PREDICT_TARGET]]
                  [--predict-source [PREDICT_SOURCE]]
                  [--load-pred-source LOAD_PRED_SOURCE]
                  [--load-pred-target LOAD_PRED_TARGET]
                  [--rnn-layers-est RNN_LAYERS_EST]
                  [--dropout-est DROPOUT_EST] [--hidden-est HIDDEN_EST]
                  [--mlp-est [MLP_EST]] [--sentence-level [SENTENCE_LEVEL]]
                  [--sentence-ll [SENTENCE_LL]]
                  [--sentence-ll-predict-mean [SENTENCE_LL_PREDICT_MEAN]]
                  [--use-probs [USE_PROBS]] [--binary-level [BINARY_LEVEL]]
                  [--token-level [TOKEN_LEVEL]]
                  [--target-bad-weight TARGET_BAD_WEIGHT]
                  [--gaps-bad-weight GAPS_BAD_WEIGHT]
                  [--source-bad-weight SOURCE_BAD_WEIGHT]
data
| Option | Description |
| --- | --- |
| --train-source | Path to training source file |
| --train-target | Path to training target file |
| --train-source-tags | Path to training label file for source (WMT18 format) |
| --train-target-tags | Path to training label file for target |
| --train-pe | Path to file containing post-edited target |
| --train-sentence-scores | Path to file containing sentence-level scores |
validation data
| Option | Description |
| --- | --- |
| --split | Split the training dataset in case no validation set is given |
| --valid-source | Path to validation source file |
| --valid-target | Path to validation target file |
| --valid-alignments | Path to validation alignments between source and target |
| --valid-source-tags | Path to validation label file for source (WMT18 format) |
| --valid-target-tags | Path to validation label file for target |
| --valid-pe | Path to file containing post-edited target |
| --valid-sentence-scores | Path to file containing sentence-level scores |
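As a minimal sketch, the training and validation data flags above could be combined as follows for a word-level corpus. All paths are placeholders, and the general `kiwi train` options (model selection, output directory, and so on) documented elsewhere are omitted.

```bash
# Sketch only: placeholder paths for a word-level QE corpus.
kiwi train \
    --train-source data/train.src \
    --train-target data/train.mt \
    --train-source-tags data/train.source_tags \
    --train-target-tags data/train.tags \
    --valid-source data/dev.src \
    --valid-target data/dev.mt \
    --valid-source-tags data/dev.source_tags \
    --valid-target-tags data/dev.tags
```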
data processing options
| Option | Description |
| --- | --- |
| --predict-side | Possible choices: tags, source_tags, gap_tags. Tagset to predict. Leave unchanged for WMT17 format. Default: “tags” |
| --wmt18-format | Read target tags in WMT18 format. Default: False |
| --source-max-length | Maximum source sequence length. Default: inf |
| --source-min-length | Minimum source sequence length. Default: 0 |
| --target-max-length | Maximum target sequence length to keep. Default: inf |
| --target-min-length | Minimum target sequence length. Default: 0 |
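For example, to read target tags in WMT18 format and filter out overly long sentences, the flags below could be added to a data setup like the one sketched above; the length limits are illustrative values, not recommendations.

```bash
# Sketch: WMT18 tag format plus illustrative length filtering.
kiwi train \
    --train-source data/train.src \
    --train-target data/train.mt \
    --train-target-tags data/train.tags \
    --wmt18-format \
    --source-max-length 50 \
    --target-max-length 50
```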
vocabulary options
Options for loading the vocabulary from a previous run. This is used, e.g., for training a source predictor via predict-inverse: True. If set, the other vocabulary options are ignored.
| Option | Description |
| --- | --- |
| --source-vocab-size | Size of the source vocabulary |
| --target-vocab-size | Size of the target vocabulary |
| --source-vocab-min-frequency | Minimum word frequency for the source vocabulary. Default: 1 |
| --target-vocab-min-frequency | Minimum word frequency for the target vocabulary. Default: 1 |
PredEst data
Predictor Estimator (POSTECH) specific data options.
| Option | Description |
| --- | --- |
| --extend-source-vocab | Path to additional source data (Predictor), loaded only for vocabulary creation |
| --extend-target-vocab | Path to additional target data (Predictor), loaded only for vocabulary creation |
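As an illustration, the vocabulary can be capped and built over additional Predictor-side parallel data; the sizes, frequencies, and paths below are placeholders.

```bash
# Sketch: cap vocabulary sizes and extend them with extra parallel data.
kiwi train \
    --train-source data/train.src \
    --train-target data/train.mt \
    --source-vocab-size 45000 \
    --target-vocab-size 45000 \
    --source-vocab-min-frequency 2 \
    --target-vocab-min-frequency 2 \
    --extend-source-vocab data/extra-parallel.src \
    --extend-target-vocab data/extra-parallel.tgt
```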
predictor training
Predictor Estimator (POSTECH)
| Option | Description |
| --- | --- |
| --warmup | Pretrain the Predictor for this number of steps. Default: 0 |
| --rnn-layers-pred | Layers in the Predictor RNN. Default: 2 |
| --dropout-pred | Dropout in the Predictor. Default: 0.0 |
| --hidden-pred | Size of hidden layers in the LSTM. Default: 100 |
| --out-embeddings-size | Word embedding size in the output layer. Default: 200 |
| --embedding-sizes | If set, takes precedence over the other embedding parameters. Default: 0 |
| --share-embeddings | Tie input and output embeddings for the target. Default: False |
| --predict-inverse | Predict target -> source instead of source -> target. Default: False |
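A common use of these flags is pretraining the inverse (target-to-source) Predictor that can later be loaded through --load-pred-source; the hyperparameter values below are purely illustrative, not recommended settings.

```bash
# Sketch: pretrain a target -> source Predictor (illustrative values).
kiwi train \
    --train-source data/train.src \
    --train-target data/train.mt \
    --predict-inverse \
    --warmup 10000 \
    --rnn-layers-pred 2 \
    --hidden-pred 400 \
    --dropout-pred 0.5 \
    --out-embeddings-size 200 \
    --share-embeddings
```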
model-embeddings
Size of the embedding layers when pre-trained embeddings are not used.
| Option | Description |
| --- | --- |
| --source-embeddings-size | Word embedding size for source. Default: 50 |
| --target-embeddings-size | Word embedding size for target. Default: 50 |
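If no pre-trained embeddings are supplied, the embedding sizes can be set explicitly as in the sketch below (illustrative values); note that --embedding-sizes, if set, takes precedence over both.

```bash
# Sketch: explicit embedding sizes (overridden by --embedding-sizes if set).
kiwi train \
    --train-source data/train.src \
    --train-target data/train.mt \
    --source-embeddings-size 200 \
    --target-embeddings-size 200
```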
predictor-estimator training
Predictor Estimator (POSTECH). These settings are used to train the Predictor. They will be ignored if training a Predictor-Estimator and the load-model flag is set.
| Option | Description |
| --- | --- |
| --start-stop | Append start and stop symbols to the Estimator feature sequence. Default: False |
| --predict-gaps | Predict gap tags. Requires train-gap-tags and valid-gap-tags to be set. Default: False |
| --predict-target | Predict target tags. Requires train-target-tags and valid-target-tags to be set. Default: True |
| --predict-source | Predict source tags. Requires train-source-tags and valid-source-tags to be set. Default: False |
| --load-pred-source | Load a pretrained tgt->src Predictor. If set, model architecture and vocabulary parameters are ignored |
| --load-pred-target | Load a pretrained src->tgt Predictor. If set, model architecture and vocabulary parameters are ignored |
| --rnn-layers-est | Layers in the Estimator RNN. Default: 2 |
| --dropout-est | Dropout in the Estimator. Default: 0.0 |
| --hidden-est | Size of hidden layers in the LSTM. Default: 100 |
| --mlp-est | Default: False |
| --sentence-level | Default: False |
| --sentence-ll | Default: False |
| --sentence-ll-predict-mean | Default: False |
| --use-probs | Predict scores as the product/sum of word-level probabilities. Default: False |
| --binary-level | Default: False |
| --token-level | Default: False |
| --target-bad-weight | Relative weight for target BAD labels. Default: 3.0 |
| --gaps-bad-weight | Relative weight for gap BAD labels. Default: 3.0 |
| --source-bad-weight | Relative weight for source BAD labels. Default: 3.0 |
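Putting the pieces together, a word- and sentence-level Estimator run might load a pretrained src->tgt Predictor and up-weight BAD labels. Every path and value below is a placeholder, and required general `kiwi train` options (model choice, output directory, and so on) documented elsewhere are again omitted.

```bash
# Sketch: train the Estimator on top of a pretrained Predictor
# (paths and hyperparameters are illustrative placeholders).
kiwi train \
    --train-source data/train.src \
    --train-target data/train.mt \
    --train-target-tags data/train.tags \
    --train-sentence-scores data/train.hter \
    --valid-source data/dev.src \
    --valid-target data/dev.mt \
    --valid-target-tags data/dev.tags \
    --valid-sentence-scores data/dev.hter \
    --load-pred-target runs/predictor/best_model.torch \
    --predict-target \
    --sentence-level \
    --rnn-layers-est 2 \
    --hidden-est 100 \
    --dropout-est 0.5 \
    --target-bad-weight 3.0
```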