NuQE training

usage: kiwi train [-h] --train-source TRAIN_SOURCE --train-target TRAIN_TARGET
                  --train-alignments TRAIN_ALIGNMENTS
                  [--train-source-tags TRAIN_SOURCE_TAGS]
                  [--train-target-tags TRAIN_TARGET_TAGS] --valid-source
                  VALID_SOURCE --valid-target VALID_TARGET --valid-alignments
                  VALID_ALIGNMENTS [--valid-source-tags VALID_SOURCE_TAGS]
                  [--valid-target-tags VALID_TARGET_TAGS]
                  [--valid-source-pos VALID_SOURCE_POS]
                  [--valid-target-pos VALID_TARGET_POS]
                  [--predict-target [PREDICT_TARGET]]
                  [--predict-gaps [PREDICT_GAPS]]
                  [--predict-source [PREDICT_SOURCE]]
                  [--wmt18-format [WMT18_FORMAT]]
                  [--source-max-length SOURCE_MAX_LENGTH]
                  [--source-min-length SOURCE_MIN_LENGTH]
                  [--target-max-length TARGET_MAX_LENGTH]
                  [--target-min-length TARGET_MIN_LENGTH]
                  [--source-vocab-size SOURCE_VOCAB_SIZE]
                  [--target-vocab-size TARGET_VOCAB_SIZE]
                  [--source-vocab-min-frequency SOURCE_VOCAB_MIN_FREQUENCY]
                  [--target-vocab-min-frequency TARGET_VOCAB_MIN_FREQUENCY]
                  [--keep-rare-words-with-embeddings [KEEP_RARE_WORDS_WITH_EMBEDDINGS]]
                  [--add-embeddings-vocab [ADD_EMBEDDINGS_VOCAB]]
                  [--embeddings-format {polyglot,word2vec,fasttext,glove,text}]
                  [--embeddings-binary [EMBEDDINGS_BINARY]]
                  [--source-embeddings SOURCE_EMBEDDINGS]
                  [--target-embeddings TARGET_EMBEDDINGS]
                  [--bad-weight BAD_WEIGHT] [--window-size WINDOW_SIZE]
                  [--max-aligned MAX_ALIGNED]
                  [--source-embeddings-size SOURCE_EMBEDDINGS_SIZE]
                  [--target-embeddings-size TARGET_EMBEDDINGS_SIZE]
                  [--freeze-embeddings [FREEZE_EMBEDDINGS]]
                  [--embeddings-dropout EMBEDDINGS_DROPOUT]
                  [--hidden-sizes HIDDEN_SIZES [HIDDEN_SIZES ...]]
                  [--dropout DROPOUT]
                  [--init-type {uniform,normal,constant,glorot_uniform,glorot_normal}]
                  [--init-support INIT_SUPPORT]

data

--train-source Path to training source file
--train-target Path to training target file
--train-alignments
 Path to training alignments between source and target.
--train-source-tags
 Path to training label file for source (WMT18 format)
--train-target-tags
 Path to training label file for target
--valid-source Path to validation source file
--valid-target Path to validation target file
--valid-alignments
 Path to validation alignments between source and target.
--valid-source-tags
 Path to validation label file for source (WMT18 format)
--valid-target-tags
 Path to validation label file for target
--valid-source-pos
 Path to validation PoS tags file for source
--valid-target-pos
 Path to validation PoS tags file for target
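
As an illustration, the data flags above can be assembled into a command line from a script. This is a minimal sketch: the directory and file names are hypothetical placeholders, only flags listed in this section are used, and general options not covered here may also be needed for a real run.

```python
import shlex

# Hypothetical file locations; replace with your own data files.
data_dir = "data/wmt"

# Flags marked as required in the usage synopsis.
required = {
    "--train-source": f"{data_dir}/train.src",
    "--train-target": f"{data_dir}/train.mt",
    "--train-alignments": f"{data_dir}/train.align",
    "--valid-source": f"{data_dir}/dev.src",
    "--valid-target": f"{data_dir}/dev.mt",
    "--valid-alignments": f"{data_dir}/dev.align",
}

# Optional label files for the target side.
optional = {
    "--train-target-tags": f"{data_dir}/train.tags",
    "--valid-target-tags": f"{data_dir}/dev.tags",
}

cmd = ["kiwi", "train"]
for flag, path in {**required, **optional}.items():
    cmd += [flag, path]

# Print a shell-quoted command that can be copied into a terminal.
print(shlex.join(cmd))
```

The resulting list can also be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.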

data processing options

--predict-target

Predict Target Tags. Leave unchanged for WMT17 format.

Default: True

--predict-gaps

Predict Gap Tags.

Default: False

--predict-source

Predict Source Tags.

Default: False

--wmt18-format

Read target tags in WMT18 format.

Default: False

--source-max-length

Maximum source sequence length to keep.

Default: inf

--source-min-length

Minimum source sequence length to keep.

Default: 1

--target-max-length

Maximum target sequence length to keep.

Default: inf

--target-min-length

Minimum target sequence length to keep.

Default: 1
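
The length options can be read as a filter on sentence pairs. The sketch below assumes that a pair is discarded whenever either side falls outside its bounds (rather than being truncated); `keep_pair` is a hypothetical helper, not part of the tool, and the actual behaviour may differ.

```python
import math

def keep_pair(source_tokens, target_tokens,
              source_min_length=1, source_max_length=math.inf,
              target_min_length=1, target_max_length=math.inf):
    """Return True if both sides are within their length bounds (assumed semantics)."""
    return (source_min_length <= len(source_tokens) <= source_max_length
            and target_min_length <= len(target_tokens) <= target_max_length)

pairs = [
    ("das ist gut".split(), "this is good".split()),
    ([], "orphan".split()),  # empty source, below the default minimum of 1
    ("ein sehr langer satz mit vielen wörtern".split(), "a long sentence".split()),
]

# With a source maximum of 5 tokens, only the first pair survives.
kept = [pair for pair in pairs if keep_pair(*pair, source_max_length=5)]
print(len(kept))
```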

vocabulary options

--source-vocab-size
 Size of the source vocabulary.
--target-vocab-size
 Size of the target vocabulary.
--source-vocab-min-frequency

Min word frequency for source vocabulary.

Default: 1

--target-vocab-min-frequency

Min word frequency for target vocabulary.

Default: 1

--keep-rare-words-with-embeddings

Keep words that occur fewer than min-frequency times but are in the embeddings vocabulary.

Default: False

--add-embeddings-vocab

Add words from embeddings vocabulary to source/target vocabulary.

Default: False
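
To make the interaction between the vocabulary size, the minimum frequency, and the rare-word option concrete, here is a minimal sketch. `build_vocab` is a hypothetical function written for illustration. It assumes the size cap is applied before the frequency check, which may not match the tool's actual order of operations.

```python
from collections import Counter

def build_vocab(sentences, max_size=None, min_frequency=1,
                embeddings_vocab=frozenset(), keep_rare_with_embeddings=False):
    """Pick vocabulary tokens from tokenized sentences (illustrative only)."""
    counts = Counter(tok for sent in sentences for tok in sent)
    vocab = []
    # most_common(None) returns every token, most frequent first.
    for token, freq in counts.most_common(max_size):
        if freq >= min_frequency:
            vocab.append(token)
        elif keep_rare_with_embeddings and token in embeddings_vocab:
            # Rare word rescued because a pretrained embedding exists for it.
            vocab.append(token)
    return vocab

sentences = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "zebra"]]

strict = build_vocab(sentences, min_frequency=2)
rescued = build_vocab(sentences, min_frequency=2,
                      embeddings_vocab={"zebra"},
                      keep_rare_with_embeddings=True)
print(strict, rescued)
```

With a minimum frequency of 2, "zebra" is dropped in the first call and kept in the second because it appears in the (hypothetical) embeddings vocabulary.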

--embeddings-format

Possible choices: polyglot, word2vec, fasttext, glove, text

Word embeddings format. See README for specific formatting instructions.

Default: “polyglot”

--embeddings-binary

Load embeddings stored in binary.

Default: False

--source-embeddings
 Path to word embeddings file for source.
--target-embeddings
 Path to word embeddings file for target.

hyper-parameters

--bad-weight

Relative weight for bad labels.

Default: 3.0
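
One common way to apply a relative weight to the BAD class is to scale the per-token negative log-likelihood. The sketch below assumes that interpretation, with OK tokens weighted 1.0; `weighted_nll` is a hypothetical helper and is not taken from the tool's code.

```python
import math

def weighted_nll(prob_bad, gold_tags, bad_weight=3.0):
    """Sum of per-token negative log-likelihoods, with BAD tokens up-weighted."""
    total = 0.0
    for p_bad, tag in zip(prob_bad, gold_tags):
        if tag == "BAD":
            total += -bad_weight * math.log(p_bad)
        else:
            total += -math.log(1.0 - p_bad)
    return total

# The same prediction costs three times as much when the gold tag is BAD.
loss_bad = weighted_nll([0.5], ["BAD"])
loss_ok = weighted_nll([0.5], ["OK"])
print(loss_bad / loss_ok)
```

Up-weighting BAD labels is a typical response to class imbalance, since OK tags usually dominate quality estimation data.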

--window-size

Sliding window size.

Default: 3
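
A sliding window of size 3 typically means each token is represented together with one neighbour on each side. The sketch below assumes an odd window centred on the current token and a hypothetical `<pad>` symbol at the sentence edges; the tool's own padding symbol and edge handling may differ.

```python
def windows(tokens, window_size=3, pad="<pad>"):
    """Return one centred context window per token (odd window sizes assumed)."""
    half = window_size // 2
    padded = [pad] * half + list(tokens) + [pad] * half
    return [padded[i:i + window_size] for i in range(len(tokens))]

contexts = windows(["a", "b", "c"])
print(contexts)
# [['<pad>', 'a', 'b'], ['a', 'b', 'c'], ['b', 'c', '<pad>']]
```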

--max-aligned

Max number of alignments between source and target.

Default: 5

--source-embeddings-size

Word embedding size for source.

Default: 50

--target-embeddings-size

Word embedding size for target.

Default: 50

--freeze-embeddings

Freeze embedding weights during training.

Default: False

--embeddings-dropout

Dropout rate for embedding layers.

Default: 0.0

--hidden-sizes

List of hidden sizes.

Default: [400, 200, 100, 50]

--dropout

Dropout rate for linear layers.

Default: 0.0

--init-type

Possible choices: uniform, normal, constant, glorot_uniform, glorot_normal

Distribution type for parameters initialization.

Default: “uniform”

--init-support

Parameters are initialized from a uniform distribution with support (-param_init, param_init), where param_init is the value given to this option. Use 0 to disable initialization.

Default: 0.1
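
The following sketch shows how a list of hidden sizes and a uniform initialization support might combine. It is an illustration in plain Python, not the tool's implementation: `init_layers` and `input_size` are hypothetical, and the real input dimension depends on the embedding sizes and window size.

```python
import random

def init_layers(input_size, hidden_sizes=(400, 200, 100, 50),
                init_support=0.1, seed=0):
    """Create one weight matrix per hidden layer, drawn uniformly from
    (-init_support, init_support)."""
    rng = random.Random(seed)
    layers = []
    fan_in = input_size
    for fan_out in hidden_sizes:
        weights = [[rng.uniform(-init_support, init_support)
                    for _ in range(fan_in)]
                   for _ in range(fan_out)]
        layers.append(weights)
        fan_in = fan_out  # the next layer consumes this layer's output
    return layers

# Small sizes so the example runs instantly.
layers = init_layers(input_size=8, hidden_sizes=(4, 2))
shapes = [(len(w), len(w[0])) for w in layers]
print(shapes)
```

Each matrix has shape (output size, input size), so consecutive entries of the hidden sizes list chain together.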