Train Interface

Note: Args that start with '--' (e.g. --save-config) can also be set in a config file (specified via --config). The config file uses YAML syntax and must represent a YAML mapping (for details, see http://learn.getgrav.org/advanced/yaml). If an arg is specified in more than one place, command line values override config file values, which in turn override defaults.
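The precedence rule in the note above can be sketched in a few lines of Python. This is only an illustration of the lookup order, not Kiwi's actual option parser: `resolve_options` and the sample dictionaries are hypothetical, and the config keys are assumed to be the argument names without the leading dashes.

```python
from collections import ChainMap

# A few of the defaults documented on this page.
DEFAULTS = {"epochs": 50, "train-batch-size": 64, "optimizer": "adam"}


def resolve_options(cli_args, config_file_values, defaults=DEFAULTS):
    """Merge options so that command line > config file > defaults.

    Hypothetical helper for illustration; not part of the kiwi package.
    """
    # ChainMap searches its maps left to right, so the first map wins.
    return dict(ChainMap(cli_args, config_file_values, defaults))


# Values as they might be read from a YAML mapping (assumed key names).
config_file_values = {"epochs": 10, "optimizer": "sgd"}
# Values as they might come from `--epochs 3` on the command line.
cli_args = {"epochs": 3}

resolved = resolve_options(cli_args, config_file_values)
print(resolved)
```

Here `epochs` comes from the command line, `optimizer` from the config file, and `train-batch-size` falls back to its default.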

For generic options see: here

Model specific options can be seen here:

Usage examples of the Python API.

Run:

import kiwi

train_predest_config = 'experiments/examples/train_predest.yaml'
kiwi.train(train_predest_config)

train_nuqe_config = 'experiments/examples/train_nuqe.yaml'
kiwi.train(train_nuqe_config)

Usage example of the CLI.

Run:

kiwi train --config {model_config_file} [OPTS]

usage: kiwi train [-h] [--epochs EPOCHS] [--train-batch-size TRAIN_BATCH_SIZE]
                  [--valid-batch-size VALID_BATCH_SIZE]
                  [--optimizer {sgd,adagrad,adadelta,adam,sparseadam}]
                  [--learning-rate LEARNING_RATE]
                  [--learning-rate-decay LEARNING_RATE_DECAY]
                  [--learning-rate-decay-start LEARNING_RATE_DECAY_START]
                  [--checkpoint-validation-steps CHECKPOINT_VALIDATION_STEPS]
                  [--checkpoint-save [CHECKPOINT_SAVE]]
                  [--checkpoint-keep-only-best CHECKPOINT_KEEP_ONLY_BEST]
                  [--checkpoint-early-stop-patience CHECKPOINT_EARLY_STOP_PATIENCE]
                  [--resume [RESUME]]

training

--epochs

Number of epochs for training.

Default: 50

--train-batch-size
 

Maximum batch size for training.

Default: 64

--valid-batch-size
 

Maximum batch size for evaluating.

Default: 64

training-optimization

--optimizer

Possible choices: sgd, adagrad, adadelta, adam, sparseadam

Optimization method.

Default: “adam”

--learning-rate
 

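Starting learning rate. Recommended settings: sgd = 1, adagrad = 0.1, adadelta = 1, adam = 0.001. Note that the default value does not match the recommended setting for the default optimizer (adam), so set it explicitly when using adam.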

Default: 1.0

--learning-rate-decay
 

Decay learning rate by this factor.

Default: 1.0

--learning-rate-decay-start
 

Start decay after this epoch.

Default: 0
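How --learning-rate, --learning-rate-decay and --learning-rate-decay-start might interact can be sketched as below. The exact schedule is an assumption: the sketch multiplies the rate by the decay factor once for every epoch after the start epoch, with epochs counted from 1. Kiwi's real behavior may differ, and `learning_rate_at_epoch` is a hypothetical helper, not a Kiwi function.

```python
def learning_rate_at_epoch(epoch, learning_rate=1.0, decay=1.0, decay_start=0):
    """Return the learning rate used in a given (1-based) epoch.

    Assumption: the rate is multiplied by `decay` once for every epoch
    that comes after `decay_start`. This is an illustrative schedule only.
    """
    decayed_epochs = max(0, epoch - decay_start)
    return learning_rate * decay ** decayed_epochs


# With a decay factor of 0.5 starting after epoch 2, epochs 1 and 2 keep the
# starting rate and every later epoch halves it.
schedule = [
    learning_rate_at_epoch(epoch, learning_rate=1.0, decay=0.5, decay_start=2)
    for epoch in range(1, 5)
]
print(schedule)
```

With the default decay factor of 1.0, the rate stays constant regardless of the start epoch.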

training-save-load

--checkpoint-validation-steps
 

Run validation every X training batches. The model is saved at each validation if --checkpoint-save is true.

Default: 0

--checkpoint-save
 

Save a training snapshot whenever validation is run. If false, the model is never saved.

Default: True

--checkpoint-keep-only-best
 

Keep only the n best models according to the main metric (F1Mult by default); 0 keeps all.

Default: 1
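The selection rule behind --checkpoint-keep-only-best can be illustrated with a short sketch. It assumes that a higher main-metric value is better and uses made-up checkpoint names and scores; `checkpoints_to_keep` is a hypothetical helper, not part of Kiwi.

```python
def checkpoints_to_keep(scores, keep_only_best=1):
    """Return the checkpoint names to keep, best first.

    `scores` maps a checkpoint name to its main-metric value
    (assumed: higher is better). A value of 0 keeps every checkpoint.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    if keep_only_best == 0:
        return ranked
    return ranked[:keep_only_best]


# Made-up validation scores for three snapshots.
scores = {"step_100": 0.41, "step_200": 0.47, "step_300": 0.44}

best_one = checkpoints_to_keep(scores, keep_only_best=1)
best_two = checkpoints_to_keep(scores, keep_only_best=2)
keep_all = checkpoints_to_keep(scores, keep_only_best=0)
print(best_one, best_two, keep_all)
```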

--checkpoint-early-stop-patience
 

Stop training if evaluation metrics do not improve after X validations; 0 disables this.

Default: 0
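A minimal sketch of a patience-based stopping rule, matching the description of --checkpoint-early-stop-patience. It assumes a single metric where higher is better and treats "no improvement" as not exceeding the best earlier value; `should_stop` is a hypothetical helper and Kiwi's own check may be implemented differently.

```python
def should_stop(metric_history, patience=0):
    """Decide whether training should stop early.

    `metric_history` holds one metric value per validation run, oldest first
    (assumed: higher is better). Returns True when none of the last
    `patience` validations beat the best earlier value. A patience of 0
    disables early stopping.
    """
    if patience == 0 or len(metric_history) <= patience:
        return False
    best_before = max(metric_history[:-patience])
    best_recent = max(metric_history[-patience:])
    return best_recent <= best_before


# Made-up validation results: the best score (0.45) was reached two
# validations ago and has not been beaten since.
history = [0.40, 0.45, 0.44, 0.43]

stop_with_patience_2 = should_stop(history, patience=2)
stop_with_patience_3 = should_stop(history, patience=3)
stop_when_disabled = should_stop(history, patience=0)
print(stop_with_patience_2, stop_with_patience_3, stop_when_disabled)
```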

--resume

Resume training from a previous run. If --output-dir is set, Kiwi will load from a checkpoint folder in that location. If --output-dir is not specified, then the --run-uuid option must be set. Files are then searched for under the "runs" directory. If they are not found there, they are downloaded from the MLflow server (check the --mlflow-tracking-uri option).

Default: False