`kiwi.lib.train`¶

Module Contents¶

Classes¶

`TrainRunInfo`	Encapsulate relevant information on training runs.
`RunConfig`	Options for each run.
`CheckpointsConfig`	Base class for all pydantic configs. Used to configure base behaviour of configs.
`GPUConfig`	Base class for all pydantic configs. Used to configure base behaviour of configs.
`TrainerConfig`	Base class for all pydantic configs. Used to configure base behaviour of configs.
`Configuration`	Base class for all pydantic configs. Used to configure base behaviour of configs.

Functions¶

`train_from_file`(filename) → TrainRunInfo	Load options from a config file and call the training procedure.
`train_from_configuration`(configuration_dict) → TrainRunInfo	Run the entire training pipeline using the configuration options received.
`setup_run`(config: RunConfig, debug=False, quiet=False, anchor_dir: Path = None) → Tuple[Path, Optional[MLFlowTrackingLogger]]	Prepare for running the training pipeline.
`run`(config: Configuration, system_type: Union[Type[TLMSystem], Type[QESystem]] = QESystem) → TrainRunInfo	Instantiate the system according to the configuration and train it.

kiwi.lib.train.logger¶

class kiwi.lib.train.TrainRunInfo¶

Encapsulate relevant information on training runs.

model :QESystem¶: The last model when training finished.

best_metrics :Dict[str, Any]¶: Mapping of metrics of the best model.

best_model_path :Optional[Path]¶: Path of the best model, if it was saved to disk.

class kiwi.lib.train.RunConfig¶

Bases: kiwi.utils.io.BaseConfig

Options for each run.

seed :int = 42¶: Random seed

experiment_name :str = default¶: If using MLflow, it will log this run under this experiment name, which appears as a separate section in the UI. It will also be used in some messages and files.

output_dir :Path¶: Output several files for this run under this directory. If not specified, a directory under “./runs/” is created or reused based on the run_id. Files might also be sent to MLflow depending on the mlflow_always_log_artifacts option.

run_id :str¶: If specified, MLflow/Default Logger will log metrics and params under this ID. If it exists, the run status will change to running. This ID is also used for creating this run’s output directory if output_dir is not specified (Run ID must be a 32-character hex string).

use_mlflow :bool = False¶: Whether to use MLflow for tracking this run. If MLflow is not installed, a message is shown.

mlflow_tracking_uri :str = mlruns/¶: If using MLflow, logs model parameters, training metrics, and artifacts (files) to this MLflow server. Uses the localhost by default.

mlflow_always_log_artifacts :bool = False¶: If using MLflow, always log (send) artifacts (files) to the MLflow artifacts URI. By default (false), artifacts are only logged if MLflow is a remote server (as specified by the --mlflow-tracking-uri option). All generated files are always saved in --output-dir, so it might be considered redundant to copy them to a local MLflow server. If this is not the case, set this option to true.
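The run_id constraint and the artifact-logging rule described above can be sketched in plain Python. This is an illustration only, not OpenKiwi's code: `is_valid_run_id` and `should_log_artifacts` are hypothetical helpers, and treating only `http`/`https` tracking URIs as "remote" is an assumption made for the example.

```python
import re
from urllib.parse import urlparse

# A run ID must be a 32-character hex string.
_RUN_ID_RE = re.compile(r"[0-9a-f]{32}")


def is_valid_run_id(run_id: str) -> bool:
    """Check the documented shape of a run ID (hypothetical helper)."""
    return _RUN_ID_RE.fullmatch(run_id.lower()) is not None


def should_log_artifacts(tracking_uri: str, always_log: bool = False) -> bool:
    """Decide whether files would be sent to MLflow (hypothetical helper).

    Assumption: a tracking URI counts as remote only if it uses http(s).
    """
    if always_log:
        # Mirrors mlflow_always_log_artifacts=True: send regardless of location.
        return True
    return urlparse(tracking_uri).scheme in ("http", "https")
```

With the default `mlruns/` tracking URI and `always_log=False`, the second helper returns False, which matches the documented default of keeping files only in the output directory.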

class kiwi.lib.train.CheckpointsConfig¶

Bases: kiwi.utils.io.BaseConfig

Base class for all pydantic configs. Used to configure base behaviour of configs.

validation_steps :Union[confloat(gt=0.0, le=1.0), PositiveInt] = 1.0¶: How often within one training epoch to check the validation set. If float, % of training epoch. If int, check every n batches.

save_top_k :int = 1¶: Save and keep only k best models according to main metric; -1 will keep all; 0 will never save a model.

early_stop_patience :conint(ge=0) = 0¶: Stop training if evaluation metrics do not improve after X validations; 0 disables this.
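As a rough illustration of how these options could be interpreted, the sketch below converts `validation_steps` into a number of batches and tracks early stopping. Both `validation_interval` and `EarlyStopper` are hypothetical names, the rounding behaviour is an assumption, and the actual trainer may handle these details differently.

```python
from typing import Optional, Union


def validation_interval(validation_steps: Union[float, int], batches_per_epoch: int) -> int:
    """Number of training batches between two validation runs."""
    if isinstance(validation_steps, int):
        if validation_steps <= 0:
            raise ValueError("an integer validation_steps must be positive")
        return validation_steps  # validate every n batches
    if not 0.0 < validation_steps <= 1.0:
        raise ValueError("a float validation_steps must be in (0.0, 1.0]")
    # A float is a fraction of one epoch; never go below one batch.
    return max(1, int(batches_per_epoch * validation_steps))


class EarlyStopper:
    """Stop after `patience` validations without improvement; 0 disables."""

    def __init__(self, patience: int, higher_is_better: bool = True) -> None:
        self.patience = patience
        self.higher_is_better = higher_is_better
        self.best: Optional[float] = None
        self.bad_validations = 0

    def update(self, metric: float) -> bool:
        """Record one validation result and return True if training should stop."""
        if self.best is None:
            improved = True
        elif self.higher_is_better:
            improved = metric > self.best
        else:
            improved = metric < self.best
        if improved:
            self.best = metric
            self.bad_validations = 0
        else:
            self.bad_validations += 1
        return self.patience > 0 and self.bad_validations >= self.patience
```

For instance, with 200 batches per epoch, a value of 0.25 would validate every 50 batches, while the integer 10 would validate every 10 batches.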

class kiwi.lib.train.GPUConfig¶

Bases: kiwi.utils.io.BaseConfig

Base class for all pydantic configs. Used to configure base behaviour of configs.

gpus :Union[int, List[int]] = 0¶: Use the number of GPUs specified if int, where 0 is no GPU. -1 is all GPUs. Alternatively, if a list, uses the GPU-ids specified (e.g., [0, 2]).

precision :Literal[16, 32] = 32¶: The floating point precision to be used while training the model. Available options are 32 or 16 bits.

amp_level :Literal['O0', 'O1', 'O2', 'O3'] = O0¶: The automatic-mixed-precision level to use. O0 is FP32 training. O1 is mixed precision training as popularized by NVIDIA Apex. O2 casts the model weights to FP16 but keeps certain master weights and batch norm in FP32 without patching Torch functions. O3 is full FP16 training.

setup_gpu_ids(cls, v)¶: If asking to use CPU, let it be, outputting a warning if GPUs are available. If asking to use any GPU but none are available, fall back to CPU and warn user.
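The fallback behaviour described for `gpus` can be sketched as follows. This is not the validator's source: `resolve_gpu_ids` is a hypothetical function, the number of available devices is passed in explicitly instead of being queried from the framework, and reading an integer n as "the first n devices" is an assumption.

```python
import warnings
from typing import List, Union


def resolve_gpu_ids(gpus: Union[int, List[int]], available: int) -> List[int]:
    """Return the GPU ids to use; an empty list means CPU."""
    if isinstance(gpus, int) and gpus == 0:
        # CPU was requested: honour it, but mention unused GPUs.
        if available > 0:
            warnings.warn("GPUs are available but CPU was requested.")
        return []
    if available == 0:
        # A GPU was requested but none exists: fall back to CPU.
        warnings.warn("No GPU available; falling back to CPU.")
        return []
    if isinstance(gpus, int):
        # -1 selects every device; n selects the first n (assumption).
        return list(range(available)) if gpus == -1 else list(range(gpus))
    return list(gpus)  # explicit ids such as [0, 2]
```

The sketch does not check whether more devices were requested than exist; a real implementation would need to decide how to handle that case.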

setup_amp_level(cls, v, values)¶: If precision is set to 16, amp_level needs to be greater than O0. Following the same logic, if amp_level is set to greater than O0, precision needs to be set to 16.
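The consistency rule between `precision` and `amp_level` is simple enough to state as a predicate. The function below is a hypothetical sketch that only reports whether a combination is consistent; this reference does not say what the validator does on a mismatch, so no correction is attempted here.

```python
def precision_and_amp_agree(precision: int, amp_level: str) -> bool:
    """True if 16-bit precision is paired with an AMP level above O0, and vice versa."""
    if precision not in (16, 32):
        raise ValueError("precision must be 16 or 32")
    if amp_level not in ("O0", "O1", "O2", "O3"):
        raise ValueError("amp_level must be one of O0, O1, O2, O3")
    uses_half_precision = amp_level != "O0"
    # Both sides must agree: 16-bit with O1..O3, or 32-bit with O0.
    return (precision == 16) == uses_half_precision
```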

class kiwi.lib.train.TrainerConfig¶

Bases: kiwi.lib.train.GPUConfig

Base class for all pydantic configs. Used to configure base behaviour of configs.

resume :bool = False¶: Resume training a previous run. The run.run_id (and possibly run.experiment_name) option must be specified. Files are then searched under the “runs” directory. If not found, they are downloaded from the MLflow server (check the mlflow_tracking_uri option).

epochs :int = 50¶: Number of epochs for training.

gradient_accumulation_steps :int = 1¶: Accumulate gradients for the given number of steps (batches) before updating the model weights.

gradient_max_norm :float = 0.0¶: Clip gradients with norm above this value; by default (0.0), do not clip.
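To make the two gradient options concrete, here is a framework-free sketch using plain lists of floats as gradients. `clip_by_norm` and `accumulate_and_step` are hypothetical names; summing (not averaging) the accumulated gradients and dropping a trailing incomplete group are simplifications, and real trainers may differ on both points.

```python
import math
from typing import List, Optional


def clip_by_norm(grads: List[float], max_norm: float) -> List[float]:
    """Rescale grads so their L2 norm is at most max_norm; 0.0 disables clipping."""
    if max_norm <= 0.0:
        return list(grads)
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


def accumulate_and_step(
    batch_grads: List[List[float]],
    accumulation_steps: int,
    max_norm: float = 0.0,
) -> List[List[float]]:
    """Return the gradient that would be applied at each weight update."""
    updates: List[List[float]] = []
    total: Optional[List[float]] = None
    for i, grads in enumerate(batch_grads, start=1):
        # Add this batch's gradient to the running sum.
        total = list(grads) if total is None else [t + g for t, g in zip(total, grads)]
        if i % accumulation_steps == 0:
            # Enough batches seen: clip, record the update, and reset.
            updates.append(clip_by_norm(total, max_norm))
            total = None
    return updates  # a trailing incomplete group is dropped in this sketch
```

With `accumulation_steps=1` and `max_norm=0.0` (the defaults listed above), every batch produces one unclipped update.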

main_metric :Union[str, List[str]]¶: Primary metric (or list of metrics) for this run.

log_interval :int = 100¶: Log every k batches.

log_save_interval :int = 100¶: Save accumulated log every k batches (does not seem to matter to MLflow logging).

checkpoint :CheckpointsConfig¶

deterministic :bool = True¶: If true, enables cudnn.deterministic. Might make training slower, but ensures reproducibility.

class kiwi.lib.train.Configuration¶

Bases: kiwi.utils.io.BaseConfig

Base class for all pydantic configs. Used to configure base behaviour of configs.

run :RunConfig¶: Options specific to each run

trainer :TrainerConfig¶

data :WMTQEDataset.Config¶

system :QESystem.Config¶

debug :bool = False¶: Run training in fast_dev mode; only one batch is used for training and validation. This is useful to test out new models.

verbose :bool = False¶

quiet :bool = False¶

kiwi.lib.train.train_from_file(filename) → TrainRunInfo ¶

Load options from a config file and call the training procedure.

Parameters: filename – path of the configuration file.
Returns: an object with training information.

kiwi.lib.train.train_from_configuration(configuration_dict) → TrainRunInfo ¶

Run the entire training pipeline using the configuration options received.

Parameters: configuration_dict – dictionary with options.

Returns: an object with training information.
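A dictionary passed to `train_from_configuration` would presumably mirror the fields of `Configuration` documented above. The sketch below assembles such a dictionary from the documented `run` and `trainer` options; `build_configuration` and `train` are hypothetical helpers, the `data` and `system` sections are left to the caller because their fields are not covered in this reference, and the `main_metric` value is a placeholder.

```python
from typing import Any, Dict


def build_configuration(data: Dict[str, Any], system: Dict[str, Any]) -> Dict[str, Any]:
    """Assemble a dictionary shaped like `Configuration` (hypothetical helper)."""
    return {
        "run": {"seed": 42, "experiment_name": "default", "use_mlflow": False},
        "trainer": {
            "epochs": 50,
            "gpus": 0,
            "main_metric": "PEARSON",  # placeholder: use a metric your system reports
            "checkpoint": {
                "validation_steps": 1.0,
                "save_top_k": 1,
                "early_stop_patience": 0,
            },
        },
        "data": data,  # WMTQEDataset.Config options, supplied by the caller
        "system": system,  # QESystem.Config options, supplied by the caller
        "debug": False,
    }


def train(data: Dict[str, Any], system: Dict[str, Any]):
    """Run training with the assembled options; requires OpenKiwi to be installed."""
    # Imported lazily so the rest of the sketch works without the package.
    from kiwi.lib.train import train_from_configuration

    return train_from_configuration(build_configuration(data, system))
```

The returned `TrainRunInfo` would then expose `model`, `best_metrics`, and `best_model_path` as described earlier.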

kiwi.lib.train.setup_run(config: RunConfig, debug=False, quiet=False, anchor_dir: Path = None) → Tuple[Path, Optional[MLFlowTrackingLogger]]¶

Prepare for running the training pipeline.

This includes setting up the output directory, random seeds, and loggers.

Parameters

config – configuration options.
quiet – whether to suppress info log messages.
debug – whether to additionally log debug messages (`quiet` takes precedence).
anchor_dir – directory to use as root for paths.

Returns

a tuple with the resolved path to the output directory and the experiment logger (None if not configured).
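The output-directory and seeding steps might look roughly like the sketch below. It is an assumption-heavy illustration: `resolve_output_dir` and `seed_everything` are hypothetical names, the `runs/<run_id>` layout is a guess at what "a directory under ./runs/" means, and only Python's own random module is seeded here.

```python
import random
from pathlib import Path
from typing import Optional


def resolve_output_dir(
    output_dir: Optional[Path],
    run_id: str,
    anchor_dir: Optional[Path] = None,
) -> Path:
    """Pick the run's output directory, resolving relative paths against anchor_dir."""
    root = Path(anchor_dir) if anchor_dir is not None else Path.cwd()
    # Without an explicit output_dir, fall back to runs/<run_id> (assumed layout).
    chosen = Path(output_dir) if output_dir is not None else Path("runs") / run_id
    return chosen if chosen.is_absolute() else root / chosen


def seed_everything(seed: int = 42) -> None:
    """Seed Python's RNG; a real setup would also seed NumPy and the DL framework."""
    random.seed(seed)
```

The sketch only computes the path; creating the directory and configuring loggers are left out.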

kiwi.lib.train.run(config: Configuration, system_type: Union[Type[TLMSystem], Type[QESystem]] = QESystem) → TrainRunInfo ¶

Instantiate the system according to the configuration and train it.

Load or create a trainer for doing it.

Parameters

config – generic training options.
system_type – class of system being used.

Returns

an object with training information.
