`kiwi.lib.predict`

Module Contents

Classes

`RunConfig`	Base class for all pydantic configs. Used to configure base behaviour of configs.
`Configuration`	Base class for all pydantic configs. Used to configure base behaviour of configs.

Functions

`load_system`(system_path: Union[str, Path], gpu_id: Optional[int] = None)	Load a pretrained system (model) into a Runner object.
`predict_from_configuration`(configuration_dict: Dict[str, Any])	Run the entire prediction pipeline using the configuration options received.
`run`(config: Configuration, output_dir: Path) → Tuple[Dict[str, List], Optional[MetricsReport]]	Run the prediction pipeline.
`make_predictions`(output_dir: Path, best_model_path: Path, data_partition: Literal['train', 'valid', 'test'], data_config: WMTQEDataset.Config, outputs_config: QEOutputs.Config = None, batch_size: Union[int, BatchSizeConfig] = None, num_workers: int = 0, gpu_id: int = None)	Make predictions over the selected data partition using the best model created during training.
`setup_run`(config: RunConfig, quiet=False, debug=False, anchor_dir: Path = None) → Path	Prepare for running the prediction pipeline.

kiwi.lib.predict.logger

class kiwi.lib.predict.RunConfig

Bases: kiwi.utils.io.BaseConfig

Base class for all pydantic configs. Used to configure base behaviour of configs.

seed :int = 42

Random seed.

run_id :str

If specified, MLflow/Default Logger will log metrics and params under this ID. If it exists, the run status will change to running. This ID is also used for creating this run’s output directory. The run ID must be a 32-character hex string.

output_dir :Path

Output several files for this run under this directory. If not specified, a directory under “runs” is created or reused based on the Run UUID.

predict_on_data_partition :Literal['train', 'valid', 'test'] = test

Name of the data partition to predict upon. File names are read from the corresponding data configuration field.

check_consistency(cls, v, values)
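The 32-character hex format required for run_id is the same shape that Python's standard uuid module produces. The following is a minimal sketch of generating and checking such an ID yourself; the helper names are hypothetical and are not part of kiwi.

```python
import re
import uuid

# Hypothetical helpers for illustration; they are not part of kiwi.
_RUN_ID_PATTERN = re.compile(r"[0-9a-f]{32}")


def new_run_id() -> str:
    """Return a random 32-character lowercase hex string."""
    return uuid.uuid4().hex


def is_valid_run_id(run_id: str) -> bool:
    """Check that run_id consists of exactly 32 hex characters."""
    return _RUN_ID_PATTERN.fullmatch(run_id.lower()) is not None
```

A value from new_run_id() could then be supplied as run_id, for example to reuse the same output directory across runs.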

class kiwi.lib.predict.Configuration

Bases: kiwi.utils.io.BaseConfig

Base class for all pydantic configs. Used to configure base behaviour of configs.

run :RunConfig

data :WMTQEDataset.Config

system :QESystem.Config

use_gpu :bool = False

If true, and only if CUDA is available, use the CUDA device specified in gpu_id, or the first CUDA device if gpu_id is not set. Otherwise, use the CPU.

gpu_id :Optional[int]

Use CUDA on the listed device, only if use_gpu is true.

verbose :bool = False

quiet :bool = False

enforce_loading(cls, v)

setup_gpu(cls, v)

setup_gpu_id(cls, v, values)
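The interaction between use_gpu and gpu_id described above can be sketched as a small pure function. This is an illustration of the documented rule, not the validators' actual code: the function name is hypothetical, CUDA availability is passed in as a plain boolean, and "the first CUDA device" is assumed to mean index 0.

```python
from typing import Optional


def resolve_device(
    use_gpu: bool, gpu_id: Optional[int], cuda_available: bool
) -> Optional[int]:
    """Return the CUDA device index to use, or None for the CPU.

    Hypothetical helper mirroring the documented behaviour of
    use_gpu and gpu_id; it is not part of kiwi.
    """
    # gpu_id only matters when use_gpu is true and CUDA is available.
    if not (use_gpu and cuda_available):
        return None
    # Assumption: "the first CUDA device" is device index 0.
    return 0 if gpu_id is None else gpu_id
```

In practice the cuda_available argument would typically come from a check such as torch.cuda.is_available().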

kiwi.lib.predict.load_system(system_path: Union[str, Path], gpu_id: Optional[int] = None)

Load a pretrained system (model) into a Runner object.

Parameters

system_path – A path to the saved checkpoint file produced by a training run.
gpu_id – id of the gpu to load the model into (-1 or None to use CPU)

Raises

Exception – if the path does not exist, or is not a valid system file.
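Because load_system signals a missing path with a generic Exception, a caller may prefer to check the path first and fail with a more specific error. A minimal sketch, assuming only the load_system signature documented above; the wrapper name is hypothetical, and nothing is assumed about the methods of the returned Runner object.

```python
from pathlib import Path
from typing import Optional, Union


def load_system_checked(
    system_path: Union[str, Path], gpu_id: Optional[int] = None
):
    """Hypothetical wrapper: validate the path, then delegate to load_system."""
    path = Path(system_path)
    if not path.is_file():
        raise FileNotFoundError(f"No checkpoint file at {path}")
    # Imported lazily so the path check works even where kiwi is not installed.
    from kiwi.lib.predict import load_system

    # gpu_id=None (or -1) loads the model on the CPU, per the parameter docs.
    return load_system(path, gpu_id=gpu_id)
```

A file that exists but is not a valid system file would still raise from inside load_system itself.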

kiwi.lib.predict.predict_from_configuration(configuration_dict: Dict[str, Any])

Run the entire prediction pipeline using the configuration options received.
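As a rough illustration of what configuration_dict might look like, the sketch below assumes its top-level keys mirror the fields of the Configuration class documented above. The fields of WMTQEDataset.Config and QESystem.Config are not listed in this section, so they are left as empty placeholders, and the key-checking helper is hypothetical.

```python
from typing import Any, Dict, Set

# Top-level keys taken from the Configuration class fields in this section.
TOP_LEVEL_KEYS = {"run", "data", "system", "use_gpu", "gpu_id", "verbose", "quiet"}


def unknown_top_level_keys(configuration_dict: Dict[str, Any]) -> Set[str]:
    """Hypothetical helper: return keys that are not Configuration fields."""
    return set(configuration_dict) - TOP_LEVEL_KEYS


configuration_dict: Dict[str, Any] = {
    "run": {"seed": 42, "predict_on_data_partition": "test"},
    "data": {},    # placeholder: WMTQEDataset.Config fields are not shown here
    "system": {},  # placeholder: QESystem.Config fields are not shown here
    "use_gpu": False,
    "quiet": True,
}

# With real data and system options filled in, the pipeline would be run as:
# from kiwi.lib.predict import predict_from_configuration
# predict_from_configuration(configuration_dict)
```

The empty placeholders would not pass validation as they stand; they only mark where the data and system options go.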

kiwi.lib.predict.run(config: Configuration, output_dir: Path) → Tuple[Dict[str, List], Optional[MetricsReport]]

Run the prediction pipeline.

Load the model and necessary files and create the model’s predictions for the configured data partition.

Parameters

config – validated configuration values for the (predict) pipeline.
output_dir – directory where to save predictions.

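Returns

A tuple whose first element is the predictions dictionary, in the format {'target': predictions}, and whose second element is an optional MetricsReport.

Return type

Tuple[Dict[str, List], Optional[MetricsReport]]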
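Since the signature of run declares a two-element tuple, callers need to unpack it before using the predictions. A minimal sketch of handling that return shape, using a hypothetical helper and stand-in data rather than a real call to run; the MetricsReport is treated as an opaque object that may be None.

```python
from typing import Any, Dict, List, Optional, Tuple


def summarize_result(
    result: Tuple[Dict[str, List], Optional[Any]]
) -> Tuple[Dict[str, int], bool]:
    """Hypothetical helper: count predictions per output and flag metrics."""
    predictions, metrics = result
    # One entry per output name (e.g. 'target'), each holding a list of values.
    counts = {name: len(values) for name, values in predictions.items()}
    return counts, metrics is not None


# Stand-in for the value returned by run(config, output_dir).
example_result = ({"target": [0.12, 0.47, 0.91]}, None)
counts, has_metrics = summarize_result(example_result)
```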

kiwi.lib.predict.make_predictions(output_dir: Path, best_model_path: Path, data_partition: Literal['train', 'valid', 'test'], data_config: WMTQEDataset.Config, outputs_config: QEOutputs.Config = None, batch_size: Union[int, BatchSizeConfig] = None, num_workers: int = 0, gpu_id: int = None)

Make predictions over the selected data partition using the best model created during training.

Parameters

output_dir – directory where predictions should be saved.
best_model_path – path pointing to the checkpoint with best performance.
data_partition – on which dataset to predict (one of 'train', 'valid', 'test').
data_config – configuration containing options for the data_partition set.
outputs_config – configuration specifying which outputs to activate.
batch_size – batch size to use for predicting.
num_workers – number of parallel data loaders.
gpu_id – GPU to use for predicting; 0 for CPU.

Returns

A dictionary with predictions in the format {'target': predictions}.

kiwi.lib.predict.setup_run(config: RunConfig, quiet=False, debug=False, anchor_dir: Path = None) → Path

Prepare for running the prediction pipeline.

This includes setting up the output directory, random seeds, and loggers.

Parameters

config – configuration options.
quiet – whether to suppress info log messages.
debug – whether to additionally log debug messages (quiet takes precedence).
anchor_dir – directory to use as root for paths.

Returns

the resolved path to the output directory.
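The setup steps described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's implementation: the function name is hypothetical, the fallback layout runs/<run_id> is one reading of the output_dir description, anchor_dir is assumed to prefix relative paths only, and logger configuration is omitted.

```python
import random
import uuid
from pathlib import Path
from typing import Optional


def sketch_setup_run(
    seed: int = 42,
    run_id: Optional[str] = None,
    output_dir: Optional[Path] = None,
    anchor_dir: Optional[Path] = None,
) -> Path:
    """Hypothetical sketch: seed the RNG and resolve the output directory."""
    random.seed(seed)  # a real pipeline would seed its other libraries as well
    if output_dir is None:
        # Assumed layout: a directory under "runs" named after the run ID.
        output_dir = Path("runs") / (run_id or uuid.uuid4().hex)
    output_dir = Path(output_dir)
    if anchor_dir is not None and not output_dir.is_absolute():
        # Assumption: anchor_dir acts as the root for relative paths.
        output_dir = Path(anchor_dir) / output_dir
    # Create the directory, or reuse it if it already exists.
    output_dir.mkdir(parents=True, exist_ok=True)
    return output_dir.resolve()
```

Calling it twice with the same run_id and anchor_dir returns the same directory, which matches the "created or reused" wording in the output_dir description.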
