`kiwi.lib.search`¶

Module Contents¶

Classes¶

`RangeConfig`	Specify a continuous interval, or a discrete range when step is set.
`ClassWeightsConfig`	Specify the range to search in for the tag loss weights.
`SearchOptions`	Base class for all pydantic configs. Used to configure base behaviour of configs.
`Configuration`	Base class for all pydantic configs. Used to configure base behaviour of configs.
`Objective`	The objective to be optimized by the Optuna hyperparameter search.

Functions¶

`search_from_file`(filename: Path)	Load options from a config file and call the training procedure.
`search_from_configuration`(configuration_dict: dict)	Run the entire training pipeline using the configuration options received.
`get_suggestion`(trial, param_name: str, config: Union[List, RangeConfig]) → Union[bool, float, int]	Let the Optuna trial suggest a parameter value with name `param_name` based on the range configuration.
`setup_run`(directory: Path, seed: int, debug=False, quiet=False) → Path	Set up the output directory structure for the Optuna search outputs.
`run`(config: Configuration)	Run hyperparameter search according to the search configuration.

kiwi.lib.search.logger¶

class kiwi.lib.search.RangeConfig¶

Bases: kiwi.utils.io.BaseConfig

Specify a continuous interval, or a discrete range when step is set.

lower :float¶: The lower bound of the search range.

upper :float¶: The upper bound of the search range.

step :Optional[float]¶: Specify a step size to create a discrete range of search values.

distribution :Literal['uniform', 'loguniform'] = uniform¶: Specify the distribution over the search range. Uniform is recommended for all hyperparameters except for the learning rate, for which loguniform is recommended. Only works when a RangeConfig without step is specified.
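The three cases described by these fields (continuous uniform, continuous loguniform, and a discrete range when step is set) can be illustrated with a minimal standard-library sketch. `RangeSketch` and `sample` are hypothetical names used only for illustration; this is not how the library itself draws values, since in practice the sampling is delegated to Optuna.

```python
import math
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class RangeSketch:
    """Hypothetical stand-in that mirrors the RangeConfig fields."""

    lower: float
    upper: float
    step: Optional[float] = None
    distribution: str = "uniform"  # or "loguniform"


def sample(cfg: RangeSketch, rng: random.Random) -> float:
    """Draw one value from the search range described by cfg."""
    if cfg.step is not None:
        # Discrete range: lower, lower + step, ..., up to upper.
        # The distribution field is ignored in this case.
        num_points = int(math.floor((cfg.upper - cfg.lower) / cfg.step + 1e-9)) + 1
        return cfg.lower + cfg.step * rng.randrange(num_points)
    if cfg.distribution == "loguniform":
        # Uniform in log space, so every order of magnitude is equally likely.
        return math.exp(rng.uniform(math.log(cfg.lower), math.log(cfg.upper)))
    return rng.uniform(cfg.lower, cfg.upper)


rng = random.Random(0)
learning_rate = sample(RangeSketch(1e-6, 1e-3, distribution="loguniform"), rng)
dropout = sample(RangeSketch(0.0, 0.5, step=0.1), rng)
```

Sampling in log space is why loguniform suits the learning rate: values such as 1e-6 and 1e-4 are as likely to be tried as values close to 1e-3.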

class kiwi.lib.search.ClassWeightsConfig¶

Bases: kiwi.utils.io.BaseConfig

Specify the range to search in for the tag loss weights.

target_tags :Union[None, List[float], RangeConfig]¶: Loss weight for the target tags.

gap_tags :Union[None, List[float], RangeConfig]¶: Loss weight for the gap tags.

source_tags :Union[None, List[float], RangeConfig]¶: Loss weight for the source tags.

class kiwi.lib.search.SearchOptions¶

Bases: kiwi.utils.io.BaseConfig

Base class for all pydantic configs. Used to configure base behaviour of configs.

patience :int = 10¶: Number of training validations without improvement to wait before stopping training.

validation_steps :float = 0.2¶: Rely on the Kiwi training options to early stop bad models.

search_mlp :bool = False¶: To use or not to use an MLP after the encoder.

search_word_level :bool = False¶: Try with and without word-level output. Useful to figure out if word-level prediction is helping HTER regression performance.

search_hter :bool = False¶: Try with and without sentence-level output. Useful to figure out if HTER regression is helping word-level performance.

learning_rate :Union[None, List[float], RangeConfig]¶: Search the learning rate value.

dropout :Union[None, List[float], RangeConfig]¶: Search the dropout rate used in the decoder.

warmup_steps :Union[None, List[float], RangeConfig]¶: Search the number of steps to warm up the learning rate.

freeze_epochs :Union[None, List[float], RangeConfig]¶: Search the number of epochs to freeze the encoder.

class_weights :Union[None, ClassWeightsConfig]¶: Search the word-level tag loss weights.

sentence_loss_weight :Union[None, List[float], RangeConfig]¶: Search the weight to scale the sentence loss objective with.

hidden_size :Union[None, List[int], RangeConfig]¶: Search the hidden size of the MLP decoder.

bottleneck_size :Union[None, List[int], RangeConfig]¶: Search the size of the hidden layer in the decoder bottleneck.

search_method :Literal['random', 'tpe', 'multivariate_tpe'] = multivariate_tpe¶: Use random search or the (multivariate) Tree-structured Parzen Estimator (TPE). See optuna.samplers for more details about these methods.

check_consistency(cls, v, values)¶

class kiwi.lib.search.Configuration¶

Bases: kiwi.utils.io.BaseConfig

Base class for all pydantic configs. Used to configure base behaviour of configs.

base_config :Union[FilePath, train.Configuration]¶: Kiwi train configuration used as a base to configure the search models. Can be a path or a yaml configuration properly indented under this argument.

directory :Path¶: Output directory.

seed :int = 42¶: Make the search reproducible.

search_name :str¶: The name used by the Optuna MLflow integration. If None, Optuna will create a unique hashed name.

num_trials :int = 50¶: The number of search trials to run.

num_models_to_keep :int = 5¶: The number of model checkpoints that are kept after finishing search. The best checkpoints are kept and the rest are removed to free up space. Set this to -1 to keep all model checkpoints.

options :SearchOptions¶: Configure the search method and parameter ranges.

load_study :FilePath¶: Continue from a previous saved study, i.e. from a study.pkl file.

verbose :bool = False¶

quiet :bool = False¶

parse_base_config(cls, v)¶

kiwi.lib.search.search_from_file(filename: Path)¶

Load options from a config file and call the training procedure.

Parameters: filename – path of the configuration file.
Returns: an object with training information.

kiwi.lib.search.search_from_configuration(configuration_dict: dict)¶

Run the entire training pipeline using the configuration options received.

Parameters: configuration_dict – dictionary with options.
Returns: object with training information.
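As an illustration of what such a dictionary might look like, the sketch below assembles one from the field names documented for `Configuration`, `SearchOptions`, `ClassWeightsConfig` and `RangeConfig`. The paths, the search name and the chosen ranges are hypothetical, and the nested dictionaries assume that pydantic coerces them into the corresponding config classes.

```python
# All values below are illustrative; only the key names come from the
# documented configuration classes.
search_config = {
    "base_config": "config/train.yaml",  # hypothetical path to a Kiwi train config
    "directory": "optuna_runs",  # hypothetical output directory
    "seed": 42,
    "search_name": "example-search",  # hypothetical name
    "num_trials": 50,
    "num_models_to_keep": 5,
    "options": {
        "search_method": "multivariate_tpe",
        "patience": 10,
        "validation_steps": 0.2,
        # Continuous range, sampled in log space.
        "learning_rate": {"lower": 1e-6, "upper": 1e-3, "distribution": "loguniform"},
        # Explicit list of candidate values.
        "dropout": [0.0, 0.1, 0.2],
        # Discrete range created by setting step.
        "class_weights": {"target_tags": {"lower": 1.0, "upper": 5.0, "step": 1.0}},
    },
}

# With OpenKiwi installed and the base config file present, the search
# would presumably be launched like this (left commented out here):
# from kiwi.lib.search import search_from_configuration
# search_from_configuration(search_config)
```

Because `base_config` is typed as a `FilePath`, the file it points to has to exist when the configuration is validated.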

kiwi.lib.search.get_suggestion(trial, param_name: str, config: Union[List, RangeConfig]) → Union[bool, float, int]¶

Let the Optuna trial suggest a parameter value with name param_name based on the range configuration.

Parameters

trial – an Optuna trial
param_name (str) – the name of the parameter to suggest a value for
config (Union[List, RangeConfig]) – the parameter search space

Returns

The suggested parameter value.
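One plausible way to dispatch on the two accepted `config` types is sketched below. It is an assumption about the behaviour, not the library's code: a list is treated as a set of categorical choices, and a range is forwarded to a float suggestion. `RecordingTrial` is a hypothetical stand-in that borrows the method names `suggest_categorical` and `suggest_float` from Optuna's trial interface and simply records how it was called; `RangeSketch` and `suggest` are likewise illustrative names.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple, Union


@dataclass
class RangeSketch:
    """Hypothetical stand-in that mirrors the RangeConfig fields."""

    lower: float
    upper: float
    step: Optional[float] = None
    distribution: str = "uniform"


class RecordingTrial:
    """Hypothetical trial that records calls instead of sampling."""

    def __init__(self) -> None:
        self.calls: List[Tuple] = []

    def suggest_categorical(self, name: str, choices: List[Any]) -> Any:
        self.calls.append(("suggest_categorical", name, tuple(choices)))
        return choices[0]  # a real trial would pick one of the choices

    def suggest_float(self, name, low, high, *, step=None, log=False) -> float:
        self.calls.append(("suggest_float", name, low, high, step, log))
        return low  # a real trial would sample within [low, high]


def suggest(trial, param_name: str, config: Union[list, RangeSketch]):
    """Ask the trial for a value, choosing the call from the config type."""
    if isinstance(config, list):
        return trial.suggest_categorical(param_name, config)
    # The distribution only applies to ranges without a step.
    use_log = config.step is None and config.distribution == "loguniform"
    return trial.suggest_float(
        param_name, config.lower, config.upper, step=config.step, log=use_log
    )


trial = RecordingTrial()
suggest(trial, "dropout", [0.0, 0.1, 0.2])
suggest(trial, "learning_rate", RangeSketch(1e-6, 1e-3, distribution="loguniform"))
```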

kiwi.lib.search.setup_run(directory: Path, seed: int, debug=False, quiet=False) → Path¶: Set up the output directory structure for the Optuna search outputs.

class kiwi.lib.search.Objective(config: Configuration, base_config_dict: dict)¶

The objective to be optimized by the Optuna hyperparameter search.

The `__call__` method initializes a Kiwi training config based on Optuna parameter suggestions, trains Kiwi, and then returns the value of the main metric.

The path of each trained model is saved internally together with the objective value obtained for that model. These can be used to prune model checkpoints after completion of the search.

Parameters

config (Configuration) – the search configuration.
base_config_dict (dict) – the training configuration to serve as base, in dictionary form.

property main_metric(self) → str ¶

The main validation metric as it is formatted by the Kiwi trainer.

This can be used to access the main metric value after training via train_info.best_metrics[objective.main_metric].

property num_train_lines(self) → int ¶: The number of lines in the training data.

property updates_per_epochs(self) → int ¶: The number of parameter updates per epoch.

property best_model_paths(self) → List[Path]¶: Return the model paths sorted from high to low by their objective score.

property best_train_configs(self) → List[train.Configuration]¶: Return the train configs sorted from high to low by their objective score.

prune_models(self, num_models_to_keep) → None ¶: Keep only the best model checkpoints and remove the rest to free up space.
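The pruning idea can be sketched with the standard library as follows. The helper `prune_checkpoints` and the file names are hypothetical. The sketch assumes that a higher objective score is better, matching the high-to-low ordering described above, and that a negative `num_models_to_keep` keeps everything, as documented for the `Configuration` field of the same name.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def prune_checkpoints(
    scored: List[Tuple[Path, float]], num_models_to_keep: int
) -> List[Path]:
    """Delete all but the best-scoring files and return the kept paths."""
    # Rank from high to low by objective score.
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    if num_models_to_keep < 0:
        # A negative value (for example -1) keeps every checkpoint.
        return [path for path, _ in ranked]
    for path, _ in ranked[num_models_to_keep:]:
        if path.exists():
            path.unlink()
    return [path for path, _ in ranked[:num_models_to_keep]]


# Small demonstration on placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    scored = []
    for name, score in [("a.ckpt", 0.31), ("b.ckpt", 0.47), ("c.ckpt", 0.42)]:
        path = Path(tmp) / name
        path.write_text("placeholder")
        scored.append((path, score))
    kept_names = [p.name for p in prune_checkpoints(scored, num_models_to_keep=2)]
    remaining = sorted(p.name for p in Path(tmp).iterdir())
```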

suggest_train_config(self, trial) → Tuple[train.Configuration, dict]¶

Use the trial to suggest values to initialize a training configuration.

Parameters: trial – An Optuna trial to make hyperparameter suggestions.
Returns: A Kiwi train configuration and a dictionary with the suggested Optuna parameter names and values that were set in the train config.

__call__(self, trial) → float ¶

Train Kiwi with the hyperparameter values suggested by the trial and return the value of the main metric.

Parameters: trial – An Optuna trial to make hyperparameter suggestions.
Returns: A float with the value obtained by the Kiwi model, as measured by the main metric configured for the model.

kiwi.lib.search.run(config: Configuration)¶

Run hyperparameter search according to the search configuration.

Parameters: config (Configuration) – search configuration
Returns: an optuna study summarizing the search results
