kiwi.modules.common.scalar_mix
ScalarMixWithDropout
Compute a parameterised scalar mixture of N tensors.
Bases: torch.nn.Module
\(\text{mixture} = \gamma \cdot \sum_k (s_k \cdot \text{tensor}_k)\), where \(s = \text{softmax}(w)\), with \(w_k\) (one per tensor) and \(\gamma\) scalar parameters.
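The formula above can be sketched in plain Python for flat lists of floats. This is only an illustration of the arithmetic, not the module's implementation: the helper names `softmax` and `scalar_mix` are hypothetical, and real inputs would be `torch` tensors.

```python
import math


def softmax(weights):
    """Numerically stable softmax over a list of floats."""
    largest = max(weights)
    exps = [math.exp(w - largest) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


def scalar_mix(tensors, weights, gamma):
    """Return gamma * sum_k softmax(weights)[k] * tensors[k], element-wise.

    `tensors` is a list of equally long flat lists (a stand-in for tensors).
    """
    s = softmax(weights)
    length = len(tensors[0])
    return [
        gamma * sum(s_k * tensor[i] for s_k, tensor in zip(s, tensors))
        for i in range(length)
    ]


# Two "layers" with equal unnormalized weights: each gets softmax weight 0.5.
layers = [[1.0, 2.0], [3.0, 4.0]]
mixed = scalar_mix(layers, weights=[0.0, 0.0], gamma=1.0)
```

With equal weights and \(\gamma = 1\), the mixture reduces to the plain average of the inputs.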
If do_layer_norm=True, then apply layer normalization to each tensor before weighting.
If dropout > 0, then each scalar weight is dropped with probability dropout: its unnormalized weight is set to -inf, so its softmax weight mass becomes 0. The softmax then redistributes the dropped probability mass across all remaining weights.
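A minimal sketch of this weight dropout, again in plain Python rather than `torch`. The function names are hypothetical, and raising an error when every weight is dropped is a choice made for this sketch only; the source does not say how the module handles that case.

```python
import math
import random


def sample_dropped(n, dropout, rng):
    """Decide for each of n weights whether it is dropped (probability `dropout`)."""
    return [rng.random() < dropout for _ in range(n)]


def softmax_with_dropped(weights, dropped):
    """Softmax where dropped weights are replaced by -inf before normalizing."""
    masked = [float("-inf") if d else w for w, d in zip(weights, dropped)]
    largest = max(masked)
    if largest == float("-inf"):
        # Assumption for this sketch: fail loudly instead of producing NaN.
        raise ValueError("all weights were dropped; softmax is undefined")
    # math.exp(-inf) is 0.0, so dropped entries receive zero probability mass.
    exps = [math.exp(w - largest) for w in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Dropping the middle weight moves its mass onto the two surviving weights.
s = softmax_with_dropped([0.0, 0.0, 0.0], dropped=[False, True, False])

# Sampling the drop pattern with a seeded generator keeps runs reproducible.
pattern = sample_dropped(3, dropout=0.5, rng=random.Random(0))
```

Without dropout, each of the three equal weights would receive 1/3 of the mass; with the middle weight dropped, the remaining two receive 1/2 each.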
https://github.com/Hyperparticle/udify
https://gitlab.com/Unbabel/language-technologies/unbabel-comet
forward
Compute a weighted average of the tensors.
The input tensors can be any shape with at least two dimensions, but must all have the same shape.
When do_layer_norm=True, mask is required. If tensors have dimensions (dim_0, ..., dim_{n-1}, dim_n), then mask should have dims (dim_0, ..., dim_{n-1}), as in the typical case with tensors of shape (batch_size, timesteps, dim) and mask of shape (batch_size, timesteps).
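The shape rules above can be expressed as a small check on shape tuples. The helpers `expected_mask_shape` and `check_inputs` are hypothetical and not part of the module; they only restate the documented constraints.

```python
def expected_mask_shape(tensor_shape):
    """Drop the last (feature) dimension of a tensor shape."""
    if len(tensor_shape) < 2:
        raise ValueError("tensors need at least two dimensions")
    return tuple(tensor_shape[:-1])


def check_inputs(tensor_shapes, mask_shape=None, do_layer_norm=False):
    """Validate the documented shape rules; raise ValueError on violation."""
    first = tuple(tensor_shapes[0])
    if any(tuple(shape) != first for shape in tensor_shapes):
        raise ValueError("all tensors must have the same shape")
    required = expected_mask_shape(first)
    if do_layer_norm:
        if mask_shape is None:
            raise ValueError("mask is required when do_layer_norm=True")
        if tuple(mask_shape) != required:
            raise ValueError(f"mask should have shape {required}")
    return required


# Four tensors of shape (batch_size, timesteps, dim) = (8, 50, 768)
# need a mask of shape (batch_size, timesteps) = (8, 50).
shapes = [(8, 50, 768)] * 4
mask_shape = check_inputs(shapes, mask_shape=(8, 50), do_layer_norm=True)
```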