kiwi.modules.common.attention
Attention
Generic Attention Implementation.
Bases: torch.nn.Module
Use query and keys to compute scores (energies)
Apply softmax to get attention probabilities
Perform a dot product between values and probabilities (outputs)
scorer (kiwi.modules.common.Scorer) – a scorer object
dropout (float) – dropout rate after softmax (default: 0.)
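The three steps listed above can be sketched without any dependencies for a single query vector. This is only an illustration: `dot_scorer`, `softmax`, and `attend` are hypothetical names, the plain dot product stands in for whatever the `scorer` object actually computes, and dropout is left out.

```python
import math


def dot_scorer(query, key):
    """Hypothetical scorer: plain dot product between two vectors."""
    return sum(q * k for q, k in zip(query, key))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values=None):
    """Single-query attention following the three documented steps."""
    if values is None:
        values = keys  # keys are treated as values when none are given
    # Step 1: scores (energies) between the query and every key.
    scores = [dot_scorer(query, key) for key in keys]
    # Step 2: softmax turns the scores into attention probabilities.
    probs = softmax(scores)
    # Step 3: probability-weighted sum of the value vectors (the output).
    hidden_size = len(values[0])
    output = [
        sum(p * value[i] for p, value in zip(probs, values))
        for i in range(hidden_size)
    ]
    return output, probs


query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
output, probs = attend(query, keys)
```

Here the first and third keys score equally against the query, so they receive the same probability, and the output has the same length as a value vector.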
forward
Compute the attention between query, keys and values.
query (torch.Tensor) – set of query vectors with shape of (batch_size, …, target_len, hidden_size)
keys (torch.Tensor) – set of key vectors with shape of (batch_size, …, source_len, hidden_size)
values (torch.Tensor, optional) – set of value vectors with shape of (batch_size, …, source_len, hidden_size). If None, keys are treated as values. Default: None
mask (torch.ByteTensor, optional) – Tensor representing valid positions. If None, all positions are considered valid. Shape of (batch_size, target_len)
Returns: attention outputs with shape of (batch_size, …, target_len, hidden_size) and attention probabilities with shape of (batch_size, …, target_len, source_len)
Return type: torch.Tensor
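The shape contract of forward can be checked with a small PyTorch stand-in. This is a sketch under stated assumptions, not the library's code: `sketch_attention` is a hypothetical function, a scaled dot product replaces the real `scorer`, and mask handling is omitted because its exact semantics are not described here.

```python
import torch


def sketch_attention(query, keys, values=None, dropout=0.0):
    """Illustrative stand-in for Attention.forward (no mask handling)."""
    if values is None:
        values = keys  # keys double as values, as documented
    # Assumed scorer: scaled dot product over the hidden dimension.
    scores = torch.matmul(query, keys.transpose(-2, -1)) / keys.size(-1) ** 0.5
    # Softmax over the source positions gives attention probabilities.
    probs = torch.softmax(scores, dim=-1)
    # Dropout after softmax; with the default rate of 0 it is a no-op.
    probs = torch.nn.functional.dropout(probs, p=dropout, training=True)
    # Weighted sum of the values for every target position.
    outputs = torch.matmul(probs, values)
    return outputs, probs


batch_size, target_len, source_len, hidden_size = 2, 3, 5, 4
query = torch.randn(batch_size, target_len, hidden_size)
keys = torch.randn(batch_size, source_len, hidden_size)

outputs, probs = sketch_attention(query, keys)
print(outputs.shape)  # torch.Size([2, 3, 4])
print(probs.shape)    # torch.Size([2, 3, 5])
```

The first result matches the query's shape, and the second holds one probability distribution over source positions per target position.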