;Encoder-decoder attention
The decoder also uses encoder-decoder attention, in which the keys and values are generated from the output embedding of the encoder (i.e., the encoded input sentence) using this layer's own weight matrices, while the queries are generated from the decoder input (i.e., the previously generated output).
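
The flow of queries, keys, and values described above can be sketched in plain Python. This is a minimal single-head illustration, not a reference implementation: the function names, the weight matrices <code>w_q</code>, <code>w_k</code>, <code>w_v</code>, and the toy inputs are hypothetical, and the scaling by the square root of the key dimension is assumed from the usual scaled dot-product formulation. Masking, multiple heads, and the output projection are omitted.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(decoder_input, encoder_output, w_q, w_k, w_v):
    """Single-head encoder-decoder attention (illustrative sketch)."""
    q = matmul(decoder_input, w_q)    # queries come from the decoder side
    k = matmul(encoder_output, w_k)   # keys come from the encoder output
    v = matmul(encoder_output, w_v)   # values come from the encoder output
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)           # one row per decoder position
    # Assumed scaling by sqrt(d_k) before the softmax.
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights

# Toy data (hypothetical): 3 encoder positions, 2 decoder positions, width 2.
encoder_output = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_input = [[1.0, 0.0], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

output, weights = cross_attention(decoder_input, encoder_output,
                                  identity, identity, identity)
print(len(output), len(weights[0]))
```

Each row of <code>weights</code> has one entry per encoder position, so every decoder position forms a weighted mix of the encoder's values. The output therefore has as many rows as the decoder input, regardless of the length of the input sentence.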
===Encoder===