
Machine Learning Glossary: Difference between revisions

Line 3:
==A==
* Attention - A component of [[Transformer_(machine_learning_model)|transformers]] that computes dot products of query and key embeddings to score the interaction between elements.
* Adam optimizer - A popular gradient descent optimizer that combines momentum with per-parameter adaptive learning rates.
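
The Attention and Adam optimizer entries above can be illustrated with a minimal sketch in plain Python. It is not a reference implementation: the function names (<code>softmax</code>, <code>attention</code>, <code>adam_step</code>) are hypothetical, the attention variant shown is the common scaled dot-product form with softmax weighting, and the Adam hyperparameter defaults (<code>lr</code>, <code>beta1</code>, <code>beta2</code>, <code>eps</code>) are the commonly quoted values, assumed here for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors (lists of floats)."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Query-key dot products score how strongly q interacts with each element.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        weights = softmax(scores)
        # Each output is the attention-weighted average of the value vectors.
        outputs.append(
            [
                sum(w * v[j] for w, v in zip(weights, values))
                for j in range(len(values[0]))
            ]
        )
    return outputs


def adam_step(params, grads, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update. t is the 1-based step count. Returns (params, m, v)."""
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g      # momentum: running mean of gradients
        v_i = beta2 * v_i + (1 - beta2) * g * g  # running mean of squared gradients
        m_hat = m_i / (1 - beta1 ** t)           # bias correction for early steps
        v_hat = v_i / (1 - beta2 ** t)
        # Dividing by sqrt(v_hat) gives each parameter its own effective step size.
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v
```

When all keys are identical, every element receives the same attention weight, so the output is the plain average of the values. On the first Adam step the bias-corrected ratio is close to the sign of the gradient, so each parameter moves by roughly <code>lr</code> regardless of the gradient's magnitude.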


==B==