* Large Language Model (LLM) - A neural network, typically a transformer (see GPT), which is trained to understand language.
* Latent code - An input to a neural network which gets optimized during training to represent a particular state or value.
* [https://ai.facebook.com/blog/large-language-model-llama-meta-ai/ LLaMa] - A set of language models published in 2023 by Meta. They are trained on more tokens than comparable models, and the smaller models can run on a single GPU or CPU.
* Loss function - Target function which you are attempting to minimize.
* [[Long short-term memory]] (LSTM) - An RNN neural network architecture which has two sets of hidden states for long and short term.