* Latent code - an input to a neural network which gets optimized during training to represent a particular state or value
* [https://ai.facebook.com/blog/large-language-model-llama-meta-ai/ LLaMA] - a set of language models published in 2023 by Meta. They are trained on more tokens than comparable models, and the smaller ones can run on a single GPU or CPU.
* [[Long short-term memory]] (LSTM) - an RNN architecture whose cells carry two separate states: a cell state for long-term memory and a hidden state for short-term memory.
* Loss function - the objective function that training attempts to minimize.
* [https://arxiv.org/abs/2106.09685 Low-rank Adaptation (LoRA)] - a technique to fine-tune a model which trains low-rank matrices as diffs to the original (frozen) weights; see the sketch after this list.
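The snippet below is a minimal sketch of the LoRA idea, not the paper's reference implementation: a frozen linear layer is augmented with trainable low-rank factors whose product acts as the diff to the original weights. It assumes PyTorch, and the <code>LoRALinear</code> name and the <code>rank</code>/<code>alpha</code> values are illustrative choices.

<syntaxhighlight lang="python">
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank diff: W x + (B A) x."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        # Freeze the original weights; only the low-rank factors are trained.
        for p in self.base.parameters():
            p.requires_grad = False
        # A: (rank, in_features), B: (out_features, rank). B starts at zero,
        # so training begins from the unmodified base model.
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base output plus the scaled low-rank update: x @ (B A)^T.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# Usage: wrap an existing layer and train only A and B.
layer = LoRALinear(nn.Linear(768, 768), rank=8)
out = layer(torch.randn(2, 768))  # shape: (2, 768)
</syntaxhighlight>

Because only the two small factors are trained, the number of trainable parameters drops from <code>in_features * out_features</code> to <code>rank * (in_features + out_features)</code>, and the diff can be merged back into the base weights after fine-tuning.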
==M==