Machine Learning Glossary
==C==
* [[Capsule neural network]] - a niche type of neural network whose neurons output vectors instead of scalars.
* [[Convolutional neural network]] or CNN - A neural network architecture for image data, or other data on a regular grid.
* Cross-entropy - A loss function used for classification with categorical data; a minimal sketch follows this list.
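A minimal sketch of cross-entropy for a single example, using NumPy; the function name and the integer-label convention are illustrative, not taken from any particular library.
<syntaxhighlight lang="python">
import numpy as np

def cross_entropy(logits, label):
    """Cross-entropy between a softmax over `logits` and an integer class `label`."""
    # Subtract the max for numerical stability before exponentiating.
    shifted = logits - np.max(logits)
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    # The loss is the negative log-probability assigned to the true class.
    return -log_probs[label]

# Example: three classes, true class is 2.
print(cross_entropy(np.array([0.5, 1.0, 3.0]), 2))
</syntaxhighlight>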
==F==
* Forward propagation - Inference through a neural network by computing each layer's outputs in turn; see the sketch after this list.
* Fréchet Inception Distance (FID) - a reference-based GAN evaluation metric which passes images through a pretrained network (typically Inception) and compares the distribution of intermediate features.
* Fully connected network - The standard neural network model where every node in one layer is connected to every node in the next.
* Features - a generic term for the latent inputs or intermediate outputs of a neural network (2D = feature map, 3D = feature grid).
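As an illustration of forward propagation through a small fully connected network, here is a minimal NumPy sketch; the layer sizes, ReLU activation, and random weights are assumptions made for the example.
<syntaxhighlight lang="python">
import numpy as np

def forward(x, layers):
    """Forward propagation: compute each layer's output in turn.

    `layers` is a list of (W, b) pairs; every node in one layer feeds every
    node in the next, which is what makes the network fully connected.
    """
    for i, (W, b) in enumerate(layers):
        x = W @ x + b
        if i < len(layers) - 1:   # ReLU on hidden layers only
            x = np.maximum(x, 0.0)
    return x

rng = np.random.default_rng(0)
layers = [(rng.normal(size=(8, 4)), np.zeros(8)),   # 4 inputs -> 8 hidden
          (rng.normal(size=(2, 8)), np.zeros(2))]   # 8 hidden -> 2 outputs
print(forward(rng.normal(size=4), layers))
</syntaxhighlight>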
==G==
* Generalization - How well a model works on data it has not been trained on.
* [[Generative adversarial network]] (GAN) - A setup in which a generator network and a discriminator network are trained adversarially to generate examples from a training distribution.
* Generative Pretrained Transformer (GPT) - A large decoder-only transformer trained on next-word prediction. GPT-2 and GPT-3 refer to specific models from OpenAI.
* Gradient Descent - The operation used to update parameters when optimizing a neural network, stepping along the negative gradient, i.e. the direction of steepest descent; a minimal sketch follows this list.
* [[Graph neural network]] (GNN) - A type of neural network which operates on graph inputs.
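A minimal gradient-descent sketch on a toy quadratic loss; the loss, learning rate, and iteration count are illustrative assumptions.
<syntaxhighlight lang="python">
import numpy as np

# Toy loss: L(w) = ||w - target||^2, with gradient 2 * (w - target).
target = np.array([3.0, -1.0])
w = np.zeros(2)   # parameters to optimize
lr = 0.1          # learning rate (step size)

for step in range(100):
    grad = 2.0 * (w - target)   # gradient of the loss w.r.t. the parameters
    w -= lr * grad              # step along the negative gradient (steepest descent)

print(w)  # approaches [3, -1]
</syntaxhighlight>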
==L==
* L1, L2, Lp norm - L1 and L2 are two common norms used for computing accuracy or losses; Lp is the general form where the exponent is ''p''.
* Large Language Model (LLM) - A neural network, typically a transformer (see GPT), which is trained to understand language.
* Latent code - an input to a neural network which gets optimized during training to represent a particular state or value.
* [https://ai.facebook.com/blog/large-language-model-llama-meta-ai/ LLaMa] - a set of language models published in 2023 by Meta. They are trained on more tokens than earlier models of comparable size, and some of the smaller models can run on a single GPU or CPU.
* [[Long short-term memory]] (LSTM) - A recurrent neural network (RNN) architecture which maintains two sets of hidden states for long-term and short-term memory.
* Loss function - Target function which you are attempting to minimize.
* [https://arxiv.org/abs/2106.09685 Low-rank Adaptation (LoRA)] - A technique for fine-tuning a model which trains low-rank matrices as diffs to the original weights; a sketch of the idea follows this list.
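A rough sketch of the low-rank-diff idea for a single weight matrix; the rank, scaling, initialization, and variable names are illustrative and not necessarily the paper's exact parameterization.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, rank = 64, 64, 4

W = rng.normal(size=(d_out, d_in))        # frozen pretrained weight
A = rng.normal(size=(rank, d_in)) * 0.01  # trainable low-rank factor
B = np.zeros((d_out, rank))               # trainable, zero-initialized so the diff starts at zero

def adapted_forward(x, scale=1.0):
    # The original weight is untouched; only the low-rank diff B @ A is trained.
    return (W + scale * (B @ A)) @ x

print(adapted_forward(rng.normal(size=d_in)).shape)
</syntaxhighlight>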
==M==
==N==
* Neurons - Individual elements in an MLP layer (perceptron + activation) which supposedly resemble brain neurons.
* Neural Fields - A subfield of computer vision and graphics which uses neural networks to represent 2D/3D scenes and perform tasks such as 3D reconstruction, scene generation, and image compression.
* Normalized Device Coordinates - In images, pixel locations are expressed in coordinates of <math>[-1, 1]\times[-1, 1]</math>; a conversion sketch follows this list.
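One common way to map pixel indices to normalized device coordinates, assuming pixel centers and the <math>[-1, 1]</math> convention above; other conventions flip signs or drop the half-pixel offset.
<syntaxhighlight lang="python">
def pixel_to_ndc(i, j, height, width):
    """Map the center of pixel (row i, column j) into [-1, 1] x [-1, 1]."""
    x = 2.0 * (j + 0.5) / width - 1.0
    y = 2.0 * (i + 0.5) / height - 1.0
    return x, y

print(pixel_to_ndc(0, 0, 4, 4))  # top-left pixel center -> (-0.75, -0.75)
print(pixel_to_ndc(3, 3, 4, 4))  # bottom-right pixel center -> (0.75, 0.75)
</syntaxhighlight>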
==P==
* Perceptron - a linear classifier.
* Perceptual loss - a loss function which passes images through a pretrained network (e.g. VGG) and compares intermediate features instead of raw pixels.
* Positional encoding - Applying sin/cos at various frequencies (i.e. a Fourier basis) so the network can distinguish input values at different scales. Used in neural fields as well as NLP models to encode the relative positions of inputs; a sketch follows this list.
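A sketch of a sin/cos (Fourier-basis) positional encoding for scalar inputs; the number of frequencies and the power-of-two spacing are illustrative assumptions.
<syntaxhighlight lang="python">
import numpy as np

def positional_encoding(x, num_freqs=6):
    """Encode scalar(s) `x` with sin/cos at geometrically spaced frequencies."""
    x = np.atleast_1d(x)
    freqs = 2.0 ** np.arange(num_freqs) * np.pi   # pi, 2*pi, 4*pi, ...
    angles = x[..., None] * freqs                  # shape (..., num_freqs)
    # Concatenating sin and cos lets the network separate nearby inputs at many scales.
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

print(positional_encoding(0.25).shape)  # (1, 12): 6 sines + 6 cosines
</syntaxhighlight>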
==R==