Diffusion Models
OpenAI has unveiled two text-to-image models, [https://github.com/openai/glide-text2im GLIDE] and [https://openai.com/dall-e-2/ DALL-E 2], both of which rely on diffusion models to generate images.<br>
GLIDE has some open-source code that allows you to test a small version of the model.
At a high level, GLIDE is a diffusion model that is conditioned on text embeddings and trained with a technique called classifier-free guidance.
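As a rough sketch (not GLIDE's actual code) of how classifier-free guidance works at sampling time: the network produces both a conditional and an unconditional noise prediction, and the sampler extrapolates between them. The model signature and guidance scale below are illustrative assumptions.

<syntaxhighlight lang="python">
import torch

def guided_eps(model, x_t, t, text_emb, guidance_scale=3.0):
    """Classifier-free guidance: blend conditional and unconditional predictions.

    Assumes model(x_t, t, cond) predicts the noise in x_t at timestep t,
    and that cond=None yields the unconditional prediction (the condition
    is randomly dropped during training so one network learns both).
    """
    eps_cond = model(x_t, t, text_emb)    # noise prediction with the text condition
    eps_uncond = model(x_t, t, None)      # noise prediction with the condition dropped
    # Move away from the unconditional prediction toward the conditional one;
    # guidance_scale > 1 trades sample diversity for stronger text adherence.
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Illustrative usage with a dummy stand-in for the denoising network:
model = lambda x, t, c: torch.zeros_like(x) if c is None else 0.1 * x
x_t = torch.randn(1, 3, 64, 64)
eps = guided_eps(model, x_t, t=torch.tensor([500]), text_emb="a corgi")
</syntaxhighlight>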
DALL-E 2 adds a ''prior'' model that first converts a text embedding to a CLIP image embedding.
The diffusion ''decoder'' then generates an image from that image embedding.
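A minimal sketch of that two-stage pipeline; <code>clip_text_encoder</code>, <code>prior</code>, and <code>decoder</code> are hypothetical stand-ins for the trained models, not the real DALL-E 2 API:

<syntaxhighlight lang="python">
def generate_image(prompt, clip_text_encoder, prior, decoder):
    text_emb = clip_text_encoder(prompt)   # embed the prompt with CLIP's text encoder
    image_emb = prior(text_emb)            # prior: text embedding -> CLIP image embedding
    return decoder(image_emb)              # diffusion decoder: image embedding -> image
</syntaxhighlight>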
==Resources==
* [https://ai.googleblog.com/2021/07/high-fidelity-image-generation-using.html Google AI Blog High Fidelity Image Generation Using Diffusion Models] - discusses SR3 and CDM