Diffusion Models: Difference between revisions


Line 38:

DALL-E 2 adds a ''prior'' model which first converts a text embedding to a CLIP image embedding.

Then the diffusion ''decoder'' generates an image based on the image embedding.

==Guided Diffusion==

Guidance steers the diffusion sampling process towards the conditioning input, e.g. a text prompt.<br>

There are two types of guidance: classifier guidance and classifier-free guidance.<br>

See [https://benanne.github.io/2022/05/26/guidance.html Guidance: a cheat code for diffusion models].

Classifier guidance uses the gradient of an image classifier (or a similar model, e.g. CLIP) with respect to the noisy image to nudge each denoising step towards the desired class.<br>
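As a minimal sketch of the classifier guidance idea, assuming the common formulation in which the mean of each denoising step is shifted by the scaled gradient of the classifier's log-probability: the function names, the scalar variance, and the toy quadratic "classifier" below are hypothetical stand-ins for illustration, not part of any library.

```python
from typing import List, Sequence


def toy_grad_log_prob(x: Sequence[float], class_center: Sequence[float]) -> List[float]:
    """Gradient w.r.t. x of a toy log p(y | x) = -0.5 * ||x - class_center||^2.

    A real system would backpropagate through a classifier trained on noisy images.
    """
    return [c - xi for xi, c in zip(x, class_center)]


def classifier_guided_mean(
    mean: Sequence[float],
    variance: float,
    grad_log_prob: Sequence[float],
    scale: float,
) -> List[float]:
    """Shift the denoising mean towards the class: mean + scale * variance * gradient."""
    return [m + scale * variance * g for m, g in zip(mean, grad_log_prob)]


# Toy usage: the unguided mean sits at the origin, the class "lives" at (2, -2).
mean = [0.0, 0.0]
grad = toy_grad_log_prob(mean, class_center=[2.0, -2.0])
print(classifier_guided_mean(mean, variance=0.5, grad_log_prob=grad, scale=1.0))
# prints [1.0, -1.0]
```

A larger <code>scale</code> pushes samples harder towards the class, at the cost of diversity; with <code>scale=0</code> the mean is left unchanged.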

Classifier-free guidance<ref name="ho2021classifierfree"/> runs the diffusion model twice per step, predicting the noise with and without the class input, and then extrapolates away from the unconditional prediction towards the conditional one.
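The classifier-free combination can be sketched as follows. This assumes the common parameterization <code>eps = eps_uncond + scale * (eps_cond - eps_uncond)</code>, where <code>scale = 1</code> recovers the plain conditional prediction and <code>scale > 1</code> extrapolates past it. The model signature and <code>toy_model</code> are hypothetical stand-ins for a trained network.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical signature: (noisy sample, timestep, condition or None) -> predicted noise.
NoiseModel = Callable[[Sequence[float], int, Optional[str]], List[float]]


def guided_noise(
    model: NoiseModel,
    x_t: Sequence[float],
    t: int,
    cond: str,
    scale: float,
) -> List[float]:
    """Combine conditional and unconditional noise predictions."""
    eps_cond = model(x_t, t, cond)    # prediction with the condition
    eps_uncond = model(x_t, t, None)  # prediction with the condition dropped
    # Move away from the unconditional prediction, towards the conditional one.
    return [u + scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]


def toy_model(x_t: Sequence[float], t: int, cond: Optional[str]) -> List[float]:
    """Stand-in for a network: the condition simply adds a constant offset."""
    shift = 1.0 if cond is not None else 0.0
    return [0.5 * x + shift for x in x_t]


print(guided_noise(toy_model, [0.0, 2.0], t=10, cond="a cat", scale=3.0))
# prints [3.0, 4.0]
```

Because the same network produces both predictions, no separate classifier is needed; the price is two forward passes per sampling step.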

==Inversion==

Line 46:

Line 54:

==Resources==

* [https://ai.googleblog.com/2021/07/high-fidelity-image-generation-using.html Google AI Blog High Fidelity Image Generation Using Diffusion Models] - discusses SR3 and CDM

* https://theaisummer.com/diffusion-models/

==References==

{{reflist|refs=

<ref name="ho2021classifierfree">Ho, J., & Salimans, T. (2022). Classifier-Free Diffusion Guidance. doi:10.48550/ARXIV.2207.12598 https://arxiv.org/abs/2207.12598</ref>

}}
