* Unpaired image-to-image translation
* Cycle-consistency loss
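A minimal sketch of the cycle-consistency term, assuming two hypothetical generators <code>G</code> (X → Y) and <code>F_gen</code> (Y → X); names and the weight <code>lam</code> are placeholders, not the paper's code:
<syntaxhighlight lang="python">
import torch.nn.functional as F

def cycle_consistency_loss(G, F_gen, real_x, real_y, lam=10.0):
    """L1 cycle loss: translating to the other domain and back
    should reconstruct the original image."""
    rec_x = F_gen(G(real_x))  # forward cycle: x -> G(x) -> F(G(x))
    rec_y = G(F_gen(real_y))  # backward cycle: y -> F(y) -> G(F(y))
    return lam * (F.l1_loss(rec_x, real_x) + F.l1_loss(rec_y, real_y))
</syntaxhighlight>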
;BicycleGAN
* First learns to reconstruct images with a latent-code representation in between (the cVAE-GAN step).
* The main difference is that the mapping between the two domains is multimodal: a single input image can map to many plausible outputs.
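A rough sketch of the cVAE-GAN reconstruction step, with a hypothetical encoder <code>E</code> and generator <code>G</code> (names and shapes are assumptions):
<syntaxhighlight lang="python">
import torch

def cvae_gan_losses(E, G, real_a, real_b):
    """Encode the ground-truth output B to a latent code, then ask G to
    reconstruct B from input A and that code (cVAE-GAN step)."""
    mu, logvar = E(real_b)                                    # posterior Q(z|B)
    z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterize
    fake_b = G(real_a, z)                                     # reconstruction of B
    recon = (fake_b - real_b).abs().mean()                    # L1 reconstruction loss
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())  # pull Q toward N(0, I)
    return recon, kl
</syntaxhighlight>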
;MUNIT
* Multimodal UNsupervised Image-to-image Translation
* Maps images from each domain into a shared content space and a domain-specific style space.
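A toy sketch of MUNIT-style translation with hypothetical encoder/decoder modules (the full model also uses adversarial and reconstruction losses):
<syntaxhighlight lang="python">
def translate_a_to_b(enc_content_a, enc_style_b, dec_b, img_a, img_b):
    """Combine the content of img_a (shared content space) with the
    style of img_b (domain-B style space) to translate A -> B."""
    c = enc_content_a(img_a)   # content code, shared across domains
    s = enc_style_b(img_b)     # style code, specific to domain B
    return dec_b(c, s)         # decoded image: A's content in B's style
</syntaxhighlight>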
===Training Problems with GANs===
* Instability
** Difficult to keep the generator and discriminator in sync.
** The discriminator cannot be too strong or too weak; the same holds for the generator.
** Tricks: learning-rate scheduling, keeping the discriminator small, updating the generator more frequently than the discriminator (see the training-loop sketch after this list).
* Mode collapse
Mode collapse happens when the generator captures only a few modes of the data distribution, producing samples with little diversity.
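A minimal sketch of the unbalanced update schedule mentioned above; <code>d_step</code>, <code>g_step</code>, the optimizers, and the 2:1 ratio are assumed placeholders:
<syntaxhighlight lang="python">
G_STEPS = 2  # assumed ratio: generator updates per discriminator update

for real_batch in data_loader:
    # One discriminator update.
    d_loss = d_step(real_batch)
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()
    # Several generator updates to keep G from falling behind D.
    for _ in range(G_STEPS):
        g_loss = g_step()
        g_opt.zero_grad(); g_loss.backward(); g_opt.step()
</syntaxhighlight>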
;DCGAN architecture guidelines
* Use strided convolutions instead of pooling in the discriminator (and fractionally-strided convolutions in the generator).
* Use batch norm in both the generator and the discriminator.
* Remove fully-connected hidden layers.
* Use ReLU for hidden layers and tanh for the output layer of the generator.
* Use LeakyReLU in the discriminator.
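A minimal PyTorch generator following these guidelines (all-convolutional, batch norm, ReLU hidden layers, tanh output); the layer sizes are assumptions for a 32x32 RGB output:
<syntaxhighlight lang="python">
import torch.nn as nn

class DCGANGenerator(nn.Module):
    """No pooling or FC hidden layers: fractionally-strided (transposed)
    convolutions upsample z to a 32x32 RGB image."""
    def __init__(self, z_dim=100, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, ch * 4, 4, 1, 0), nn.BatchNorm2d(ch * 4), nn.ReLU(True),   # 4x4
            nn.ConvTranspose2d(ch * 4, ch * 2, 4, 2, 1), nn.BatchNorm2d(ch * 2), nn.ReLU(True),  # 8x8
            nn.ConvTranspose2d(ch * 2, ch, 4, 2, 1), nn.BatchNorm2d(ch), nn.ReLU(True),          # 16x16
            nn.ConvTranspose2d(ch, 3, 4, 2, 1), nn.Tanh(),                                       # 32x32
        )

    def forward(self, z):  # z: (batch, z_dim)
        return self.net(z.view(z.size(0), -1, 1, 1))
</syntaxhighlight>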
LSGAN and WGAN modify the GAN objective to mitigate mode collapse and stabilize training.
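For example, LSGAN replaces the cross-entropy objective with a least-squares loss; a sketch, assuming the usual 1/0 target convention:
<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def lsgan_d_loss(d_real, d_fake):
    """Least-squares discriminator loss: push D(real) toward 1, D(fake) toward 0."""
    return 0.5 * (F.mse_loss(d_real, torch.ones_like(d_real))
                  + F.mse_loss(d_fake, torch.zeros_like(d_fake)))

def lsgan_g_loss(d_fake):
    """Generator tries to make D score fakes as real (target 1)."""
    return 0.5 * F.mse_loss(d_fake, torch.ones_like(d_fake))
</syntaxhighlight>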
===Evaluation of GANs===
* Turing test (user study)
* Inception score
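The Inception score is <math>\exp(E_x[D_{KL}(p(y|x) \Vert p(y))])</math>, computed from the class probabilities an Inception network assigns to generated images; a NumPy sketch:
<syntaxhighlight lang="python">
import numpy as np

def inception_score(probs, eps=1e-12):
    """probs: (N, num_classes) array of p(y|x) for N generated images.
    A high score needs confident per-image predictions (quality) and a
    broad marginal p(y) (diversity)."""
    p_y = probs.mean(axis=0, keepdims=True)                         # marginal p(y)
    kl = (probs * (np.log(probs + eps) - np.log(p_y + eps))).sum(axis=1)
    return float(np.exp(kl.mean()))
</syntaxhighlight>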
===Variational Auto-encoders (VAEs)===
;Training a VAE
* Data likelihood: <math>P(x) = \int P(x|z) P(z) dz</math>
* Approximate with samples of <math>z</math> drawn from the prior during training: <math>P(x) \approx \frac{1}{n} \sum_{i=1}^{n} P(x | z_i)</math>
* This is impractical: for most <math>z</math> sampled from the prior, <math>P(x|z)</math> is nearly zero, so the estimate converges extremely slowly (see the sketch below).
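A sketch of the naive estimator and why it fails; <code>log_p_x_given_z</code> is a hypothetical decoder log-likelihood, and the sizes are assumptions:
<syntaxhighlight lang="python">
import torch

def naive_likelihood_estimate(log_p_x_given_z, x, n=1000, z_dim=20):
    """Monte Carlo estimate P(x) ~ (1/n) sum_i P(x|z_i) with z_i ~ N(0, I).
    For high-dimensional x, almost every z_i gives P(x|z_i) ~ 0, so the
    estimate is dominated by rare lucky samples and n must be enormous."""
    z = torch.randn(n, z_dim)                  # z_i ~ P(z)
    return log_p_x_given_z(x, z).exp().mean()  # average of P(x|z_i)
</syntaxhighlight>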
Instead, assume we can learn a distribution <math>Q(z|x)</math> that concentrates on latent codes <math>z</math> likely to have produced <math>x</math>, i.e. where <math>P(x|z)</math> is non-negligible.
How do we relate <math>P(x)</math> to <math>E_{z \sim Q(z|x)}[\log P(x|z)]</math>?
<math>
\begin{aligned}
D_{KL}[Q(z|x) \Vert P(z|x)] &= E_{z \sim Q}[\log Q(z|x) - \log P(z|x)]\\
&= E_{z \sim Q}[\log Q(z|x) - \log P(z)] - E_{z \sim Q}[\log P(x|z)] + \log P(x)\\
&= D_{KL}[Q(z|x) \Vert P(z)] - E_{z \sim Q}[\log P(x|z)] + \log P(x)
\end{aligned}
</math>
where the second line uses Bayes' rule, <math>P(z|x) = P(x|z)P(z)/P(x)</math>.
Rearranging, we get the evidence lower bound (ELBO):
<math>
\begin{aligned}
\log P(x) - D_{KL}[Q(z|x) \Vert P(z|x)] &= E_{z \sim Q}[\log P(x|z)] - D_{KL}[Q(z|x) \Vert P(z)]\\
\implies \log P(x) &\geq E_{z \sim Q}[\log P(x|z)] - D_{KL}[Q(z|x) \Vert P(z)]
\end{aligned}
</math>
since <math>D_{KL}[Q(z|x) \Vert P(z|x)] \geq 0</math>.
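Maximizing this lower bound is the VAE training objective. A minimal sketch of the corresponding loss, assuming a Gaussian <math>Q(z|x)</math>, a standard normal prior, and a Bernoulli decoder:
<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def negative_elbo(x, x_hat, mu, logvar):
    """-E_{z~Q}[log P(x|z)] + D_KL[Q(z|x) || P(z)], summed over the batch.
    x_hat: decoder output in (0, 1); mu, logvar parameterize Q(z|x)."""
    recon = F.binary_cross_entropy(x_hat, x, reduction='sum')     # reconstruction term
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())  # closed-form Gaussian KL
    return recon + kl
</syntaxhighlight>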
==Will be on the exam==