===Pixel-RNN/CNN===
* Fully-visible belief network
* Explicit density model:
** Each pixel depends on all previous pixels
** <math>P_{\theta}(x) = \prod_{i=1}^{n} P_{\theta}(x_i | x_1, ..., x_{i-1})</math>
** You need to define an ordering of ''previous pixels'' (e.g. all pixels above and to the left)
* Train by maximizing the likelihood of the training data
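The chain-rule factorization above can be sketched numerically. This is a minimal illustration, not the actual network: <code>cond_prob</code> is a toy stand-in for the model's learned per-pixel conditional, and all names are made up for this example.

```python
import numpy as np

def cond_prob(pixel, previous):
    """Toy conditional P(x_i | x_1..x_{i-1}): uniform over 4 intensity
    levels, ignoring the context (a real model would condition on it)."""
    return 0.25

def log_likelihood(image_flat):
    """log P(x) = sum_i log P(x_i | x_1, ..., x_{i-1}),
    accumulated pixel by pixel in raster-scan order."""
    total = 0.0
    for i, pixel in enumerate(image_flat):
        total += np.log(cond_prob(pixel, image_flat[:i]))
    return total

x = [0, 1, 2, 3]        # a tiny 2x2 "image", flattened row-major
ll = log_likelihood(x)  # 4 * log(0.25)
```

Training maximizes this quantity (equivalently, minimizes the negative log-likelihood) over the training images.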
;Pros:
* Sequence generation is slow
* Optimizing <math>P_{\theta}(x)</math> is hard.
Types of ''previous pixels'' connections:
* PixelCNN uses masked convolutions; bounded receptive field (fastest)
* Row LSTM has a triangular receptive field (slow)
* Diagonal LSTM
* Diagonal BiLSTM has a full dependency field (slowest)
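The PixelCNN variant enforces the pixel ordering by masking its convolution kernels. Here is a small sketch of a type-A causal mask (the kind used in the first layer, which also hides the current pixel); the function name is illustrative:

```python
import numpy as np

def causal_mask(k):
    """Type-A PixelCNN mask for a k x k kernel: keep only weights that
    look at pixels above, or to the left in the same row."""
    mask = np.ones((k, k))
    c = k // 2
    mask[c, c + 1:] = 0   # zero out pixels right of centre in the same row
    mask[c + 1:, :] = 0   # zero out all rows below the centre
    mask[c, c] = 0        # type A: exclude the current pixel itself
    return mask

m = causal_mask(3)
# Only the pixels above and to the left of the centre remain:
# [[1. 1. 1.]
#  [1. 0. 0.]
#  [0. 0. 0.]]
```

Multiplying each kernel elementwise by this mask before convolving guarantees pixel <math>i</math> never sees pixels that come after it in the raster-scan order.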
;Multi-scale PixelRNN
* Takes subsampled pixels as additional input
* Can capture global information better
* Slightly better results
===Generative Adversarial Networks (GANs)===
* Generator generates images from noise <math>z</math>
* Discriminator classifies images as real or fake
* Loss: <math>\min_{G} \max_{D} E_x[\log D(x)] + E_z[\log(1-D(G(z)))]</math>
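The minimax objective can be evaluated numerically on toy discriminator scores. This is a sketch of the value being fought over, not a training loop; the function name and batches are illustrative, and <math>D</math> is assumed to output a probability that its input is real:

```python
import numpy as np

def gan_value(d_real, d_fake):
    """E_x[log D(x)] + E_z[log(1 - D(G(z)))] estimated over toy batches.
    D wants to maximize this; G wants to minimize it."""
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# A fully fooled discriminator outputs 0.5 everywhere, giving 2*log(0.5);
# a perfect discriminator (D(x)->1, D(G(z))->0) drives the value toward 0.
v = gan_value(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
```

In practice the generator is usually trained to maximize <math>\log D(G(z))</math> instead of minimizing <math>\log(1-D(G(z)))</math>, which gives stronger gradients early in training.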
;Image-to-image Conditional GANs
* Add an image encoder which outputs z
;pix2pix
* Adds an L1 reconstruction loss to the GAN loss
* U-Net generator
* PatchGAN discriminator
** PatchGAN outputs an N&times;N grid of real/fake predictions, one per patch (i.e. limited receptive field)
* Requires paired training samples
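The generator objective with the added L1 term can be sketched as below. The weighting of 100 follows the pix2pix paper's default; the arrays, toy discriminator scores, and function name are illustrative:

```python
import numpy as np

LAMBDA = 100.0  # weight on the L1 term (pix2pix default)

def pix2pix_g_loss(d_fake, generated, target):
    """Adversarial term (fool the discriminator on generated images)
    plus a weighted L1 term (stay pixel-wise close to the paired target)."""
    adv = -np.mean(np.log(d_fake))
    l1 = np.mean(np.abs(generated - target))
    return adv + LAMBDA * l1

g = np.zeros((4, 4))   # toy generated image
t = np.zeros((4, 4))   # toy paired target
loss = pix2pix_g_loss(np.array([1.0]), g, t)  # perfect fool + exact match
```

The L1 term is what forces the output to match the specific paired target, which is why pix2pix needs paired samples.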
;CycleGAN
* Unpaired image-to-image translation
* Cycle-consistency loss
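The cycle-consistency idea: with two generators <math>G: X \to Y</math> and <math>F: Y \to X</math>, translating an image to the other domain and back should recover it. A minimal sketch (the toy identity maps and names here are illustrative, not the real networks):

```python
import numpy as np

def cycle_loss(G, F, x, y):
    """L1 penalty on round trips: F(G(x)) should equal x, and
    G(F(y)) should equal y. No paired (x, y) samples are needed."""
    return np.mean(np.abs(F(G(x)) - x)) + np.mean(np.abs(G(F(y)) - y))

identity = lambda a: a  # toy stand-in for trained generators
x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])
loss = cycle_loss(identity, identity, x, y)  # perfect cycles -> 0.0
```

Because the loss only compares each image to its own reconstruction, CycleGAN can train on unpaired collections from the two domains.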
==Will be on the exam==