Generative adversarial network
Resources
- Tricks for Training GANs: https://github.com/soumith/ganhacks
Revision as of 18:55, 15 January 2020
GANs (generative adversarial networks) were introduced by Ian Goodfellow et al. in 2014.
Goal: learn to generate examples from the same distribution as the training set.
Basic Structure
GANs consist of a generator and a discriminator.
For each training iteration i:
    For each inner iteration j: update the discriminator
    Update the generator
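The alternating loop above can be sketched on a toy 1-D problem. The linear generator, logistic discriminator, data distribution N(3, 1), and all hyperparameters below are illustrative assumptions, not from the article:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

# Toy setup: real data ~ N(3, 1); linear generator G(z) = wg*z + bg;
# logistic discriminator D(x) = sigmoid(wd*x + bd).
wg, bg = 1.0, 0.0      # generator parameters
wd, bd = 0.1, 0.0      # discriminator parameters
lr, batch = 0.03, 64

for step in range(3000):
    # ---- update discriminator: ascend log D(x) + log(1 - D(G(z))) ----
    x = rng.normal(3.0, 1.0, batch)           # real samples
    z = rng.normal(0.0, 1.0, batch)
    fake = wg * z + bg                        # generated samples
    s_real = sigmoid(wd * x + bd)
    s_fake = sigmoid(wd * fake + bd)
    # gradient of the cross-entropy loss w.r.t. the logits is (s - target)
    g_real, g_fake = s_real - 1.0, s_fake - 0.0
    wd -= lr * np.mean(g_real * x + g_fake * fake)
    bd -= lr * np.mean(g_real + g_fake)

    # ---- update generator: non-saturating loss, ascend log D(G(z)) ----
    z = rng.normal(0.0, 1.0, batch)
    fake = wg * z + bg
    s_fake = sigmoid(wd * fake + bd)
    g_logit = s_fake - 1.0                    # target 1: "fool D"
    g_x = g_logit * wd                        # backprop through D's logit
    wg -= lr * np.mean(g_x * z)
    bg -= lr * np.mean(g_x)

# After training, the generator's mean output drifts toward the data mean.
z = rng.normal(0.0, 1.0, 1000)
fake_mean = float(np.mean(wg * z + bg))
```

Note the generator uses the non-saturating variant (maximize log D(G(z))) rather than minimizing log(1 - D(G(z))), a common practical choice.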
Variations
Conditional GAN
Paper
Feed conditioning data y (e.g. a class label) to both the generator and the discriminator
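A minimal sketch of the conditioning, assuming one-hot class labels concatenated onto each network's input; the dimensions are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, z_dim, x_dim, batch = 10, 32, 784, 8

# Conditioning data y as one-hot class labels.
y = np.eye(n_classes)[rng.integers(0, n_classes, batch)]

# Generator input: noise z concatenated with y.
z = rng.normal(size=(batch, z_dim))
g_input = np.concatenate([z, y], axis=1)    # shape (batch, z_dim + n_classes)

# Discriminator input: data x concatenated with the same y.
x = rng.normal(size=(batch, x_dim))         # stand-in for real images
d_input = np.concatenate([x, y], axis=1)    # shape (batch, x_dim + n_classes)
```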
Wasserstein GAN
Paper
Medium post
The WGAN loss improves the stability of training.
Normally, the discriminator is trained with a sigmoid cross-entropy loss.
WGAN instead uses the Wasserstein distance, implemented by removing the sigmoid and cross-entropy
and clipping (clamping) the discriminator's weights to a range \(\displaystyle [-c, c]\).
However, weight clipping limits the capacity of the critic.
Instead of clipping, WGAN-GP adds a gradient penalty to enforce the 1-Lipschitz constraint.
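A sketch of the weight-clipping step, using a deliberately simple linear critic f(x) = w @ x; the dimensions, data distributions, and hyperparameters are illustrative. WGAN-GP would drop the `np.clip` line and instead add a penalty on the critic's input gradient to the loss:

```python
import numpy as np

rng = np.random.default_rng(0)
c, lr = 0.01, 5e-3                 # clipping range [-c, c], learning rate
w = rng.normal(size=3) * 0.01      # weights of a linear critic f(x) = w @ x

for _ in range(200):
    real = rng.normal(1.0, 0.5, size=(64, 3))
    fake = rng.normal(0.0, 0.5, size=(64, 3))
    # Wasserstein estimate E[f(real)] - E[f(fake)]: no sigmoid, no log.
    # Gradient ascent on it with respect to the critic weights w:
    grad = real.mean(axis=0) - fake.mean(axis=0)
    w += lr * grad
    # WGAN: clamp the critic weights to [-c, c] after every update,
    # crudely bounding the critic's Lipschitz constant.
    w = np.clip(w, -c, c)
```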
Applications
CycleGan
InfoGAN
SinGAN
Paper
Website
Github Official PyTorch Implementation
SinGAN: Learning a Generative Model from a Single Natural Image
MoCoGAN
Paper
MoCoGAN: Decomposing Motion and Content for Video Generation
Video Prediction
- Dual Motion GAN (Liang et al. 2017)
- Uses a frame generator and a motion generator
- Combines the outputs of both generators using a fusing layer
- Trained with a frame discriminator and a motion discriminator (each generator is trained against both discriminators)
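One plausible form of the fusing step is a per-pixel blend of the two generators' outputs; the soft mask `m` below is random for illustration (in the paper the fusion is learned), so this is an assumption about the mechanism, not the paper's exact layer:

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, C = 4, 4, 3

frame_pred = rng.random((H, W, C))   # output of the frame generator
motion_pred = rng.random((H, W, C))  # output of the motion generator

# Hypothetical fusing layer: a per-pixel soft mask m in [0, 1] blends
# the two predictions into the final frame.
m = rng.random((H, W, 1))
fused = m * frame_pred + (1.0 - m) * motion_pred
```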
Image and Video Compression
- Video compression via colorization
- Transmit only the luminance channel (Y of YUV) and colorize it with a GAN at the decoder
- The paper claims a 72% BD-rate (BDBR) reduction compared to HM 16.0
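Extracting the transmitted luminance channel can be sketched with the BT.601 luma weights; the exact color matrix the paper uses is an assumption, as is RGB input in [0, 1]:

```python
import numpy as np

rng = np.random.default_rng(0)
rgb = rng.random((4, 4, 3))   # stand-in for an RGB frame

# BT.601 luma: the Y channel of YUV. Only this channel is transmitted;
# the chroma is re-synthesized at the decoder by the colorization GAN.
y = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
```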
Important Papers
- GAN Dissection: Visualizing and Understanding Generative Adversarial Networks (https://arxiv.org/abs/1811.10597)
- Each "unit" (channel) of the intermediate representations corresponds to a feature such as windows or trees
- Dissection: identify which units correspond to which features by visualizing each channel as a heatmap, thresholding it into a binary mask, and computing the IoU between the mask and the segmented feature in the generated picture
- Intervention: zeroing out channels removes windows or trees from the generated image; alternatively, windows or trees can be added at specific locations by activating the corresponding neurons at those locations
- This is fairly specific to CNN architectures, where there is a locality correspondence between the intermediate representations and the output image
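The dissection IoU and the zeroing intervention described above can be sketched as follows; the function name, threshold, and array shapes are illustrative:

```python
import numpy as np

def unit_feature_iou(heatmap, feature_mask, threshold):
    """IoU between a thresholded unit activation map and a feature segmentation."""
    active = heatmap > threshold                       # binarize the heatmap
    inter = np.logical_and(active, feature_mask).sum()
    union = np.logical_or(active, feature_mask).sum()
    return inter / union if union > 0 else 0.0

# Dissection: score how well one unit's activations match a feature mask.
heat = np.array([[0.9, 0.2],
                 [0.8, 0.1]])
mask = np.array([[True, False],
                 [True, True]])
iou = unit_feature_iou(heat, mask, threshold=0.5)      # inter=2, union=3

# Intervention: zeroing a unit's channel removes its feature from the output.
features = np.ones((1, 4, 8, 8))   # (batch, channels, H, W)
features[:, 2] = 0.0               # ablate unit 2
```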