Deep Learning
===β-VAE===
VAEs have many design choices:
* Prior distribution <math>P(z)</math> chosen to be standard normal.
* Posterior distribution <math>q(z|x)</math> chosen to be <math>N(f(x), \sigma^2 I)</math>.
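A minimal sketch of sampling from this Gaussian posterior via the reparameterization trick; the encoder <code>f</code> below is a hypothetical stand-in for a learned network, and the fixed <code>sigma</code> matches the <math>\sigma^2 I</math> covariance above:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Hypothetical encoder: any mapping from the input to a latent mean.
    return x.mean(axis=-1, keepdims=True) * np.ones(4)

sigma = 0.1                # fixed posterior scale: q(z|x) = N(f(x), sigma^2 I)
x = rng.normal(size=8)     # toy input

eps = rng.normal(size=4)   # eps ~ N(0, I)
z = f(x) + sigma * eps     # reparameterized sample z ~ q(z|x)
```

Writing the sample as a deterministic function of <code>eps</code> is what lets gradients flow through the sampling step during training.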
However, these simple distributions often lead to blurry images.
One way to address this is to increase the expressiveness of the prior and posterior distributions.
This leads to Hierarchical VAEs.
The idea is that the latent variables are partitioned into disjoint groups:
<math>z = \{z_1, ..., z_L\}</math>
<math>P(z) = \prod_{l}P(z_l | z_{<l})</math>
<math>q(z|x) = \prod_{l}q(z_l | z_{<l}, x)</math>
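The prior factorization above can be sketched as a sum of conditional Gaussian log-densities, one per group, where each group's mean depends on the preceding groups; <code>mean_fn</code> is an illustrative stand-in for a learned network:

```python
import numpy as np

def gauss_logpdf(z, mu, sigma=1.0):
    # log N(z; mu, sigma^2 I), summed over the dimensions of the group.
    return float(np.sum(-0.5 * np.log(2 * np.pi * sigma ** 2)
                        - 0.5 * ((z - mu) / sigma) ** 2))

def mean_fn(prev_groups):
    # Stand-in for a learned network mapping z_{<l} to the mean of z_l.
    if not prev_groups:
        return 0.0
    return 0.5 * np.concatenate(prev_groups).mean()

def log_prior(groups):
    # log P(z) = sum_l log P(z_l | z_{<l})
    total = 0.0
    for l, z_l in enumerate(groups):
        total += gauss_logpdf(z_l, mean_fn(groups[:l]))
    return total

rng = np.random.default_rng(0)
z = [rng.normal(size=2), rng.normal(size=3)]  # L = 2 groups
print(log_prior(z))
```

The approximate posterior factorizes the same way, with <math>x</math> as an extra input to each conditional.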
Vahdat et al. create NVAE, which is a Hierarchical VAE plus some architectural tricks.
VQ-VAE (vector quantized VAE) performs quantization of the latent space.
The quantization is non-differentiable, but gradients can be copied through it with a straight-through estimator.
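A sketch of the quantization step with an illustrative codebook; the straight-through trick is written as <code>z + (q - z)</code>, where in an autograd framework the <code>(q - z)</code> term would be detached, so the backward pass treats quantization as the identity and the decoder's gradient is copied to the encoder output:

```python
import numpy as np

# Illustrative codebook of K = 3 embedding vectors.
codebook = np.array([[0.0, 0.0],
                     [1.0, 1.0],
                     [-1.0, 1.0]])

def quantize(z):
    # Nearest-neighbour lookup: return the closest codebook vector.
    dists = np.sum((codebook - z) ** 2, axis=1)
    return codebook[np.argmin(dists)]

z = np.array([0.9, 1.2])   # encoder output
q = quantize(z)            # non-differentiable step

# Straight-through estimator: the forward value equals q, but with (q - z)
# detached from the graph, d(z_q)/dz = I, so gradients pass through unchanged.
z_q = z + (q - z)
```

In practice the codebook is also trained, e.g. with a commitment loss pulling encoder outputs toward their assigned codebook vectors.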
==Misc==