We use SGD to optimize <math>\theta, \phi</math>.
Using the reparameterization trick, we write <math>z = \mu + \Sigma^{1/2}\epsilon</math> for <math>\epsilon \sim N(0, I)</math>, so the randomness is isolated in <math>\epsilon</math> and gradients can flow through <math>\mu</math> and <math>\Sigma</math> to the encoder parameters.
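A minimal PyTorch sketch of the trick for a diagonal-covariance Gaussian, assuming the encoder outputs the mean and log-variance (the function name is illustrative):
<syntaxhighlight lang="python">
import torch

def reparameterize(mu, logvar):
    """Draw z = mu + Sigma^{1/2} * eps with eps ~ N(0, I)."""
    std = torch.exp(0.5 * logvar)   # Sigma^{1/2} for a diagonal Gaussian
    eps = torch.randn_like(std)     # the only source of randomness
    return mu + std * eps           # gradients flow through mu and std
</syntaxhighlight>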
;ELBO
<math>\max_{\theta, \phi} E_{z \sim q(z|x)}[\log P(x|z)] - KL(q(z|x) \Vert P(z))</math>
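A sketch of the corresponding training loss, assuming a Bernoulli decoder and a diagonal Gaussian <math>q(z|x)</math> so the KL term has a closed form (names are illustrative):
<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def negative_elbo(x, x_recon, mu, logvar):
    # Reconstruction term: Monte Carlo estimate of E_{z~q}[log P(x|z)],
    # assuming a Bernoulli decoder over values in [0, 1].
    recon = F.binary_cross_entropy(x_recon, x, reduction='sum')
    # KL(q(z|x) || N(0, I)) in closed form for a diagonal Gaussian q.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl  # minimizing this maximizes the ELBO
</syntaxhighlight>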
Issue: Posterior collapse.
In practice, the approximate posterior sometimes stops depending on <math>x</math> (<math>q(z|x) \approx q(z)</math>), typically collapsing to the prior <math>P(z)</math>. This happens when the decoder is expressive enough to model <math>x</math> while ignoring <math>z</math>: the KL term then drives <math>q(z|x)</math> toward the prior and the latent code carries no information.
===β-VAE===
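β-VAE reweights the KL term of the ELBO with a coefficient <math>\beta</math>:
<math>\max_{\theta, \phi} E_{z \sim q(z|x)}[\log P(x|z)] - \beta \, KL(q(z|x) \Vert P(z))</math>
Setting <math>\beta > 1</math> penalizes deviation from the prior more strongly, which encourages disentangled latent factors at the cost of reconstruction quality.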


==Misc==