The dual of <math>\min_{G} W_1(P_Y, P_{\hat{Y}})</math> is <math>\min_{G} \max_{D:\, \Vert D \Vert_L \leq 1} \left[ E[D(Y)] - E[D(\hat{Y})] \right]</math>, where the maximum is taken over 1-Lipschitz discriminators.   
The Lipschitz constraint on the discriminator can be enforced by weight clipping.
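As a rough illustration, here is a minimal sketch of one WGAN critic update with weight clipping, written in PyTorch; the names <code>critic</code>, <code>generator</code>, <code>real</code>, <code>z</code>, and <code>opt</code> are hypothetical placeholders, not part of the original notes.
<syntaxhighlight lang="python">
import torch

def critic_step(critic, generator, real, z, opt, clip=0.01):
    """One critic update for the dual objective E[D(Y)] - E[D(Y_hat)]."""
    opt.zero_grad()
    fake = generator(z).detach()  # do not backprop into the generator here
    # The critic ascends the dual objective, so we minimize its negation.
    loss = -(critic(real).mean() - critic(fake).mean())
    loss.backward()
    opt.step()
    # Enforce (a crude form of) the Lipschitz constraint by clipping
    # every critic weight into [-clip, clip].
    with torch.no_grad():
        for p in critic.parameters():
            p.clamp_(-clip, clip)
    return loss.item()
</syntaxhighlight>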
===How to evaluate GANs?===
;Inception Score
Use a pre-trained network (Inception-v3) to map a generated image to its class probabilities <math>p(y|x)</math>. 
<math>IS(G) = \exp \left( E_{x \sim P_{\hat{X}}} KL( p(y|x) \Vert p(y) ) \right)</math>
Mutual Information interpretation: 
<math>\log(IS(G)) = I(G(z); y) = H(y) - H(y|G(z))</math>
* The first term <math>H(y)</math> is large when the generated labels are diverse.
* The second term <math>H(y|G(z))</math> is small when the classifier assigns labels with high confidence.
IS can be misleading: a generator that produces only a single image per class still attains high label diversity, so the score hides the lack of intra-class diversity.
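For concreteness, here is a small sketch of computing IS from a matrix of predicted class probabilities; it assumes the probabilities have already been obtained from a pre-trained classifier such as Inception-v3.
<syntaxhighlight lang="python">
import numpy as np

def inception_score(probs, eps=1e-12):
    # probs: (N, C) array of class probabilities p(y|x) for N generated images.
    p_y = probs.mean(axis=0)  # marginal label distribution p(y)
    # KL(p(y|x) || p(y)) for each image, then average over images.
    kl = (probs * (np.log(probs + eps) - np.log(p_y + eps))).sum(axis=1)
    return float(np.exp(kl.mean()))  # IS(G) = exp(E_x[KL])
</syntaxhighlight>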
;FID Score
Use a pre-trained network (Inception) to extract features from an intermediate layer.
Then model the feature distribution as a multivariate Gaussian with mean <math>\mu</math> and covariance <math>\Sigma</math>. 
FID is the Fréchet Inception Distance: 
<math>FID(x, g) = \Vert \mu_{x} - \mu_{g} \Vert_2^2 + Tr\left(\Sigma_{x} + \Sigma_g - 2(\Sigma_x \Sigma_g)^{1/2}\right)</math>
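As an illustration, a minimal FID computation from two feature matrices might look like the following; <code>feats_x</code> and <code>feats_g</code> (real and generated features) are assumed inputs.
<syntaxhighlight lang="python">
import numpy as np
from scipy.linalg import sqrtm

def fid(feats_x, feats_g):
    # feats_x, feats_g: (N, d) intermediate-layer features for real
    # and generated images, respectively.
    mu_x, mu_g = feats_x.mean(axis=0), feats_g.mean(axis=0)
    sigma_x = np.cov(feats_x, rowvar=False)
    sigma_g = np.cov(feats_g, rowvar=False)
    # Matrix square root of the covariance product; sqrtm can return a
    # complex array due to numerical error, so keep the real part.
    covmean = sqrtm(sigma_x @ sigma_g).real
    return float(np.sum((mu_x - mu_g) ** 2)
                 + np.trace(sigma_x + sigma_g - 2.0 * covmean))
</syntaxhighlight>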
===A Statistical Approach to GANs===
GANs do not have explicit probability models. 
This is in contrast to maximum-likelihood models such as VAEs. 
GANs instead focus on minimizing a distance between distributions. 
This yields high-quality samples but no way to compute sample likelihoods.
VAEs maximize a lower bound on the likelihood, but tend to produce blurry samples.
The key idea is to posit an explicit model for the data: 
<math>f_{Y}(y|X=x) \propto \exp(-\ell(y, G(x))/\lambda)</math>
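Under this model, the unnormalized log-likelihood of an observation is just a scaled loss. A minimal sketch, assuming a squared-error loss <math>\ell</math> and ignoring the normalization constant:
<syntaxhighlight lang="python">
import numpy as np

def unnormalized_log_likelihood(y, gx, lam=1.0):
    # Observation model f_Y(y|x) proportional to exp(-l(y, G(x)) / lambda),
    # with a squared-error loss; gx = G(x) is the generator output.
    loss = np.sum((y - gx) ** 2)
    return -loss / lam  # log of the unnormalized density
</syntaxhighlight>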
;Theorem (BHCF 2019)
...
Entropic GANs meet VAEs.
===Distributionally Robust Wasserstein===
Robust Wasserstein:
<math>\min_{P_{\tilde{X}}, P_{\tilde{Y}}}</math>


==Misc==