Deep Learning

Fitting a ''density'' function to discrete data is ill-posed: the density can place arbitrarily tall spikes on the discrete data points.   
For dequantization, add uniform noise to the data so that it has a well-defined, more stable density.
<math>
\begin{aligned}
\log \int p(x_i + \delta) p(\delta) \, d\delta &= \log E_{\delta}[p(x_i + \delta)]\\
&\geq E_{\delta}[\log p(x_i + \delta)]\\
&\approx \log p(x_i + \delta)
\end{aligned}
</math>
The inequality is Jensen's inequality, so training on dequantized data maximizes a lower bound on the log-likelihood; the last line is a single-sample Monte Carlo estimate of the expectation.
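As a minimal sketch of dequantization (the 8-bit pixel range and array shapes are illustrative assumptions, not from the notes), in NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical discrete data: 8-bit pixel intensities in {0, ..., 255}.
x_discrete = rng.integers(0, 256, size=(1000,)).astype(np.float64)

# Dequantize: add uniform noise delta ~ U[0, 1), one draw per data point.
# This corresponds to the single-sample Monte Carlo estimate above.
delta = rng.random(size=x_discrete.shape)
x_dequant = x_discrete + delta

# The dequantized values now fill the continuous interval [0, 256),
# so fitting a density to them is well-posed.
assert x_dequant.min() >= 0.0 and x_dequant.max() < 256.0
```

In practice the density model is then trained on <code>x_dequant</code>, with a fresh noise draw each epoch.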
* Flow-based models give exact likelihood estimates.
** In principle, this enables out-of-distribution (anomaly) detection by thresholding the likelihood.
However, in practice a model trained on CIFAR-10 assigns higher likelihood to MNIST images than to CIFAR-10 images. 
This behavior is not specific to flow-based models. 
Suppose <math>P_{\theta}</math> is <math>N(0, I_d)</math>. 
A typical sample has <math>\Vert x_i \Vert^2 = O(d)</math>, since squared norms concentrate around <math>d</math>. 
Consider <math>x^{test} = 0</math>; then <math>P_{\theta}(x^{test}) > P_{\theta}(x_i)</math>, even though an all-zero input is far from typical.
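The concentration argument can be checked numerically (the dimension and sample count below are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1000  # dimension of N(0, I_d)

# Squared norms of Gaussian samples concentrate around d (chi-squared with d dof).
x = rng.standard_normal((100, d))
sq_norms = (x ** 2).sum(axis=1)

# Up to the constant -(d/2) * log(2*pi), the log-density of N(0, I_d) is
#   log p(x) = -||x||^2 / 2,
# so a smaller norm means a higher density.
log_p_typical = -sq_norms.mean() / 2  # roughly -d/2 for typical samples
log_p_zero = 0.0                      # ||x^{test}||^2 = 0 at the origin

assert abs(sq_norms.mean() - d) < 0.1 * d  # norms concentrate near d
assert log_p_zero > log_p_typical          # the origin has the highest density
```

So the mode of a high-dimensional Gaussian has far higher density than any typical sample, which is why high likelihood alone does not make a point in-distribution.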


==Misc==