
* KL is always non-negative: <math>KL(P \Vert Q) \geq 0</math>

* KL is not symmetric: in general <math>KL(P \Vert Q) \neq KL(Q \Vert P)</math>

* Jensen-Shannon Divergence

** <math>JSD(P \Vert Q) = \frac{1}{2}KL(P \Vert M) + \frac{1}{2}KL(Q \Vert M)</math> where <math>M = \frac{1}{2}(P + Q)</math>

** This is symmetric; see the sketch below
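
A minimal NumPy sketch of both divergences for discrete distributions (the helper names <code>kl</code> and <code>jsd</code> and the example distributions are just for illustration):

<syntaxhighlight lang="python">
import numpy as np

def kl(p, q):
    """KL(P || Q) for discrete distributions; assumes q > 0 wherever p > 0."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0  # terms with p_i = 0 contribute 0 to the sum
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def jsd(p, q):
    """Jensen-Shannon divergence via the mixture M = (P + Q) / 2."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p = [0.4, 0.6]
q = [0.7, 0.3]
print(kl(p, q), kl(q, p))    # both non-negative, but not equal: KL is asymmetric
print(jsd(p, q), jsd(q, p))  # equal: JSD is symmetric
</syntaxhighlight>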

====Model====

The main idea is to ensure that the discriminator is Lipschitz continuous and to limit its Lipschitz constant (i.e. bound the norm of its gradient).<br>

If the correct answer is 1.0 and the generator produces 1.0001, we don't want the discriminator to give us a very large loss.<br>
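
One standard way to enforce such a bound is weight clipping, as in the original WGAN; a minimal PyTorch sketch, where the toy <code>critic</code> architecture and the clip value 0.01 are assumptions for illustration:

<syntaxhighlight lang="python">
import torch
import torch.nn as nn

# A toy critic (discriminator); the architecture is an assumption.
critic = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))

def clip_weights(model, c=0.01):
    """Clamp every parameter to [-c, c] (WGAN-style weight clipping).
    This crudely bounds the Lipschitz constant of the network."""
    with torch.no_grad():
        for p in model.parameters():
            p.clamp_(-c, c)

# Called after each optimizer step on the critic:
clip_weights(critic)
</syntaxhighlight>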

====Earth mover's distance====

{{main | wikipedia:earth mover's distance}}

The minimum cost of converting one pile of dirt into another.<br>

Here the cost of a move is the amount of dirt moved times the distance it travels.<br>

Given a set <math>P</math> with m clusters and a set <math>Q</math> with n clusters, let <math>d_{i,j}</math> be the ground distance between cluster i of <math>P</math> and cluster j of <math>Q</math>, and let <math>f_{i,j}</math> be the flow (amount of dirt moved) from cluster i to cluster j that minimizes the total cost <math>\sum_{i=1}^{m}\sum_{j=1}^{n}f_{i,j}d_{i,j}</math> subject to the supply and demand constraints. Then:<br>

<math>EMD(P, Q) = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n}f_{i,j}d_{i,j}}{\sum_{i=1}^{m}\sum_{j=1}^{n}f_{i,j}}</math><br>

;Notes

* Also known as the Wasserstein metric (see the example below)
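
In one dimension the Wasserstein-1 distance is cheap to compute; a small sketch using SciPy's <code>scipy.stats.wasserstein_distance</code> (the point positions and weights below are made up):

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import wasserstein_distance

# Two 1-D "piles of dirt" given as weighted point masses.
p_positions = np.array([0.0, 1.0, 3.0])
p_weights   = np.array([0.4, 0.4, 0.2])
q_positions = np.array([1.0, 2.0])
q_weights   = np.array([0.5, 0.5])

# Minimum total (amount * distance) needed to turn P into Q.
emd = wasserstein_distance(p_positions, q_positions, p_weights, q_weights)
print(emd)
</syntaxhighlight>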

==Dimension Reduction==

Goal: Reduce the dimensionality of a dataset.<br>

If each example <math>x \in \mathbb{R}^n</math>, we want to map each example to <math>\mathbb{R}^r</math> where <math>r < n</math>.

===PCA===

Principal Component Analysis<br>

Preprocessing: Subtract the sample mean from each example so that the new sample mean is 0.<br>

Goal: Find a unit vector <math>v_1</math> such that the projection <math>v_1 \cdot x</math> has maximum variance; each subsequent component maximizes variance subject to being orthogonal to the previous ones.<br>

These principal components are the eigenvectors of <math>X^TX</math>, ordered by decreasing eigenvalue. Projecting onto the top <math>r</math> eigenvectors gives the reduced representation in <math>\mathbb{R}^r</math>.<br>
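
A minimal NumPy sketch of this procedure (the helper name <code>pca</code> and the random data are just for illustration):

<syntaxhighlight lang="python">
import numpy as np

def pca(X, r):
    """PCA via the eigendecomposition of X^T X on mean-centered data.
    X: (num_examples, n) data matrix; r: target dimension."""
    Xc = X - X.mean(axis=0)                       # center: sample mean becomes 0
    eigvals, eigvecs = np.linalg.eigh(Xc.T @ Xc)  # symmetric matrix => eigh
    order = np.argsort(eigvals)[::-1]             # sort by decreasing eigenvalue
    V = eigvecs[:, order[:r]]                     # top-r principal components
    return Xc @ V, V                              # projected data and components

X = np.random.randn(100, 5)
Z, V = pca(X, r=2)
print(Z.shape)  # (100, 2)
</syntaxhighlight>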

===Kernel PCA===

===Autoencoder===
