Unsupervised Learning

Taking the derivative with respect to <math>Q^{(i)}(j)</math> we get:<br>
<math>
\frac{\partial}{\partial Q^{(i)}(j)} \left[ \sum_{j=1}^{k} Q^{(i)}(j) \log \left( \frac{Pr(x^{(i)}, z^{(i)}=j;\theta)}{Q^{(i)}(j)} \right) + \beta \left( \sum_{j=1}^{k} Q^{(i)}(j) - 1 \right) \right]
</math><br>
<math>
= \log \left( \frac{Pr(x^{(i)}, z^{(i)}=j;\theta)}{Q^{(i)}(j)} \right) - Q^{(i)}(j) \cdot \frac{Q^{(i)}(j)}{Pr(x^{(i)}, z^{(i)}=j;\theta)} \cdot Pr(x^{(i)}, z^{(i)}=j;\theta) \cdot \left( Q^{(i)}(j) \right)^{-2} + \beta
</math><br>
<math>
= \log \left( \frac{Pr(x^{(i)}, z^{(i)}=j;\theta)}{Q^{(i)}(j)} \right) - 1 + \beta = 0
</math><br>
<math>
\implies Q^{(i)}(j) = \frac{1}{\exp(1-\beta)} Pr(x^{(i)}, z^{(i)}=j;\theta)
</math><br>
Since <math>Q^{(i)}</math> is a pmf, we know it sums to 1 over <math>j</math>; summing both sides gives <math>\exp(1-\beta) = \sum_{j=1}^{k} Pr(x^{(i)}, z^{(i)}=j;\theta) = Pr(x^{(i)};\theta)</math>, so we get the same result replacing <math>\frac{1}{\exp(1-\beta)}</math> with <math>\frac{1}{Pr(x^{(i)};\theta)}</math>:<br>
<math>
Q^{(i)}(j) = \frac{Pr(x^{(i)}, z^{(i)}=j;\theta)}{Pr(x^{(i)};\theta)} = Pr(z^{(i)}=j \mid x^{(i)};\theta)
</math>
}}
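In other words, the optimal <math>Q^{(i)}</math> is just the posterior over the latent variable: evaluate the joint for each <math>j</math> and normalize. As a minimal sketch of this E-step, assuming a Gaussian mixture where <math>Pr(x^{(i)}, z^{(i)}=j;\theta) = \phi_j \, \mathcal{N}(x^{(i)}; \mu_j, \Sigma_j)</math> (the names below are illustrative, not from the article):

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import multivariate_normal

def e_step(X, phi, mu, sigma):
    """E-step sketch for a hypothetical Gaussian mixture.

    Returns Q with Q[i, j] = Pr(z=j | x_i; theta), obtained by
    normalizing the joint Pr(x_i, z=j; theta) over j, as derived above.
    """
    n, k = X.shape[0], len(phi)
    joint = np.zeros((n, k))
    for j in range(k):
        # joint = prior phi_j times the Gaussian likelihood of x under component j
        joint[:, j] = phi[j] * multivariate_normal.pdf(X, mean=mu[j], cov=sigma[j])
    # divide by Pr(x_i; theta) = sum_j Pr(x_i, z=j; theta)
    return joint / joint.sum(axis=1, keepdims=True)
</syntaxhighlight>

Each row of the returned matrix is a pmf over the <math>k</math> components and sums to 1 by construction.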
=====M-Step=====
We will fix <math>Q</math> and maximize <math>J</math> with respect to <math>\theta</math>.
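For the same hypothetical Gaussian-mixture sketch, this maximization has closed-form updates: each parameter becomes a responsibility-weighted average under the fixed <math>Q</math>.

<syntaxhighlight lang="python">
import numpy as np

def m_step(X, Q):
    """M-step sketch for a hypothetical Gaussian mixture.

    With Q fixed, maximizing J over theta gives weighted-average updates
    for the mixing weights, means, and covariances.
    """
    n, d = X.shape
    k = Q.shape[1]
    nk = Q.sum(axis=0)                    # effective count of points per component
    phi = nk / n                          # mixing weights
    mu = (Q.T @ X) / nk[:, None]          # responsibility-weighted means
    sigma = np.zeros((k, d, d))
    for j in range(k):
        diff = X - mu[j]
        # responsibility-weighted covariance for component j
        sigma[j] = (Q[:, j, None] * diff).T @ diff / nk[j]
    return phi, mu, sigma
</syntaxhighlight>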