Unsupervised Learning: Difference between revisions

Line 75: Line 75:
\geq \sum_{i=1}^{m} \sum_{j=1}^{k} Q^{(i)}_{(j)} \log(\frac{P(x^{(i)}, z^{(i)}=j; \theta)
}{Q^{(i)}_{(j)}})
</math>
</math><br>
<math>
\implies \log\left[E_{Q}\left(\frac{P(x^{(i)}, z^{(i)}; \theta)}{Q^{(i)}(z^{(i)})}\right)\right] \geq E_{Q}\left[\log\left(\frac{P(x^{(i)}, z^{(i)}; \theta)}{Q^{(i)}(z^{(i)})}\right)\right]
</math><br>
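The bound can be checked numerically on a toy example. The sketch below is illustrative only: the joint values and the pmf <math>Q</math> are made-up numbers for a single data point with <math>k = 3</math> latent states, not values from this article.

```python
import numpy as np

# Hypothetical joint values P(x, z=j; theta) for one fixed x, k = 3 latent states.
joint = np.array([0.10, 0.25, 0.05])
# An arbitrary pmf Q over the latent states (assumed for illustration).
Q = np.array([0.5, 0.3, 0.2])

ratio = joint / Q
lhs = np.log(np.sum(Q * ratio))   # log E_Q[P/Q], which equals log P(x; theta)
rhs = np.sum(Q * np.log(ratio))   # E_Q[log(P/Q)], the lower bound

print(lhs, rhs)
```

Here <code>lhs</code> is the log-likelihood of the data point and <code>rhs</code> is its lower bound. For this arbitrary <math>Q</math> the inequality is strict.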
; E-Step
We fix <math>\theta</math> and optimize with respect to <math>Q</math>, choosing <math>Q</math> so that the lower bound holds with equality.<br>
Jensen's inequality holds with equality iff either the function is linear or the random variable is degenerate (constant with probability 1).<br>
Since log is not linear, we require <math>\frac{P(x^{(i)}, z^{(i)}=j; \theta)}{Q^{(i)}_{(j)}}</math> to be a constant that does not depend on <math>j</math>.<br>
This implies <math>Q^{(i)}(j) = c \cdot P(x^{(i)}, z^{(i)} = j ; \theta)</math> for some constant <math>c</math>.<br>
Since <math>Q^{(i)}</math> is a pmf, it must sum to 1 over <math>j</math>, so <math>c = \frac{1}{P(x^{(i)}; \theta)}</math> and <math>Q^{(i)}(j) = \frac{P(x^{(i)}, z^{(i)} = j ; \theta)}{P(x^{(i)}; \theta)} = P(z^{(i)} = j \mid x^{(i)}; \theta)</math>.<br>
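A minimal sketch of this E-step for a single data point follows. The function names <code>e_step</code> and <code>lower_bound</code> and the joint values are hypothetical, chosen only for illustration. The sketch normalizes the joint values to obtain the posterior and checks that the lower bound then equals the log-likelihood.

```python
import numpy as np

def e_step(joint):
    """Return Q(j) = P(z=j | x; theta) given joint values P(x, z=j; theta)."""
    joint = np.asarray(joint, dtype=float)
    # Dividing by P(x; theta) = sum_j P(x, z=j; theta) makes Q sum to 1.
    return joint / joint.sum()

def lower_bound(joint, Q):
    """Return sum_j Q(j) * log(P(x, z=j; theta) / Q(j))."""
    joint = np.asarray(joint, dtype=float)
    Q = np.asarray(Q, dtype=float)
    return float(np.sum(Q * np.log(joint / Q)))

# Hypothetical joint values for one data point with k = 3 latent states.
joint = [0.10, 0.25, 0.05]
Q = e_step(joint)
log_lik = float(np.log(np.sum(joint)))

print(Q)                              # posterior over the latent states
print(lower_bound(joint, Q), log_lik) # the two values coincide
```

With <math>Q</math> set to the posterior, every ratio <math>P/Q</math> equals <math>P(x^{(i)}; \theta)</math>, so the bound is tight. Any other pmf gives a strictly smaller value.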
; M-Step
We fix <math>Q</math> and optimize with respect to <math>\theta</math>.