Unsupervised Learning

By introducing an extra variable <math>Q^{(i)}_{(j)}</math>, we'll be able to apply Jensen's inequality to the concave log function.<br>
Assume that for each <math>i</math>, <math>Q^{(i)}_{(j)}</math> is a probability mass function over <math>j</math>.<br>
<math display="block">
\begin{align*}
l(\theta) &= \sum_{i=1}^{m} \log \sum_{j=1}^{k} P(x^{(i)}, z^{(i)}=j; \theta)\\
&= \sum_{i=1}^{m} \log \sum_{j=1}^{k} \frac{Q^{(i)}_{(j)}}{Q^{(i)}_{(j)}} P(x^{(i)}, z^{(i)}=j; \theta)\\
&\geq \sum_{i=1}^{m} \sum_{j=1}^{k} Q^{(i)}_{(j)} \log\left(\frac{P(x^{(i)}, z^{(i)}=j; \theta)}{Q^{(i)}_{(j)}}\right)\\
\implies \log\left[E_{Q}\left(\frac{P(x^{(i)}, z^{(i)}; \theta)}{Q^{(i)}_{(j)}}\right)\right] &\geq E_{Q}\left[\log\left(\frac{P(x^{(i)}, z^{(i)}; \theta)}{Q^{(i)}_{(j)}}\right)\right]
\end{align*}
</math>
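For reference (a standard fact, not spelled out above): Jensen's inequality for the concave <math>\log</math> states that <math>\log E[X] \geq E[\log X]</math> for any positive random variable <math>X</math>. Here it is applied, for each <math>i</math>, with
<math display="block">
X = \frac{P(x^{(i)}, z^{(i)}; \theta)}{Q^{(i)}_{(z^{(i)})}}, \qquad z^{(i)} \sim Q^{(i)},
</math>
so that <math>E_{Q}[X] = \sum_{j=1}^{k} Q^{(i)}_{(j)} \frac{P(x^{(i)}, z^{(i)}=j; \theta)}{Q^{(i)}_{(j)}} = \sum_{j=1}^{k} P(x^{(i)}, z^{(i)}=j; \theta)</math>, exactly the inner sum in the second line.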
Let <math>J(\theta, Q) = \sum_{i=1}^{m} \sum_{j=1}^{k} Q^{(i)}_{(j)} \log \left( \frac{P(x^{(i)}, z^{(i)}=j;\theta)}{Q^{(i)}_{(j)}} \right)</math>, so the inequality above says <math>l(\theta) \geq J(\theta, Q)</math> for every choice of <math>Q</math>.<br>
The EM algorithm is an iterative algorithm which alternates between an E-Step and an M-Step.
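To make the alternation concrete, below is a minimal sketch of EM for a two-component one-dimensional Gaussian mixture. The model choice, the crude initialization, and the function name <code>em_gmm_1d</code> are illustrative assumptions, not part of these notes; <code>Q[i, j]</code> plays the role of <math>Q^{(i)}_{(j)}</math>.
<syntaxhighlight lang="python">
import numpy as np

def em_gmm_1d(x, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture (illustrative sketch)."""
    # Crude initialization of theta = (pi, mu, sigma).
    pi = np.array([0.5, 0.5])             # mixing weights
    mu = np.array([x.min(), x.max()])     # component means
    sigma = np.array([x.std(), x.std()])  # component standard deviations

    for _ in range(n_iter):
        # E-step: fix theta and maximize J w.r.t. Q; the maximizer is
        # the posterior Q[i, j] = P(z_i = j | x_i; theta).
        joint = np.stack(
            [pi[j]
             * np.exp(-0.5 * ((x - mu[j]) / sigma[j]) ** 2)
             / (sigma[j] * np.sqrt(2.0 * np.pi))
             for j in range(2)],
            axis=1,
        )  # joint[i, j] = P(x_i, z_i = j; theta)
        Q = joint / joint.sum(axis=1, keepdims=True)

        # M-step: fix Q and maximize J w.r.t. theta
        # (closed form for a Gaussian mixture).
        Nj = Q.sum(axis=0)
        pi = Nj / len(x)
        mu = (Q * x[:, None]).sum(axis=0) / Nj
        sigma = np.sqrt((Q * (x[:, None] - mu) ** 2).sum(axis=0) / Nj)

    return pi, mu, sigma

# Example: recover two well-separated components.
x = np.concatenate([np.random.normal(-2.0, 1.0, 200),
                    np.random.normal(3.0, 1.0, 200)])
print(em_gmm_1d(x))
</syntaxhighlight>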
=====E-Step=====
We will fix <math>\theta</math> and maximize <math>J</math> with respect to <math>Q</math>.<br>
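With <math>\theta</math> fixed, the maximizing <math>Q</math> has a closed form: Jensen's bound is tight exactly when the ratio inside the log is constant in <math>j</math>, which forces <math>Q^{(i)}_{(j)}</math> to be proportional to the joint, i.e.
<math display="block">
Q^{(i)}_{(j)} = \frac{P(x^{(i)}, z^{(i)}=j; \theta)}{\sum_{j'=1}^{k} P(x^{(i)}, z^{(i)}=j'; \theta)} = P(z^{(i)}=j \mid x^{(i)}; \theta).
</math>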