Unsupervised Learning: Difference between revisions
(One intermediate revision by the same user not shown) | |||
Line 126: | Line 126: | ||
<math>= \min_{\beta} \sum_{j=1}^{k} \left[\log(\frac{-1}{\beta}(\sum_{i}Q^{(i)}_{(j)})) \sum_{i=1}^{m} Q^{(i)}_{(j)} -(\sum_{i}Q^{(i)}_{(j)}) - (\beta/k) \right]</math><br> | <math>= \min_{\beta} \sum_{j=1}^{k} \left[\log(\frac{-1}{\beta}(\sum_{i}Q^{(i)}_{(j)})) \sum_{i=1}^{m} Q^{(i)}_{(j)} -(\sum_{i}Q^{(i)}_{(j)}) - (\beta/k) \right]</math><br> | ||
Taking the derivative with respect to <math>\beta</math>, we get:<br> | Taking the derivative with respect to <math>\beta</math>, we get:<br> | ||
<math> | <math display="block"> | ||
\sum_{j=1}^{k} [(\frac{1}{(-1/\beta)(\sum Q)})(-\sum Q)(-\beta^{-2})(\sum Q) - \frac{1}{k}] | \begin{aligned} | ||
\sum_{j=1}^{k} [(\frac{1}{(-1/\beta)(\sum Q)})(-\sum Q)(-\beta^{-2})(\sum Q) - \frac{1}{k}] | |||
&= \sum_{j=1}^{k} [ (\beta)(-\beta^{-2})(\sum Q) - \frac{1}{k}] \\ | |||
=\sum_{j=1}^{k} [ (\beta)(-\beta^{-2})(\sum Q) - \frac{1}{k}] | &= \sum_{j=1}^{k} [\frac{-1}{\beta}(\sum_{i=1}^{m} Q) - \frac{1}{k}]\\ | ||
&= [\sum_{i=1}^{m} \frac{-1}{\beta} \sum_{j=1}^{k}P(z^{(i)} = j | x^{(i)}) - \sum_{j=1}^{k}\frac{1}{k}]\\ | |||
&= [\frac{-1}{\beta}\sum_{i=1}^{m}1 - 1]\\ | |||
= | &= \frac{-m}{\beta} - 1 = 0\\ | ||
\implies \beta &= -m | |||
\end{aligned} | |||
=[\sum_{i=1}^{m} \frac{-1}{\beta} \sum_{j=1}^{k}P(z^{(i)} = j | x^{(i)}) - \sum_{j=1}^{k}\frac{1}{k}] | |||
= | |||
= \frac{-m}{\beta} - 1 = 0 | |||
\implies \beta = -m | |||
</math><br> | </math><br> | ||
Substituting <math>\beta = -m</math> into our equation for <math>\phi_j</math>, we get <math>\phi_j = \frac{1}{m}\sum_{i=1}^{m}Q^{(i)}_{(j)}</math>. | Substituting <math>\beta = -m</math> into our equation for <math>\phi_j</math>, we get <math>\phi_j = \frac{1}{m}\sum_{i=1}^{m}Q^{(i)}_{(j)}</math>.
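The closed-form result above can be checked numerically. The following is a minimal sketch, not part of the original derivation. It assumes the responsibilities <math>Q^{(i)}_{(j)}</math> are stored as an m-by-k list of rows that each sum to 1, and that the part of the objective depending on <math>\phi</math> is <math>\sum_{i}\sum_{j} Q^{(i)}_{(j)} \log \phi_j</math>. The function names (`mixture_weights`, `weighted_log_likelihood`, `random_simplex_point`) are hypothetical and chosen only for illustration.

```python
import math
import random


def mixture_weights(Q):
    """Return phi_j = (1/m) * sum_i Q[i][j] for an m-by-k responsibility matrix."""
    m = len(Q)
    k = len(Q[0])
    return [sum(Q[i][j] for i in range(m)) / m for j in range(k)]


def weighted_log_likelihood(Q, phi):
    """Assumed phi-dependent part of the objective: sum_i sum_j Q[i][j] * log(phi[j])."""
    return sum(q * math.log(p) for row in Q for q, p in zip(row, phi))


def random_simplex_point(k, rng):
    """Draw a random probability vector of length k (hypothetical helper)."""
    raw = [rng.random() + 1e-9 for _ in range(k)]
    total = sum(raw)
    return [r / total for r in raw]


# Synthetic responsibilities: each row is a distribution over k components.
rng = random.Random(0)
m, k = 50, 3
Q = [random_simplex_point(k, rng) for _ in range(m)]

# Closed-form update obtained from beta = -m.
phi = mixture_weights(Q)
best = weighted_log_likelihood(Q, phi)

# Compare against other feasible choices of phi on the probability simplex.
others = [weighted_log_likelihood(Q, random_simplex_point(k, rng)) for _ in range(200)]
```

Because each row of `Q` sums to 1, the returned weights sum to 1 automatically, which is what the Lagrange multiplier <math>\beta</math> enforces. Under the stated assumption, no entry of `others` should exceed `best`.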
Line 186: | Line 176: | ||
\begin{aligned} | \begin{aligned} | ||
KL(Q_i(z) \Vert P(z|X)) &= E_{Q} \left[ \log(\frac{Q_i(z)}{P(z|X)}) \right]\\ | KL(Q_i(z) \Vert P(z|X)) &= E_{Q} \left[ \log(\frac{Q_i(z)}{P(z|X)}) \right]\\ | ||
&=E_Q(\log(\frac{Q_i(z) P(X^{(i)})}{P( | &=E_Q(\log(\frac{Q_i(z) P(X^{(i)})}{P(X|z)P(z)}))\\ | ||
&=E_Q(\log(\frac{Q_i(z)}{P(z)})) + \log(P(x^{(i)})) - E_Q(\log(P(X|z)))\\ | &=E_Q(\log(\frac{Q_i(z)}{P(z)})) + \log(P(x^{(i)})) - E_Q(\log(P(X|z)))\\ | ||
&=KL(Q_i(z) \Vert P(z)) + \log(P(x^{(i)})) - E_Q(\log(P(X|z))) | &=KL(Q_i(z) \Vert P(z)) + \log(P(x^{(i)})) - E_Q(\log(P(X|z))) |