Unsupervised Learning: Difference between revisions

Line 19:
Taking the gradient with respect to <math>\mu</math> and setting it to 0, we get:<br>
<math>
\nabla_{\mu} L(\mu, \mathbf{z}) = \nabla_{\mu} \sum_{i} \Vert x^{(i)} - \mu_{z^{(i)}} \Vert ^2
</math><br>
<math>
= \nabla_{\mu} \sum_{j=1}^{k} \sum_{i\mid z^{(i)}=j} \Vert x^{(i)} - \mu_{z^{(i)}} \Vert ^2
</math><br>
<math>
= \nabla_{\mu} \sum_{j=1}^{k} \sum_{i\mid z^{(i)}=j} \Vert x^{(i)} - \mu_{j} \Vert ^2
</math><br>
<math>
= \sum_{j=1}^{k} \sum_{i\mid z^{(i)}=j} \nabla_{\mu} \Vert x^{(i)} - \mu_{j} \Vert ^2
</math><br>
<math>
<math>