Unsupervised Learning

Taking the gradient and setting it to 0, we get:<br>
<math>
\begin{align}
\nabla L(\mu, \mathbf{z}) &= \nabla \sum_{i} \Vert x^{(i)} - \mu_{z^{(i)}} \Vert ^2\\
&= \nabla \sum_{j=1}^{k} \sum_{i \mid z(i)=j} \Vert x^{(i)} - \mu_{z^{(i)}} \Vert ^2\\
&= \nabla \sum_{j=1}^{k} \sum_{i \mid z(i)=j} \Vert x^{(i)} - \mu_{j} \Vert ^2\\
&= \sum_{j=1}^{k} \sum_{i \mid z(i)=j} \nabla \Vert x^{(i)} - \mu_{j} \Vert ^2\\
&= \sum_{j=1}^{k} \sum_{i \mid z(i)=j} -2(x^{(i)} - \mu_{j})\\
\implies \mu_{j} &= \frac{\sum_{i \mid z(i)=j} x^{(i)}}{\sum_{i \mid z(i)=j} 1} \quad \forall j
\end{align}
</math>
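
In other words, each <math>\mu_j</math> is the mean of the points currently assigned to cluster <math>j</math>. A minimal NumPy sketch of this update, together with the assignment step it alternates with in k-means, is below; it assumes the data is an <math>n \times d</math> array <code>X</code> and <code>z</code> holds integer cluster indices (the function names are illustrative, not from any particular library):

<syntaxhighlight lang="python">
import numpy as np

def update_centroids(X, z, k):
    """Closed-form update from the derivation above:
    mu_j = (sum of points with z(i) = j) / (number of points with z(i) = j)."""
    mu = np.zeros((k, X.shape[1]))
    for j in range(k):
        members = X[z == j]            # points currently assigned to cluster j
        if len(members) > 0:           # leave an empty cluster's centroid at 0
            mu[j] = members.mean(axis=0)
    return mu

def assign_clusters(X, mu):
    """The other alternating step: assign each point to its nearest centroid."""
    dists = np.linalg.norm(X[:, None, :] - mu[None, :, :], axis=2)  # (n, k) distances
    return dists.argmin(axis=1)

def kmeans(X, k, n_iters=100, seed=0):
    """Alternate assignment and centroid updates for a fixed number of iterations."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=k, replace=False)]  # initialise from random data points
    for _ in range(n_iters):
        z = assign_clusters(X, mu)
        mu = update_centroids(X, z, k)
    return mu, z
</syntaxhighlight>

Each iteration can only decrease (or leave unchanged) the objective <math>L(\mu, \mathbf{z})</math>, since the assignment step minimizes it over <math>\mathbf{z}</math> with <math>\mu</math> fixed and the centroid step minimizes it over <math>\mu</math> with <math>\mathbf{z}</math> fixed.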