===Batch Size===
[https://medium.com/mini-distill/effect-of-batch-size-on-training-dynamics-21c14f7a716e A medium post empirically evaluating the effect of batch_size]
==Loss functions==
===(Mean) Squared Error===
The squared error is:<br>
<math>J(\theta) = \sum_i |h_{\theta}(x^{(i)}) - y^{(i)}|^2</math><br>
If our model is linear regression <math>h(x)=w^T x</math> then this is convex.<br>
{{hidden|Proof|
<math>
\begin{aligned}
\nabla_{w} J(w) &= \nabla_{w} \sum_i (w^T x^{(i)} - y^{(i)})^2\\
&= 2\sum_i (w^T x^{(i)} - y^{(i)})x^{(i)} \\
\implies \nabla_{w}^2 J(w) &= \nabla_{w}\, 2\sum_i (w^T x^{(i)}-y^{(i)})x^{(i)}\\
&= 2\sum_i x^{(i)} (x^{(i)})^T
\end{aligned}
</math><br>
so the Hessian is positive semi-definite
}}
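The convexity argument above can be checked numerically: for linear regression the Hessian of the squared error is <math>2\sum_i x^{(i)} (x^{(i)})^T</math>, which is independent of <math>w</math> and has no negative eigenvalues. A minimal sketch, using randomly generated data (the data and shapes here are assumptions for illustration):

```python
import numpy as np

# Random design matrix X (rows are examples x^{(i)}) and targets y.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
y = rng.normal(size=10)
w = rng.normal(size=3)

# Gradient of J(w) = sum_i (w^T x_i - y_i)^2:  2 * sum_i (w^T x_i - y_i) x_i
grad = 2 * X.T @ (X @ w - y)

# Hessian: 2 * sum_i x_i x_i^T = 2 X^T X, independent of w.
H = 2 * X.T @ X

# All eigenvalues are nonnegative (up to floating-point error),
# so the Hessian is positive semi-definite and J is convex in w.
eigvals = np.linalg.eigvalsh(H)
print(bool(np.all(eigvals >= -1e-10)))
```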
===Cross Entropy===
===Hinge Loss===
==Optimization==