Machine Learning: Difference between revisions

Machine Learning (view source)

880 bytes removed, 10 November 2019

no edit summary

@@ Line 1: / Line 1: @@
 Machine Learning
-==Loss functions==
+==Loss functisons==
 ===(Mean) Squared Error===
 The squared error is:<br>
@@ Line 23: / Line 23: @@
 ;Notes
 * If our model is <math>g(\theta^Tx^{(i)})</math>, where <math>g(x)</math> is the sigmoid function <math>\frac{e^x}{1+e^x}</math>, then this loss is convex in <math>\theta</math>
-{{hidden | Proof |
-<math>
-\begin{aligned}
-\nabla_\theta J(\theta) &= -\nabla_\theta \sum [(y^{(i)})\log(g(\theta^T x^{(i)})) + (1-y^{(i)})\log(1-g(\theta^T x^{(i)}))]\\
-&= -\sum [(y^{(i)})\frac{g(\theta^T x^{(i)})(1-g(\theta^T x^{(i)}))}{g(\theta^T x^{(i)})}x^{(i)} + (1-y^{(i)})\frac{-g(\theta^T x^{(i)})(1-g(\theta^T x^{(i)}))}{1-g(\theta^T x^{(i)})}x^{(i)}]\\
-&= -\sum [(y^{(i)})(1-g(\theta^T x^{(i)}))x^{(i)} - (1-y^{(i)})g(\theta^T x^{(i)})x^{(i)}]\\
-&= -\sum [(y^{(i)})x^{(i)} -(y^{(i)}) g(\theta^T x^{(i)})x^{(i)} - g(\theta^T x^{(i)})x^{(i)} + y^{(i)}g(\theta^T x^{(i)})x^{(i)}]\\
-&= -\sum [(y^{(i)})x^{(i)} - g(\theta^T x^{(i)})x^{(i)}]\\
-\implies \nabla^2_\theta J(\theta) &= \nabla_\theta -\sum [(y^{(i)})x^{(i)} - g(\theta^T x^{(i)})x^{(i)}]\\
-&= \sum g(\theta^T x^{(i)})(1-g(\theta^T x^{(i)})) x^{(i)} (x^{(i)})^T\\
-\end{aligned}
-</math><br>
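-which is a positive semidefinite (PSD) matrix, since each term is a nonnegative weight <math>g(1-g)</math> times the outer product <math>x^{(i)} (x^{(i)})^T</math>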
-}}
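The convexity claim for the sigmoid model can be checked numerically. The following is a minimal sketch, not part of the original page: it assumes the cross-entropy loss <math>J(\theta)</math> with gradient <math>-\sum (y^{(i)} - g(\theta^T x^{(i)}))x^{(i)}</math> and Hessian <math>\sum g(1-g)\, x^{(i)} (x^{(i)})^T</math>, and evaluates the quadratic form <math>v^T H v</math> on synthetic data. The function names (<code>gradient</code>, <code>hessian</code>, <code>quadratic_form</code>) and the random data are illustrative assumptions.

```python
import math
import random


def sigmoid(z):
    # g(z) = e^z / (1 + e^z), evaluated in a numerically stable way
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def gradient(theta, xs, ys):
    # grad J(theta) = -sum_i (y_i - g(theta^T x_i)) x_i
    grad = [0.0] * len(theta)
    for x, y in zip(xs, ys):
        residual = y - sigmoid(dot(theta, x))
        for j, xj in enumerate(x):
            grad[j] -= residual * xj
    return grad


def hessian(theta, xs):
    # H(theta) = sum_i g_i (1 - g_i) x_i x_i^T, with g_i = g(theta^T x_i)
    n = len(theta)
    H = [[0.0] * n for _ in range(n)]
    for x in xs:
        g = sigmoid(dot(theta, x))
        weight = g * (1.0 - g)  # always in [0, 0.25]
        for j in range(n):
            for k in range(n):
                H[j][k] += weight * x[j] * x[k]
    return H


def quadratic_form(H, v):
    # v^T H v; nonnegative for every v when H is PSD
    n = len(v)
    return sum(v[j] * H[j][k] * v[k] for j in range(n) for k in range(n))


# Synthetic data (an assumption for illustration only)
rng = random.Random(0)
xs = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(20)]
ys = [rng.randint(0, 1) for _ in range(20)]
theta = [rng.gauss(0, 1) for _ in range(3)]

H = hessian(theta, xs)
min_form = min(
    quadratic_form(H, [rng.gauss(0, 1) for _ in range(3)])
    for _ in range(1000)
)
print("smallest sampled v^T H v:", min_form)
```

Sampling random directions does not prove positive semidefiniteness; it only fails to find a counterexample. The argument itself is the algebraic one: every summand of the Hessian is a nonnegative scalar times an outer product.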
 ===Hinge Loss===