==SVM==
[http://cs229.stanford.edu/notes/cs229-notes3.pdf Andrew Ng Notes]<br>
Support Vector Machine<br>
An SVM is a linear classifier, like the perceptron, except the goal is not just to classify the data correctly but also to maximize the margin.<br>
<math>h_{w,b}(x) = g(w^T x + b)</math> where <math>g(z) = I[z \geq 0] - I[z < 0]</math> is the sign function.<br>
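To make the decision rule concrete, here is a minimal NumPy sketch (not from the notes; the values of <code>w</code>, <code>b</code>, and <code>x</code> are made up):
<syntaxhighlight lang="python">
import numpy as np

def h(w, b, x):
    # Linear classifier h_{w,b}(x) = g(w^T x + b), with g the sign function.
    return 1 if np.dot(w, x) + b >= 0 else -1

# Hypothetical parameters and example point.
w = np.array([2.0, -1.0])
b = 0.5
x = np.array([1.0, 3.0])
print(h(w, b, x))  # -1, since 2*1 - 1*3 + 0.5 = -0.5 < 0
</syntaxhighlight>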
===Margins===
The margin, denoted by <math>\gamma</math>, is the distance between our decision boundary and the closest training point.<br>
;Functional Margin
The margin corresponding to one example is:<br>
<math>\hat{\gamma}^{(i)} = y^{(i)}(w^Tx^{(i)}+b)</math>
The functional margin for our entire sample is the smallest of the per-example margins: <math>\hat{\gamma} = \min_i \hat{\gamma}^{(i)}</math>.
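As a sketch of the definition (the sample below is made up, not from the notes), the functional margins can be computed in a vectorized way:
<syntaxhighlight lang="python">
import numpy as np

# Hypothetical linearly separable sample: rows of X, labels y in {-1, +1}.
X = np.array([[1.0, 2.0], [2.0, 0.5], [-1.0, -1.0]])
y = np.array([1, 1, -1])
w, b = np.array([1.0, 1.0]), -1.0  # assumed classifier parameters

# Per-example functional margins y^(i) (w^T x^(i) + b).
margins = y * (X @ w + b)
print(margins)        # [2.  1.5 3. ]
print(margins.min())  # functional margin of the whole sample: 1.5
</syntaxhighlight>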
;Geometric Margin
The geometric margin is the actual Euclidean distance to the hyperplane.<br>
<math>\gamma^{(i)} = y^{(i)}\left(\left(\frac{w}{\Vert w \Vert}\right)^T x^{(i)} + \frac{b}{\Vert w \Vert}\right)</math><br>
* <math>w</math> is the normal vector of our hyperplane, so <math>\left(\frac{w}{\Vert w \Vert}\right)^T x^{(i)}</math> is the length of the projection of <math>x^{(i)}</math> onto the normal vector.
: Together with the offset <math>\frac{b}{\Vert w \Vert}</math>, this gives the signed distance from <math>x^{(i)}</math> to our hyperplane.
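Since the geometric margin is just the functional margin divided by <math>\Vert w \Vert</math>, the earlier sketch extends directly (same made-up sample):
<syntaxhighlight lang="python">
import numpy as np

# Same hypothetical sample and parameters as in the functional-margin sketch.
X = np.array([[1.0, 2.0], [2.0, 0.5], [-1.0, -1.0]])
y = np.array([1, 1, -1])
w, b = np.array([1.0, 1.0]), -1.0

# Geometric margins y^(i) ((w/||w||)^T x^(i) + b/||w||),
# i.e. the functional margins scaled by 1/||w||.
geo_margins = y * (X @ w + b) / np.linalg.norm(w)
print(geo_margins.min())  # ~1.0607 = 1.5 / sqrt(2)
</syntaxhighlight>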
===Lagrangians===
===Kernel Trick===
Oftentimes, using linear classifiers such as perceptron and SVM may fail to classify data for which the true decision boundary is non-linear.<br>