Machine Learning: Difference between revisions

684 bytes added, 9 November 2019

@@ Line 48: / Line 48: @@
 ===Lagrangians===
-The goal for svm is to maximize the margin:
+The goal of an SVM is to maximize the margin:<br>
 <math>
 \begin{aligned}
-\max_{\gamma, w, b\} &\frac{\gamma}{\Vert \gamma \Vert}\\
+\max_{\hat{\gamma}, w, b} &\frac{\hat{\gamma}}{\Vert w \Vert}\\
-\text{s.t.}& y^{(i)}(w^Tx^{(i)} + b) \geq \gamma \quad \forall i
+\text{s.t. }& y^{(i)}(w^Tx^{(i)} + b) \geq \hat{\gamma} \quad \forall i
 \end{aligned}
-</math>
+</math><br>
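+Rescaling <math>(w, b)</math> rescales <math>\hat{\gamma}</math> by the same factor and leaves the ratio unchanged, so we can set <math>\hat{\gamma}=1</math>. Maximizing <math>1/\Vert w \Vert</math> is then equivalent to:<br>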
+<math>
+\begin{aligned}
+\min_{w, b} &\Vert w \Vert ^2\\
+\text{s.t. }& y^{(i)}(w^Tx^{(i)} + b) \geq 1 \quad \forall i
+\end{aligned}
+</math><br><br>
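The rescaling step can be checked numerically. The following is a minimal sketch in plain Python, assuming a hypothetical toy data set and an arbitrary separating hyperplane. The names <code>X</code>, <code>y</code>, <code>functional_margin</code>, and <code>geometric_margin</code> are illustrative and do not come from the text above.

```python
import math

# Hypothetical toy data: two linearly separable classes in 2-D.
X = [(2.0, 2.0), (3.0, 3.0), (-1.0, -1.0), (-2.0, -3.0)]
y = [1.0, 1.0, -1.0, -1.0]


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def functional_margin(w, b):
    """Smallest value of y_i * (w^T x_i + b) over the data set."""
    return min(yi * (dot(w, xi) + b) for xi, yi in zip(X, y))


def geometric_margin(w, b):
    """Functional margin divided by ||w||, the quantity being maximized."""
    return functional_margin(w, b) / math.sqrt(dot(w, w))


# An arbitrary separating hyperplane (assumed, not necessarily optimal).
w, b = (3.0, 4.0), -2.0

# Rescale (w, b) so that the functional margin becomes 1.
gamma_hat = functional_margin(w, b)
w_scaled = tuple(wj / gamma_hat for wj in w)
b_scaled = b / gamma_hat

print("functional margin before:", functional_margin(w, b))
print("functional margin after: ", functional_margin(w_scaled, b_scaled))
print("geometric margin before: ", geometric_margin(w, b))
print("geometric margin after:  ", geometric_margin(w_scaled, b_scaled))
```

After rescaling, the geometric margin equals <math>1/\Vert w \Vert</math> for the rescaled <math>w</math>, which is why maximizing the margin reduces to minimizing <math>\Vert w \Vert^2</math> subject to a functional margin of at least 1.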
+In general, given an optimization problem in the (primal) form:<br>
+<math>
+\begin{aligned}
+\min_w & f(w)\\
+\text{s.t. }& h_i(w) \leq 0 \quad \forall i\\
+& g_i(w) = 0
+\end{aligned}
+</math><br>
+we can rewrite the optimization problem as<br>
+<math>
+\min_{w}\max_{\alpha, \beta \mid \alpha \geq 0} \mathcal{L}(w, \alpha, \beta)
+</math><br>
+where <math>\mathcal{L}(w, \alpha, \beta) = f(w) + \sum_i \alpha_i h_i(w) + \sum_i \beta_i g_i(w)</math> is called the Lagrangian.
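As a worked instance, the SVM problem above fits this form with inequality constraints only. The sketch below assumes a nonnegative multiplier <math>\alpha_i \geq 0</math> attached to each inequality constraint and treats <math>(w, b)</math> as the primal variables. Under those assumptions the Lagrangian would read:<br>
<math>
\begin{aligned}
h_i(w, b) &= 1 - y^{(i)}(w^Tx^{(i)} + b) \leq 0 \quad \forall i\\
\mathcal{L}(w, b, \alpha) &= \Vert w \Vert^2 + \sum_i \alpha_i \left(1 - y^{(i)}(w^Tx^{(i)} + b)\right)
\end{aligned}
</math><br>
For a feasible <math>(w, b)</math>, every term <math>\alpha_i h_i(w, b)</math> is at most 0, so the inner maximum over <math>\alpha \geq 0</math> equals <math>\Vert w \Vert^2</math>. If some constraint is violated, the inner maximum is unbounded. This is why the min-max form recovers the primal problem.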
 ===Kernel Trick===