Machine Learning

\min_{w}\max_{\alpha, \beta \mid \alpha \geq 0} \mathcal{L}(w, \alpha, \beta)
</math><br>
where <math>\mathcal{L}(w, \alpha, \beta) = f(w) + \sum \alpha_i g_i(w) + \sum \beta_i h_i(w)</math> is called the Lagrangian.<br>
Since <math>\max \min f \leq \min \max f</math> (weak duality),<br>
we have:<br>
<math>
\max_{\alpha, \beta \mid \alpha \geq 0}\min_{w} \mathcal{L}(w, \alpha, \beta) \leq \min_{w}\max_{\alpha, \beta \mid \alpha \geq 0} \mathcal{L}(w, \alpha, \beta)
</math><br>
The left term is called the dual problem; the right term is the original (primal) problem.<br>
If the primal and dual solutions satisfy certain conditions, called the KKT conditions, then the dual problem has the same optimal value as the original problem.
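
The relationship between the two problems can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the toy problem (minimize <math>f(w) = w^2</math> subject to <math>g(w) = 1 - w \leq 0</math>, with no equality constraints), the grid ranges, and all function names are hypothetical choices, not part of the original notes. The maximization over <math>\alpha</math> is approximated on a finite grid.

```python
# Toy problem (an illustrative assumption, not taken from the notes):
#   minimize f(w) = w^2   subject to   g(w) = 1 - w <= 0
# There are no equality constraints, so beta does not appear.

def lagrangian(w, alpha):
    """L(w, alpha) = f(w) + alpha * g(w)."""
    return w * w + alpha * (1.0 - w)

def dual_function(alpha):
    """min over w of L(w, alpha); setting dL/dw = 0 gives w = alpha / 2."""
    return lagrangian(alpha / 2.0, alpha)

def primal_function(w, alphas):
    """max over alpha >= 0 of L(w, alpha), approximated on a finite grid."""
    return max(lagrangian(w, a) for a in alphas)

alphas = [i / 100.0 for i in range(0, 501)]   # alpha in [0, 5]
ws = [i / 100.0 for i in range(-300, 301)]    # w in [-3, 3]

# Dual problem: max over alpha of (min over w of L).
dual_value = max(dual_function(a) for a in alphas)
# Primal problem: min over w of (max over alpha of L).
primal_value = min(primal_function(w, alphas) for w in ws)

# Maximizer of the dual and the corresponding minimizer of L in w.
best_alpha = max(alphas, key=dual_function)
best_w = best_alpha / 2.0

print("dual value:  ", dual_value)
print("primal value:", primal_value)
print("alpha * g(w) at the optimum:", best_alpha * (1.0 - best_w))
```

For this toy problem the dual value does not exceed the primal value, as the inequality requires, and the two coincide. The last printed quantity is the complementary slackness term <math>\alpha g(w)</math>, one of the KKT conditions, which vanishes at the optimum.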


===Kernel Trick===