(→Basics) |
|||
Line 33: | Line 33: | ||
===Optimization=== | ===Optimization=== | ||
Apply gradient descent to find \(W^*\). | Apply gradient descent or stochastic gradient descent to find \(W^*\). | ||
Stochastic GD: | |||
# Sample some batch <math display="inline">B</math> | |||
# <math display="inline">W^{(t+1)} = W^{(t)} - \eta \frac{1}{|B|} \sum_{i \in B} \nabla_{W} l(f_{W}(x_i), y_i)</math>, with the gradient evaluated at <math display="inline">W = W^{(t)}</math> | |||
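The two steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the linear model <math display="inline">f_W(x) = W \cdot x</math>, the squared loss, the toy data, and the names <code>sgd_step</code>, <code>squared_loss_grad</code> and <code>sgd</code> are all assumptions made for the example.

```python
import random


def sgd_step(w, batch, grad_loss, eta):
    """One update: w <- w - eta * (1/|B|) * sum over the batch of grad_loss."""
    avg = [0.0] * len(w)
    for x, y in batch:
        g = grad_loss(w, x, y)
        for j, gj in enumerate(g):
            avg[j] += gj / len(batch)
    return [wj - eta * aj for wj, aj in zip(w, avg)]


def squared_loss_grad(w, x, y):
    # Assumed model and loss: f_w(x) = w . x and l = 0.5 * (f_w(x) - y)^2,
    # so the gradient with respect to w is (f_w(x) - y) * x.
    residual = sum(wj * xj for wj, xj in zip(w, x)) - y
    return [residual * xj for xj in x]


def sgd(data, dim, eta=0.1, batch_size=4, steps=1000, seed=0):
    rng = random.Random(seed)
    w = [0.0] * dim
    for _ in range(steps):
        batch = rng.sample(data, batch_size)            # step 1: sample a batch B
        w = sgd_step(w, batch, squared_loss_grad, eta)  # step 2: averaged gradient step
    return w


# Toy, noise-free data generated from y = 2*x1 - 3*x2 (an assumption for the demo).
data_rng = random.Random(1)
data = []
for _ in range(50):
    x = [data_rng.uniform(-1.0, 1.0), data_rng.uniform(-1.0, 1.0)]
    data.append((x, 2.0 * x[0] - 3.0 * x[1]))

w_hat = sgd(data, dim=2)
print(w_hat)  # should land close to the generating weights [2, -3]
```

Setting <code>batch_size</code> to <code>len(data)</code> makes every batch the full dataset, which recovers ordinary gradient descent.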
==Misc== | ==Misc== |