Deep Learning

===Optimization===
Apply gradient descent or stochastic gradient descent (SGD) to find the optimal weights <math display="inline">W^*</math>.
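
In full-batch gradient descent, each step uses all <math display="inline">n</math> training examples (with learning rate <math display="inline">\eta</math>, matching the notation of the stochastic update below):

:<math>W^{(t+1)} = W^{(t)} - \eta \frac{1}{n} \sum_{i=1}^{n} \nabla_{W} l(f_{W^{(t)}}(x_i), y_i)</math>

Computing this full sum is expensive for large <math display="inline">n</math>, which motivates the stochastic variant.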
 
Stochastic GD repeats two steps until convergence (a runnable sketch follows the list):
# Sample a mini-batch <math display="inline">B</math> of training examples.
# Update <math display="inline">W^{(t+1)} = W^{(t)} - \eta \frac{1}{|B|} \sum_{i \in B} \nabla_{W} l(f_{W^{(t)}}(x_i), y_i)</math>, where <math display="inline">\eta</math> is the learning rate.
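
A minimal runnable sketch of these two steps in Python with NumPy, assuming a linear model <math display="inline">f_W(x) = W^\top x</math> and squared loss <math display="inline">l(\hat{y}, y) = \tfrac{1}{2}(\hat{y} - y)^2</math>; the toy data and names (<code>sgd_step</code>, <code>lr</code>) are illustrative, not from this article:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

def sgd_step(W, X_batch, y_batch, lr):
    """One SGD update: W <- W - lr * (1/|B|) * sum_i grad_W l(f_W(x_i), y_i)."""
    residuals = X_batch @ W - y_batch            # f_W(x_i) - y_i for each i in B
    grad = X_batch.T @ residuals / len(X_batch)  # average gradient over the batch
    return W - lr * grad

# Toy regression data (hypothetical): y = X @ w_true + noise
X = rng.normal(size=(256, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.01 * rng.normal(size=256)

W = np.zeros(5)
for t in range(500):
    batch = rng.choice(len(X), size=32, replace=False)  # step 1: sample a batch B
    W = sgd_step(W, X[batch], y[batch], lr=0.1)         # step 2: gradient update

print(np.linalg.norm(W - w_true))  # should be small: W approaches w_true
</syntaxhighlight>

Taking <math display="inline">B</math> to be the entire training set recovers the full-batch update above.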


==Misc==