Deep Learning

===Optimization===
Apply gradient descent or stochastic gradient descent (SGD) to find the optimal weights <math display="inline">W^*</math>.
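
In full-batch gradient descent, each step uses all <math display="inline">n</math> training examples (with learning rate <math display="inline">\eta</math>, matching the notation of the stochastic update below):

:<math>W^{(t+1)} = W^{(t)} - \eta \frac{1}{n} \sum_{i=1}^{n} \nabla_{W} l(f_{W^{(t)}}(x_i), y_i)</math>

Computing this full sum is expensive for large <math display="inline">n</math>, which motivates the stochastic variant.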
 
Stochastic GD repeats two steps until convergence (a runnable sketch follows the list):
# Sample a mini-batch <math display="inline">B</math> of training examples.
# Update <math display="inline">W^{(t+1)} = W^{(t)} - \eta \frac{1}{|B|} \sum_{i \in B} \nabla_{W} l(f_{W^{(t)}}(x_i), y_i)</math>, where <math display="inline">\eta</math> is the learning rate.
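
A minimal runnable sketch of these two steps in Python with NumPy, assuming a linear model <math display="inline">f_W(x) = W^\top x</math> and squared loss <math display="inline">l(\hat{y}, y) = \tfrac{1}{2}(\hat{y} - y)^2</math>; the toy data and names (<code>sgd_step</code>, <code>lr</code>) are illustrative, not from this article:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

def sgd_step(W, X_batch, y_batch, lr):
    """One SGD update: W <- W - lr * (1/|B|) * sum_i grad_W l(f_W(x_i), y_i)."""
    residuals = X_batch @ W - y_batch            # f_W(x_i) - y_i for each i in B
    grad = X_batch.T @ residuals / len(X_batch)  # average gradient over the batch
    return W - lr * grad

# Toy regression data (hypothetical): y = X @ w_true + noise
X = rng.normal(size=(256, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.01 * rng.normal(size=256)

W = np.zeros(5)
for t in range(500):
    batch = rng.choice(len(X), size=32, replace=False)  # step 1: sample a batch B
    W = sgd_step(W, X[batch], y[batch], lr=0.1)         # step 2: gradient update

print(np.linalg.norm(W - w_true))  # should be small: W approaches w_true
</syntaxhighlight>

Taking <math display="inline">B</math> to be the entire training set recovers the full-batch update above.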


==Misc==