Deep Learning

* RMSProp
* Adam (see the update sketch below)
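
A minimal sketch of a single Adam update step, assuming a NumPy setting (the function <code>adam_step</code> and its default hyperparameters are illustrative, not from the article). RMSProp is essentially the special case with no first-moment term and no bias correction.

<syntaxhighlight lang="python">
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # Exponential moving averages of the gradient and its elementwise square
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction: m and v start at 0, so early estimates must be scaled up
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Per-coordinate step, scaled by the RMS of recent gradients
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
</syntaxhighlight>
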
==DL Optimization==
The role of "over-parameterization".
In general, the loss landscape can contain poor local minima and saddle points (where the Hessian has both positive and negative eigenvalues).
However, in practice GD and SGD tend to find good solutions.
Lecture 2 (Sept 3) covers Liu ''et al.''<ref name="liu2020towards"></ref>
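
A quick numerical illustration of the saddle-point case (a sketch, not from the lecture): on <math>f(x, y) = x^2 - y^2</math> the Hessian is <math>\operatorname{diag}(2, -2)</math>, with one positive and one negative eigenvalue, and plain GD started slightly off the saddle's stable manifold drifts away along the negative-curvature direction.

<syntaxhighlight lang="python">
import numpy as np

# f(x, y) = x^2 - y^2 has a saddle at the origin:
# its Hessian diag(2, -2) has one positive and one negative eigenvalue.
def grad_f(p):
    x, y = p
    return np.array([2 * x, -2 * y])

p = np.array([1.0, 1e-6])  # start barely off the stable manifold {y = 0}
lr = 0.1
for _ in range(100):
    p = p - lr * grad_f(p)

# x has shrunk toward 0, while the tiny y-perturbation has blown up:
# GD escapes the saddle along the negative-curvature direction.
print(p)
</syntaxhighlight>
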


==Misc==