Deep Learning: Difference between revisions

Gradient descent (GD) converges even though our model does not approach a linear model.


===Takeaway===
Over-parameterization does not lead to linearization.
Over-parameterization leads to good conditioning, which implies the Polyak-Łojasiewicz (PL) condition and, in turn, convergence of GD/SGD.
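
The chain "good conditioning, then PL, then convergence" can be checked numerically on a toy problem. The sketch below is only an illustration under stated assumptions, not the construction used in these notes: the model f_i(w) = phi(x_i . w) with phi(z) = z + 0.5 sin(z), the sizes n = 5 and d = 50, the random data, and the step size are all hypothetical choices. The model is nonlinear in w, so it is not a linear model, yet because d is much larger than n, the tangent kernel K(w) = J(w) J(w)^T has its smallest eigenvalue bounded below by mu = 0.25 * lambda_min(X X^T) > 0. For the square loss this gives the PL-type inequality 0.5 * ||grad L(w)||^2 >= mu * L(w), which the loop verifies at every GD step.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 50                      # n samples, d parameters (d >> n): assumed sizes
X = rng.normal(size=(n, d))       # hypothetical random inputs
y = rng.normal(size=n)            # hypothetical random targets

def phi(z):
    # Nonlinear activation chosen for illustration; model is nonlinear in w.
    return z + 0.5 * np.sin(z)

def dphi(z):
    # Derivative of phi; always lies in [0.5, 1.5].
    return 1.0 + 0.5 * np.cos(z)

def residual(w):
    return phi(X @ w) - y

def loss(w):
    r = residual(w)
    return 0.5 * r @ r            # square loss L(w) = 0.5 * ||F(w) - y||^2

def grad(w):
    # grad L(w) = J(w)^T r(w), with Jacobian J(w) = diag(dphi(Xw)) X.
    return X.T @ (dphi(X @ w) * residual(w))

eigs = np.linalg.eigvalsh(X @ X.T)    # eigenvalues in ascending order
mu = 0.25 * eigs[0]                   # lower bound on lambda_min of the tangent kernel

w = np.zeros(d)
# Smoothness bound valid on the initial sublevel set (|phi''| <= 0.5, |phi'| <= 1.5).
beta = (2.25 + 0.5 * np.linalg.norm(residual(w))) * eigs[-1]
eta = 1.0 / beta                      # step size: an assumed, conservative choice

losses = []
pl_holds = True
for _ in range(3000):
    L, g = loss(w), grad(w)
    losses.append(L)
    # PL-type check: 0.5 * ||grad L||^2 >= mu * L (small slack for rounding).
    pl_holds = pl_holds and bool(0.5 * (g @ g) >= mu * L * (1 - 1e-9))
    w = w - eta * g

print("PL inequality held at every step:", pl_holds)
print("initial loss:", losses[0], "final loss:", losses[-1])
```

With these choices the loss should shrink geometrically, at a rate of at least (1 - mu/beta) per step, while the Jacobian keeps changing with w. This is the sense in which convergence here comes from conditioning rather than from the model becoming linear.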