===Nonconvex-nonconcave min-max opt===
The goal is to find a local saddle point.
;Stability
If we drift away from <math>(x^*,y^*)</math>, the optimization is unstable.
If we remain close, the optimization is stable even if we never converge.
;Asymptotic Stability
If the dynamics start close enough to <math>\theta^*</math>, they remain close (stability).
If, in addition, the dynamics converge to <math>\theta^*</math>, then <math>\theta^*</math> is locally asymptotically stable.
Recall the update <math>\theta_{t+1} = F(\theta_t) = \theta_t + \eta \overrightarrow{g}(\theta_t)</math>, where <math>\overrightarrow{g}(\theta) = (-\nabla_x f, \nabla_y f)</math> is the gradient descent-ascent vector field.
The Jacobian of <math>F</math> is <math>J(\theta) = I + \eta H(\theta)</math>,
where the Hessian (the Jacobian of <math>\overrightarrow{g}</math>) is
<math>H(\theta) =
\begin{pmatrix}
-\nabla_{xx}^2 f & -\nabla_{xy}^2 f\\
\nabla_{yx}^2 f & \nabla_{yy}^2 f
\end{pmatrix}
</math>
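As an illustrative sketch (not part of the original notes), the NumPy snippet below implements the simultaneous gradient descent-ascent update <math>F</math> and the matrix <math>H</math> for a toy quadratic game; the objective, its coefficients, the step size, and the starting point are arbitrary choices made for the example.
<syntaxhighlight lang="python">
import numpy as np

# Toy quadratic game f(x, y) = x^2/2 + x*y - y^2/2 (minimize over x, maximize over y).
# The objective and step size are illustrative choices, not from the notes.
eta = 0.1

def g(theta):
    """Vector field g(theta) = (-grad_x f, +grad_y f): descent in x, ascent in y."""
    x, y = theta
    grad_x = x + y   # df/dx
    grad_y = x - y   # df/dy
    return np.array([-grad_x, grad_y])

def H(theta):
    """Jacobian of g, i.e. the block matrix [[-f_xx, -f_xy], [f_yx, f_yy]] (constant here)."""
    return np.array([[-1.0, -1.0],
                     [ 1.0, -1.0]])

def F(theta):
    """One step of simultaneous gradient descent-ascent: theta + eta * g(theta)."""
    return theta + eta * g(theta)

theta = np.array([1.0, 1.0])
for _ in range(200):
    theta = F(theta)

print("iterate after 200 steps:", theta)               # approaches the saddle (0, 0)
print("J = I + eta * H:\n", np.eye(2) + eta * H(theta))
</syntaxhighlight>
For this particular game the spectral radius of <math>J</math> is below 1, so the iterates converge to the saddle at the origin.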
;Linear stability
A fixed point <math>\theta^*</math> is linearly stable if <math>| \lambda_{\max}(J(\theta^*)) | = \rho(J(\theta^*)) \leq 1</math>, i.e. the spectral radius of the Jacobian is at most 1.
Lemma: If the inequality is strict, i.e. <math>\rho(J(\theta^*)) < 1</math>, then <math>\theta^*</math> is locally asymptotically stable.
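To make the criterion concrete, here is a hedged numerical check (assuming NumPy; both example games are illustrative choices) of <math>\rho(J(\theta^*))</math> at the saddle <math>\theta^* = (0,0)</math>: the bilinear game <math>f(x,y) = xy</math> versus the quadratic game from the snippet above.
<syntaxhighlight lang="python">
import numpy as np

def spectral_radius(J):
    """rho(J): largest eigenvalue modulus; the linear stability criterion compares it to 1."""
    return np.max(np.abs(np.linalg.eigvals(J)))

eta = 0.1

# H = [[-f_xx, -f_xy], [f_yx, f_yy]] evaluated at the saddle theta* = (0, 0).
# Both example games are illustrative choices, not from the notes.
H_bilinear  = np.array([[ 0.0, -1.0],   # f(x, y) = x*y
                        [ 1.0,  0.0]])
H_quadratic = np.array([[-1.0, -1.0],   # f(x, y) = x^2/2 + x*y - y^2/2
                        [ 1.0, -1.0]])

for name, H in [("bilinear f = xy", H_bilinear), ("quadratic game", H_quadratic)]:
    J = np.eye(2) + eta * H              # Jacobian of the update map F at theta*
    rho = spectral_radius(J)
    verdict = "locally asymptotically stable" if rho < 1 else "not asymptotically stable"
    print(f"{name}: rho(J) = {rho:.4f} -> {verdict}")
</syntaxhighlight>
For the bilinear game <math>\rho(J) = \sqrt{1+\eta^2} > 1</math> for any <math>\eta > 0</math>, so the iterates spiral away from the saddle, while the quadratic game satisfies <math>\rho(J) < 1</math> and is locally asymptotically stable, consistent with the lemma.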
==Misc==