
Lemma: If <math>\theta^*</math> is linearly stable and moreover <math>\rho(J(\theta^*)) < 1</math>, then <math>\theta^*</math> is asymptotically stable.
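A minimal numerical sketch of this lemma (the matrix <math>J</math> below is an assumed toy Jacobian, not from the notes): for a linear update <math>\theta_{t+1} = J \theta_t</math> with fixed point <math>\theta^* = 0</math> and spectral radius below one, iterates started nearby contract toward the fixed point.
<syntaxhighlight lang="python">
import numpy as np

# Assumed toy Jacobian J of a linearized update theta_{t+1} = J @ theta_t
# with fixed point theta* = 0; its eigenvalues are 0.6 +/- 0.1i, so rho(J) < 1.
J = np.array([[0.5, 0.2],
              [-0.1, 0.7]])

rho = max(abs(np.linalg.eigvals(J)))
print("rho(J) =", rho)                           # about 0.61 < 1

theta = np.array([1.0, -1.0])                    # start away from theta*
for _ in range(50):
    theta = J @ theta                            # linearized update
print("||theta_50|| =", np.linalg.norm(theta))   # ~0, consistent with asymptotic stability
</syntaxhighlight>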


====Strongly local min-max====
Definition: a stationary point <math>\theta^* = (x^*, y^*)</math> is a strongly local min-max if, at <math>\theta^*</math>,
<math>
\begin{cases}
\lambda_{min}(\nabla^2_{xx} f) > 0 \\
\lambda_{max}(\nabla^2_{yy} f) < 0
\end{cases}
</math>
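As an illustration (the objective below is an assumed toy example, not from the notes), <math>f(x,y) = x^2 + 2xy - y^2</math> has a strongly local min-max at the origin; the sketch checks both eigenvalue conditions numerically.
<syntaxhighlight lang="python">
import numpy as np

# Assumed toy objective f(x, y) = x^2 + 2*x*y - y^2; (0, 0) is a stationary point.
# Hessian blocks at the origin (constant, since f is quadratic):
H_xx = np.array([[2.0]])    # d^2 f / dx^2
H_yy = np.array([[-2.0]])   # d^2 f / dy^2

# Strongly local min-max conditions:
cond_x = np.linalg.eigvalsh(H_xx).min() > 0   # lambda_min(H_xx) > 0
cond_y = np.linalg.eigvalsh(H_yy).max() < 0   # lambda_max(H_yy) < 0
print(cond_x and cond_y)                      # True
</syntaxhighlight>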
Simultaneous GDA: the Jacobian of the underlying vector field <math>(-\nabla_x f, \nabla_y f)</math> is
<math>H =
\begin{pmatrix}
- \nabla_{xx}^2 f & -\nabla_{xy}^2 f\\
(\nabla_{xy}^2 f)^\top & \nabla_{yy}^2 f\\
\end{pmatrix}
</math>
Consider <math>\theta^*</math> a strongly local min-max. Then both diagonal blocks (<math>-\nabla^2_{xx} f</math> and <math>\nabla^2_{yy} f</math>) are negative definite.
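A short sketch (continuing the assumed toy <math>f(x,y) = x^2 + 2xy - y^2</math>) that assembles <math>H</math> from the Hessian blocks and confirms both diagonal blocks are negative definite at the min-max point.
<syntaxhighlight lang="python">
import numpy as np

# Hessian blocks of the assumed toy f(x, y) = x^2 + 2*x*y - y^2 at (0, 0).
H_xx = np.array([[2.0]])   # positive definite
H_xy = np.array([[2.0]])   # cross block d^2 f / dx dy
H_yy = np.array([[-2.0]])  # negative definite

# Jacobian of the simultaneous GDA vector field (-grad_x f, +grad_y f).
H = np.block([[-H_xx,   -H_xy],
              [H_xy.T,   H_yy]])
print(H)

# Both diagonal blocks are negative definite at the strongly local min-max.
print(np.linalg.eigvalsh(-H_xx).max() < 0)  # True
print(np.linalg.eigvalsh(H_yy).max() < 0)   # True
</syntaxhighlight>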
Lemma:
Every eigenvalue of <math>H</math> has negative real part: <math>Re(\lambda(H)) < 0</math>.
Why?
Write <math>H</math> in block form with <math>A = -\nabla^2_{xx} f</math>, <math>B = -\nabla^2_{xy} f</math>, <math>C = \nabla^2_{yy} f</math>, and let <math>(v, u)</math> be an eigenvector of <math>H</math> with eigenvalue <math>\lambda</math>:
<math>
\begin{pmatrix}
A & B\\
-B^T & C
\end{pmatrix}
\begin{pmatrix}
v \\ u
\end{pmatrix}
=
\lambda
\begin{pmatrix}
v \\ u
\end{pmatrix}
</math>
Multiplying the first block row by <math>v^H</math>, the second by <math>u^H</math>, and adding gives:
<math>
\begin{aligned}
&(v^H A v + u^H C u) + (v^H B u - u^H B^T v) = \lambda (\Vert v \Vert^2 + \Vert u \Vert^2)\\
\implies &Re(v^H A v + u^H C u) = Re(\lambda)(\Vert v \Vert^2 + \Vert u \Vert^2) < 0\\
\implies &Re(\lambda) < 0
\end{aligned}
</math>
The cross term <math>v^H B u - u^H B^T v</math> is purely imaginary, so it vanishes when taking real parts, while <math>v^H A v + u^H C u < 0</math> because <math>A</math> and <math>C</math> are negative definite.
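Continuing the assumed toy example, the lemma can be verified numerically: every eigenvalue of <math>H</math> has negative real part, so the continuous-time simultaneous GDA dynamics are locally attracted to the strongly local min-max.
<syntaxhighlight lang="python">
import numpy as np

# GDA Jacobian H of the assumed toy example (see above).
H = np.array([[-2.0, -2.0],
              [ 2.0, -2.0]])

eigvals = np.linalg.eigvals(H)
print(eigvals)                   # -2 + 2j and -2 - 2j
print(np.all(eigvals.real < 0))  # True: Re(lambda(H)) < 0
</syntaxhighlight>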