Instead of convexity, we use the PL condition (Polyak-Łojasiewicz, 1963):
For <math>w \in B</math>, <math>\frac{1}{2}\Vert \nabla L(w) \Vert^2 \geq \mu L(w)</math>, which implies exponential (linear) convergence of GD.
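A short sketch of why the PL condition gives a linear rate, under assumptions not stated above: <math>L \geq 0</math> is <math>\beta</math>-smooth on <math>B</math> (the constant <math>\beta</math> is introduced here only for illustration), the step size is <math>\eta = 1/\beta</math>, and the GD iterates <math>w_{t+1} = w_t - \eta \nabla L(w_t)</math> stay in <math>B</math>. Smoothness gives the usual descent inequality, and the PL condition bounds the gradient term from below:

<math>L(w_{t+1}) \leq L(w_t) - \frac{1}{2\beta}\Vert \nabla L(w_t) \Vert^2 \leq L(w_t) - \frac{\mu}{\beta} L(w_t) = \left(1 - \frac{\mu}{\beta}\right) L(w_t)</math>

Applying this bound <math>t</math> times yields

<math>L(w_t) \leq \left(1 - \frac{\mu}{\beta}\right)^t L(w_0)</math>

so under these assumptions the loss decays geometrically in <math>t</math>, with no convexity required.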
===Tangent Kernels===