* Sharpness-based bounds (PAC-Bayesian)
* Norm-based bounds
==Neural Tangent Kernels (NTKs)==
Beginning of Lecture 7 (Sept. 22, 2020)
===Linear Regression===
Assume we have a dataset:
<math>\{(x_i, y_i)\}_{i=1}^{n}</math>
<math>y_i \in \mathbb{R}</math>
<math>x_i \in \mathbb{R}^d</math>
The model is linear in both the weights and the input: <math>f(w, x) = w^\top x</math>
The loss is the squared error: <math>L(w) = \frac{1}{2} \sum_{i=1}^{n}(y_i - f(w, x_i))^2</math>
We want to solve <math>\min_{w} L(w)</math>
GD: <math>w(t+1) = w(t) - \eta_{t} \nabla L(w(t))</math> where the gradient is:
<math>\nabla L(w(t)) = -\sum_{i=1}^{n}(y_i - f(w(t), x_i)) \nabla_{w} f(w(t), x_i) = -\sum_{i=1}^{n}(y_i - f(w(t), x_i)) x_i</math>
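The update rule above can be sketched in a few lines of NumPy. This is only an illustration, not code from the lecture: the synthetic noiseless data, the constant step size <code>eta</code>, the step count, and the function names <code>gradient</code> and <code>gradient_descent</code> are all assumptions made for the example.

```python
import numpy as np

def gradient(w, X, y):
    """Gradient of L(w) = 1/2 * sum_i (y_i - w^T x_i)^2 with respect to w."""
    residual = y - X @ w          # entries y_i - f(w, x_i)
    return -X.T @ residual        # -sum_i (y_i - f(w, x_i)) x_i

def gradient_descent(X, y, eta=0.01, steps=2000):
    """Run w(t+1) = w(t) - eta * grad L(w(t)) from w(0) = 0 (constant step size assumed)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w = w - eta * gradient(w, X, y)
    return w

# Hypothetical synthetic dataset: n = 50 points in d = 3 dimensions, no noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true

w_hat = gradient_descent(X, y)
# Reference solution from NumPy's least-squares solver, for comparison.
w_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]
print(w_hat, w_lstsq)
```

Because the loss is convex in <math>w</math>, gradient descent with a small enough step size should reach the same minimizer as the direct least-squares solve.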
===Kernel Method===
Map each input to a higher-dimensional feature vector: <math>x_i \in \mathbb{R}^d \to \phi(x_i) \in \mathbb{R}^{D}</math> with <math>D \gg d</math>
Suppose <math>d=3</math> and <math>x =
\begin{bmatrix}
x_1\\x_2\\x_3
\end{bmatrix}
\to
\phi(x)=
\begin{bmatrix}
x_1\\
x_2\\
x_3\\
x_1 x_2\\
x_1 x_3\\
x_2 x_3
\end{bmatrix}
</math>
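Is the model <math>f(w, x) = w^\top \phi(x)</math> linear in <math>w</math>? Yes!
Is this model linear in <math>x</math>? No! The feature map <math>\phi</math> contains products of the input coordinates.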
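Both answers can be checked numerically. The sketch below assumes the model is <math>f(w, x) = w^\top \phi(x)</math> with the six-dimensional feature map from the <math>d=3</math> example; the specific vectors and scalars are arbitrary values chosen for illustration.

```python
import numpy as np

def phi(x):
    """Feature map from the d = 3 example: the coordinates plus pairwise products."""
    x1, x2, x3 = x
    return np.array([x1, x2, x3, x1 * x2, x1 * x3, x2 * x3])

def f(w, x):
    """Kernel-method model: a linear function of the features phi(x)."""
    return w @ phi(x)

# Arbitrary test values (assumptions for this example only).
w1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
w2 = np.ones(6)
x = np.array([1.0, 2.0, 3.0])
z = np.array([0.0, 1.0, -1.0])
a, b = 2.0, -3.0

# Linearity in w: f(a*w1 + b*w2, x) should equal a*f(w1, x) + b*f(w2, x).
linear_in_w = np.isclose(f(a * w1 + b * w2, x), a * f(w1, x) + b * f(w2, x))

# Linearity in x: f(w1, a*x + b*z) would have to equal a*f(w1, x) + b*f(w1, z).
linear_in_x = np.isclose(f(w1, a * x + b * z), a * f(w1, x) + b * f(w1, z))

print(linear_in_w, linear_in_x)
```

The first check holds and the second fails, because the cross terms such as <math>x_1 x_2</math> do not scale or add linearly in <math>x</math>.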
==Misc==