


In other cases we can use ''prescient'' feature selection: features are included in <math>T</math> in decreasing order of <math>\beta_j^2 = \frac{1}{j^2}</math>. This again yields double-descent-like behavior in the test risk.
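A minimal simulation sketch of this setup (not part of the original notes), assuming an isotropic Gaussian design, coefficients <math>\beta_j = 1/j</math>, and a minimum-norm least-squares fit on the first <math>T</math> features; the specific dimensions and noise level are arbitrary choices:

<syntaxhighlight lang="python">
# Misspecified linear regression: the learner only sees the first T features,
# chosen prescientially in decreasing order of beta_j^2 = 1/j^2. The test risk
# as a function of T typically shows a double-descent-like shape around T = n.
import numpy as np

rng = np.random.default_rng(0)
D, n, sigma = 200, 40, 0.1             # total features, training samples, noise level
beta = 1.0 / np.arange(1, D + 1)       # beta_j = 1/j, so beta_j^2 = 1/j^2

X = rng.standard_normal((n, D))
y = X @ beta + sigma * rng.standard_normal(n)
X_test = rng.standard_normal((10_000, D))
y_test = X_test @ beta                 # noiseless targets for the test risk

for T in [5, 10, 20, 30, 40, 50, 80, 120, 200]:
    # Prescient selection: keep the T features with the largest beta_j^2,
    # i.e. the first T columns, and fit the minimum-norm least-squares solution.
    beta_hat = np.linalg.pinv(X[:, :T]) @ y
    risk = np.mean((X_test[:, :T] @ beta_hat - y_test) ** 2)
    print(f"T = {T:3d}  test risk = {risk:.3f}")
</syntaxhighlight>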
===Related Works===
Jiang ''et al.''<ref name="jiang2019generalization"></ref> provide empirical evaluations of several generalization bounds (a rough norm-based sketch follows the list below), including:
* Sharpness-based bounds (PAC-Bayesian)
* Norm-based bounds
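
As a rough illustration only (these are simplified proxies, not the exact measures evaluated by Jiang ''et al.''), the following sketch computes two norm-based complexity quantities for a small PyTorch MLP; the architecture here is an arbitrary example:

<syntaxhighlight lang="python">
# Two simple norm-based complexity proxies: the product of per-layer Frobenius
# norms and the product of per-layer spectral norms (largest singular values).
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 10),
)

frob_product = 1.0
spec_product = 1.0
for module in model.modules():
    if isinstance(module, nn.Linear):
        W = module.weight.detach()
        frob_product *= torch.linalg.norm(W, ord="fro").item()
        spec_product *= torch.linalg.matrix_norm(W, ord=2).item()

print(f"product of Frobenius norms: {frob_product:.2f}")
print(f"product of spectral norms:  {spec_product:.2f}")
</syntaxhighlight>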


==Misc==