Deep Learning
In other cases, we use ''prescient'' feature selection: we include features in <math>T</math> in decreasing order of <math>\beta_j^2 = \frac{1}{j^2}</math>. From this we get behavior like double descent.
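This double-descent behavior can be checked with a small simulation. The sketch below is illustrative rather than the exact setup from the notes: it assumes Gaussian features, label noise with standard deviation 0.5, and a minimum-norm least-squares fit via the pseudoinverse, and the sample sizes and the <code>risk</code> helper are hypothetical choices. It fits only the first <math>p</math> features (those with the largest <math>\beta_j^2</math>, i.e. prescient selection) and reports the test error as <math>p</math> passes the number of training samples.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

D = 1000          # total number of available features (assumed)
n = 50            # number of training samples (assumed)
sigma = 0.5       # label noise standard deviation (assumed)
beta = 1.0 / np.arange(1, D + 1)   # true coefficients: beta_j = 1/j, so beta_j^2 = 1/j^2

def risk(p, n_test=2000):
    """Fit min-norm least squares on the first p (prescient) features and return test MSE."""
    X = rng.standard_normal((n, D))
    y = X @ beta + sigma * rng.standard_normal(n)
    # Prescient selection: features in decreasing order of beta_j^2 are simply the first p columns.
    w = np.linalg.pinv(X[:, :p]) @ y          # minimum-norm solution when p > n
    X_test = rng.standard_normal((n_test, D))
    y_test = X_test @ beta + sigma * rng.standard_normal(n_test)
    return np.mean((X_test[:, :p] @ w - y_test) ** 2)

for p in [5, 20, 45, 50, 55, 100, 400, 1000]:
    print(f"p = {p:4d}  test MSE = {np.mean([risk(p) for _ in range(20)]):.3f}")
# Test error rises sharply as p approaches n (the interpolation threshold)
# and then decreases again for p >> n, i.e. double-descent-like behavior.
</syntaxhighlight>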
===Related Works===
Jiang ''et al.''<ref name="jiang2019generalization"></ref> provide empirical evaluations of different generalization bounds such as:
* Sharpness-based bounds (PAC-Bayesian)
* Norm-based bounds
==Misc==