For SGD, it is easier to find ''simple'' solutions (e.g. functions with small norms). This leads to better generalization.
===Can we analyze the double descent curve for some simple distributions or models?===
Setup:
Our features are <math>x = (x_1, \ldots, x_d)</math>, where each <math>x_i</math> is drawn from a standard normal distribution.