This shows that Rademacher complexity and VC-dimension bounds are not useful for explaining the generalization behavior of neural networks.
===Universal Approximation Theorem===
There exists a two-layer neural network with ReLU activations and <math>2n+d</math> parameters that can represent any function on a sample of size <math>n</math> in <math>d</math> dimensions.