Deep Learning: Difference between revisions

Example (linear predictors with bounded norm): <math>H = \{ h(x) = w^\top x \mid \Vert w \Vert_2 \leq 1\}</math>
<math>R(H \circ S) \leq \frac{\max_i \Vert x^{(i)} \Vert_2}{\sqrt{n}}</math>
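The bound for the linear class can be checked numerically. The sketch below is illustrative only and assumes the definition <math>R(H \circ S) = \frac{1}{n} E_\sigma \sup_{h \in H} \sum_i \sigma_i h(x^{(i)})</math> with uniform random signs <math>\sigma_i \in \{-1, +1\}</math>. For this class the supremum has a closed form by Cauchy-Schwarz, <math>\sup_{\Vert w \Vert_2 \leq 1} w^\top v = \Vert v \Vert_2</math>, so a Monte Carlo average over sign draws estimates the complexity. The function names, the synthetic Gaussian sample, and the number of draws are assumptions made for the example, not part of the notes.

```python
import math
import random


def empirical_rademacher_linear(xs, num_draws=500, seed=0):
    """Monte Carlo estimate of R(H o S) for H = {x -> w.x : ||w||_2 <= 1}.

    Assumes R(H o S) = (1/n) E_sigma sup_h sum_i sigma_i h(x_i).
    """
    rng = random.Random(seed)
    n, d = len(xs), len(xs[0])
    total = 0.0
    for _ in range(num_draws):
        # Draw Rademacher signs sigma_i uniformly from {-1, +1}.
        sigma = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        # v = sum_i sigma_i x_i; the supremum over the unit ball equals ||v||_2.
        v = [sum(s * x[j] for s, x in zip(sigma, xs)) for j in range(d)]
        total += math.sqrt(sum(c * c for c in v)) / n
    return total / num_draws


def linear_bound(xs):
    """Right-hand side of the bound: max_i ||x_i||_2 / sqrt(n)."""
    n = len(xs)
    return max(math.sqrt(sum(c * c for c in x)) for x in xs) / math.sqrt(n)


# Hypothetical sample: n = 200 points in 5 dimensions with Gaussian entries.
data_rng = random.Random(1)
S = [[data_rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(200)]

estimate = empirical_rademacher_linear(S)
bound = linear_bound(S)
print(f"Monte Carlo estimate: {estimate:.4f}, bound: {bound:.4f}")
```

The estimate should come out below the bound. The bound uses the largest norm in the sample, so it is loose whenever most points have a smaller norm than the maximum.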
Question: What is the Rademacher complexity of a deep model?
<math>H = \{ h(x) \mid h \text{ is a NN with some structure}\}</math> 
If <math>R(H \circ S)</math> is small, then the theorem guarantees a small gap between training error and test error, i.e. good generalization performance.


==Misc==