Machine Learning
Machine Learning Interesting
Hyperparameters
Batch Size
A medium post empirically evaluating the effect of batch_size
Learning Rate
Learning Theory
PAC Learning
Probably Approximately Correct (PAC)
A hypothesis class \(\displaystyle H\) is PAC learnable if given \(\displaystyle 0 \lt \epsilon, \delta \lt 1\), there is some function \(\displaystyle m(\epsilon, \delta)\) polynomial in \(\displaystyle 1/\epsilon, 1/\delta\) such that if we have a sample size \(\displaystyle \geq m(\epsilon, \delta)\) then with probability \(\displaystyle 1-\delta\) the hypothesis we will learn will have an average error \(\displaystyle \leq \epsilon\).
Uniform Convergence
If for all hypothesis \(\displaystyle h\), \(\displaystyle |L_S(h)-L_D(h)| \leq \epsilon\), then the training set \(\displaystyle S\) is called \(\displaystyle \epsilon\)-representative.
Then
\(\displaystyle
L_D(h_s)
\leq L_S(h_S) + \epsilon / 2
\leq L_S(h_D) + \epsilon / 2
\leq L_D(h_D) + \epsilon
\).
A hypothesis class \(\displaystyle H\) has uniform convergence if there exists \(\displaystyle m^{UC}(\epsilon, \delta)\) such that for every \(\displaystyle \epsilon, \delta\), if we draw a sample \(\displaystyle S\) then with probability \(\displaystyle 1-\delta\), \(\displaystyle S\) is \(\displaystyle \epsilon\)-representative.