Machine Learning

Machine Learning Interesting

Hyperparameters

Batch Size

A medium post empirically evaluating the effect of batch_size

Learning Rate

Learning Theory

PAC Learning

Probably Approximately Correct (PAC)
A hypothesis class \(\displaystyle H\) is PAC learnable if given \(\displaystyle 0 \lt \epsilon, \delta \lt 1\), there is some function \(\displaystyle m(\epsilon, \delta)\) polynomial in \(\displaystyle 1/\epsilon, 1/\delta\) such that if we have a sample size \(\displaystyle \geq m(\epsilon, \delta)\) then with probability \(\displaystyle 1-\delta\) the hypothesis we will learn will have an average error \(\displaystyle \leq \epsilon\).

Uniform Convergence

If for all hypothesis \(\displaystyle h\), \(\displaystyle |L_S(h)-L_D(h)| \leq \epsilon\), then the training set \(\displaystyle S\) is called \(\displaystyle \epsilon\)-representative.
Then \(\displaystyle L_D(h_s) \leq L_S(h_S) + \epsilon / 2 \leq L_S(h_D) + \epsilon / 2 \leq L_D(h_D) + \epsilon \).
A hypothesis class \(\displaystyle H\) has uniform convergence if there exists \(\displaystyle m^{UC}(\epsilon, \delta)\) such that for every \(\displaystyle \epsilon, \delta\), if we draw a sample \(\displaystyle S\) then with probability \(\displaystyle 1-\delta\), \(\displaystyle S\) is \(\displaystyle \epsilon\)-representative.