[Ben-David et al.] consider binary classification.
;H-divergence
<math>d_{H}(Q_X, P_X) = 2 \sup_{h \in H}|\Pr_{x \sim Q_{X}}(h(x)=1) - \Pr_{x \sim P_{X}}(h(x)=1)|</math>


;Lemma
# Train a classifier to classify source and target.  
# If the loss is small, then the divergence is large: <math>\frac{1}{2} d_{H}(Q_X^{(m)}, P_X^{(m)}) = 1-loss_{classifier}</math> (see the sketch below).
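A minimal sketch of this estimate in Python. The helper name <code>estimate_h_divergence</code> and the choice of logistic regression as the domain classifier are illustrative assumptions (any binary classifier can play the role of <math>H</math>); a held-out split is added so the loss is measured on unseen points. It follows the empirical formula above, <math>d_{H} = 2(1-loss_{classifier})</math>.
<syntaxhighlight lang="python">
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def estimate_h_divergence(source_x, target_x):
    """Empirical H-divergence following the formula above: d_H = 2 * (1 - loss)."""
    # Label source points 0 and target points 1, then train a domain classifier.
    X = np.vstack([source_x, target_x])
    y = np.concatenate([np.zeros(len(source_x)), np.ones(len(target_x))])
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5,
                                              stratify=y, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    loss = 1.0 - clf.score(X_te, y_te)  # 0/1 loss on held-out points
    # Small loss (domains are easy to tell apart) -> large divergence.
    return 2.0 * (1.0 - loss)

# Example: two Gaussians with shifted means are easy to separate,
# so the estimate is close to the maximum value of 2.
rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(500, 2))
tgt = rng.normal(3.0, 1.0, size=(500, 2))
print(estimate_h_divergence(src, tgt))
</syntaxhighlight>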
===Recap===
Beginning of Lecture 18 (Oct. 29, 2020)
Given labeled examples from the source domain: <math>Q_{X,Y} = \{(x_i^S, y_i^S)\}_{i=1}^{m_s}</math>. 
Target domain: <math>P_{X} = \{x_i^t\}_{i=1}^{m_t}</math>. 
Learn a function <math>h \in H</math> with small target error 
<math>\epsilon_T(h) = E_{(x,y) \sim P_{X,Y}}[ l(h(x), y) ]</math>.
H-divergence: 
<math>d_H(Q_X, P_X) = 2\sup_{h \in H} | P_{Q}(h(x)=1) - P_{P}(h(x)=1)| = 2(1-loss_{classifier})</math>. 
This can be estimated by training a classifier to distinguish between source and target samples.
Def: 
For the hypothesis class H, the ''symmetric difference hypothesis space'' <math>H \triangle H</math> is the set of disagreements between any two hypotheses in H: 
<math>H \triangle H = \{g : g(x) = h(x) \oplus h'(x) \mid h, h' \in H\}</math>.
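For example, if <math>H</math> is the class of threshold classifiers on <math>\mathbb{R}</math>, <math>h_a(x) = \mathbf{1}[x > a]</math>, then <math>h_a(x) \oplus h_b(x) = 1</math> exactly when <math>x</math> falls between <math>a</math> and <math>b</math>, so <math>H \triangle H</math> is the set of indicators of intervals.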
;Main Result
<math>H</math> is a hypothesis class with <math>VC(H)=d</math>. 
With probability <math>1-\delta</math>, <math>\forall h \in H</math>: 
<math>\epsilon_{T}(h) \leq \epsilon_{S}(h) + \frac{1}{2} d_{H \triangle H}(Q_X^{(m)}, P_X^{(m)}) + \epsilon_{joint}</math>. 
That is, the target error is at most the source error plus the (empirical) <math>H \triangle H</math>-divergence between the two domains plus the joint error term <math>\epsilon_{joint}</math>.
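For instance, plugging numbers into the bound: if <math>\epsilon_S(h) = 0.1</math>, the estimated divergence term is <math>\frac{1}{2} d_{H \triangle H}(Q_X^{(m)}, P_X^{(m)}) = 0.1</math>, and <math>\epsilon_{joint} = 0.05</math>, then <math>\epsilon_T(h) \leq 0.1 + 0.1 + 0.05 = 0.25</math>.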
===Practical Domain Adaptation Methods===


==Misc==