In training:
For the source domain, we have labeled samples <math>\{(x_i^S, y_i^S)\}_{i=1}^{m_S} \sim Q_{X,Y}</math>.
For the target domain, we only have unlabeled samples <math>\{x_i^t\}_{i=1}^{m_t} \sim P_{X}</math>.
This is ''unsupervised'' domain adaptation.
In ''semi-supervised'' domain adaptation, the target samples are mostly unlabeled but include a few labeled ones.
If no target samples are available during training, the scenario is called ''domain generalization'' or ''out-of-distribution'' generalization.
===Unsupervised domain adaptation===
Assume <math>m_S = m_t = m</math>:
* <math>m</math> labeled samples from the source domain <math>Q</math>
* <math>m</math> unlabeled samples from the target domain <math>P</math>
* Without further assumptions, this problem is ill-defined.
;Practical assumptions.
# Covariate shift: <math>P</math> and <math>Q</math> satisfy the covariate shift assumption if the conditional label distribution does not change between source and target.
#* I.e. <math>P(y|x) = Q(y|x)</math>
# Similarity of the source and target marginal distributions.
# If we had labeled target samples, the joint error (of a single hypothesis on both source and target samples) would be small.
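The covariate shift assumption can be illustrated with a minimal numpy sketch. The labeling rule and the two marginal distributions below are made up for illustration: both domains share the same <math>P(y|x)</math>, but because their marginals over <math>x</math> differ, the overall label frequencies differ too.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared labeling rule: the conditional P(y|x) is identical in both domains
# (this is the covariate-shift assumption; the rule is invented for this demo).
def p_y_given_x(x):
    return 1 / (1 + np.exp(-3 * x))  # probability that y = 1

# Marginals differ: source and target draw x from different distributions.
x_source = rng.normal(loc=-1.0, scale=1.0, size=5000)
x_target = rng.normal(loc=+1.0, scale=1.0, size=5000)

y_source = rng.binomial(1, p_y_given_x(x_source))
y_target = rng.binomial(1, p_y_given_x(x_target))

# Label frequencies differ even though P(y|x) is the same, because the same
# conditional is averaged over different marginals of x.
print(y_source.mean(), y_target.mean())
```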
[Ben-David et al.] consider binary classification.
;H-divergence
<math>d_{H}(Q_X, P_X) = 2 \sup_{h \in H} \left| \Pr_{x \sim Q_{X}}[h(x)=1] - \Pr_{x \sim P_{X}}[h(x)=1] \right|</math>
;Lemma
<math>d_{H}(Q_X, P_X)</math> can be estimated from <math>m</math> samples each from the source and target domains.
If <math>VC(H)=d</math>, then with probability at least <math>1-\delta</math>:
<math>d_{H}(Q_X, P_X) \leq d_{H}(Q_{X}^{(m)}, P_{X}^{(m)}) + 4 \sqrt{\frac{d \log(2m) + \log(2/\delta)}{m}}</math>
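To get a feel for the bound, we can evaluate its complexity term numerically. The values of <math>d</math>, <math>m</math>, and <math>\delta</math> below are illustrative, not from the text; the point is that the gap between the true and empirical divergence shrinks as <math>m</math> grows.

```python
import math

def complexity_term(d, m, delta):
    """Second term of the lemma's bound:
    4 * sqrt((d * log(2m) + log(2/delta)) / m),
    where d is the VC dimension of H, m the samples per domain,
    and delta the failure probability."""
    return 4 * math.sqrt((d * math.log(2 * m) + math.log(2 / delta)) / m)

# The estimation error decreases as the sample size m increases.
for m in (100, 1_000, 10_000):
    print(m, round(complexity_term(d=10, m=m, delta=0.05), 3))
```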
# Label source examples as 1 and target samples as 0.
# Train a classifier to distinguish source from target samples.
# If the loss is small, then the divergence is large: <math>\frac{1}{2} d_{H}(Q_X^{(m)}, P_X^{(m)}) = 1-\text{loss}_{\text{class}}</math>
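The three-step procedure above can be sketched in numpy. The data and the classifier (logistic regression fit by gradient descent, standing in for the hypothesis class <math>H</math>) are illustrative choices, not part of the original construction; the estimate plugs the domain classifier's 0/1 loss into the relation <math>d_H = 2(1 - \text{loss})</math> from step 3.

```python
import numpy as np

rng = np.random.default_rng(0)

def estimate_h_divergence(xs, xt, steps=500, lr=0.1):
    """Proxy estimate of the empirical H-divergence: label source samples 1
    and target samples 0, fit a linear domain classifier, and convert its
    0/1 training loss via d_H = 2 * (1 - loss)."""
    X = np.vstack([xs, xt])
    y = np.concatenate([np.ones(len(xs)), np.zeros(len(xt))])
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):  # plain gradient descent on the logistic loss
        p = 1 / (1 + np.exp(-(X @ w + b)))
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    loss = np.mean((X @ w + b > 0) != (y == 1))  # 0/1 classification loss
    return 2 * (1 - loss)

# Well-separated domains: the domain classifier succeeds, divergence is large.
far = estimate_h_divergence(rng.normal(-2, 1, (500, 2)), rng.normal(2, 1, (500, 2)))
# Identically distributed domains: near-chance loss, divergence is smaller.
near = estimate_h_divergence(rng.normal(0, 1, (500, 2)), rng.normal(0, 1, (500, 2)))
print(far, near)
```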
==Misc==