Visual Learning and Recognition
===Flow-based Models===
Flow-based models are trained by minimizing the negative log-likelihood of the data.
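This objective can be written out under some assumed notation that is not in the original notes: an invertible map ''f'' sends an image ''x'' to a latent code ''z = f(x)'', and ''p_Z'' is a simple base density such as a standard Gaussian. The change-of-variables formula then gives, as a sketch:

<math>
-\log p_X(x) = -\log p_Z\big(f(x)\big) - \log \left| \det \frac{\partial f(x)}{\partial x} \right|
</math>

Training minimizes the average of this quantity over the training set. The Jacobian determinant term accounts for how ''f'' stretches or compresses volume around ''x''.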
==Attribute-based Representation==
;Motivation
Typically in recognition, we only predict the class of the image.
From the category we can guess the attributes, but the category alone provides only limited information.
The network cannot make predictions for new, unseen classes.
This problem used to be called ''graceful degradation''.
;Goal
Learn an intermediate structure (attributes) alongside object categories.
;Should we care about attributes in deep learning?
;Why are attributes not simply supervised recognition?
;Benefits
* Dealing with inevitable failure.
* We can infer things about unseen categories.
* We can make comparisons between objects or categories.
;Datasets
* a-Pascal
* a-Yahoo
* CORE
* COCO Attributes
Deep networks should be able to learn attributes implicitly.
However, we cannot tell whether a network has actually learned them.
==Extra Topics==
===Fine-grained Recognition===
===Few-shot Recognition===
* Metric learning methods
* Meta-learning methods
* Data augmentation methods
* Semantics
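As a rough illustration of the metric learning family listed above, the sketch below classifies a query by its distance to per-class mean embeddings, in the spirit of nearest-prototype methods. The function names and the toy 2-D embeddings are hypothetical. In practice the embeddings would come from a trained feature extractor, which is not shown here.

```python
import math
from collections import defaultdict


def class_prototypes(support):
    """Average the embeddings of each class's few labelled support examples."""
    grouped = defaultdict(list)
    for embedding, label in support:
        grouped[label].append(embedding)
    return {
        label: [sum(dim) / len(vectors) for dim in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def classify(query, prototypes):
    """Return the label whose prototype is nearest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Toy 2-D embeddings standing in for the output of a feature extractor (assumed).
support = [
    ([0.9, 0.1], "cat"), ([1.1, -0.1], "cat"),
    ([-1.0, 0.2], "dog"), ([-0.8, 0.0], "dog"),
]
prototypes = class_prototypes(support)
prediction = classify([0.7, 0.3], prototypes)
print(prediction)
```

With only two support examples per class, all the learning effort goes into the embedding space. The classifier itself has no trainable parameters.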
===Zero-shot Recognition===
The goal is to train a classifier for classes for which we have not seen a single labeled example.
The information about these classes instead comes from an external knowledge source, e.g. a knowledge graph or word embeddings.
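One common way to use such side information, sketched here under assumptions, is to describe every class by a semantic vector and assign an image to the class whose vector is most similar to the image's predicted semantic vector. The class vectors, their attribute-style dimensions, and the function names below are all hypothetical. The model that maps an image into the semantic space would be trained on seen classes and is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def zero_shot_classify(image_semantic, class_embeddings):
    """Pick the class whose semantic vector best matches the image's vector."""
    return max(
        class_embeddings,
        key=lambda name: cosine(image_semantic, class_embeddings[name]),
    )


# Hypothetical semantic vectors: [striped, has_hooves, lives_in_water].
unseen_classes = {
    "zebra": [1.0, 1.0, 0.0],
    "whale": [0.0, 0.0, 1.0],
}
# Stand-in for the output of an image-to-semantic-space model (assumed).
image_semantic = [0.8, 0.9, 0.1]
prediction = zero_shot_classify(image_semantic, unseen_classes)
print(prediction)
```

No labeled image of a zebra or a whale is needed at any point. Only their semantic descriptions are used.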
===Beyond Labelled Datasets===
* Semi-supervised: We have both labelled and unlabelled training samples.
* Weakly-supervised: The labels are weak, noisy, and not necessarily for the task we want.
* Learning from the Web: Download training data from the internet.
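To make the semi-supervised setting above concrete, here is a minimal sketch of pseudo-labelling (self-training). This is one common approach and is not necessarily the one covered in the lecture. The 1-D nearest-centroid classifier, the <code>margin</code> confidence rule, and all names are assumptions chosen to keep the example small.

```python
def centroids(points, labels):
    """Mean of the 1-D points belonging to each label."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def pseudo_label(labeled_x, labeled_y, unlabeled_x, margin=1.0):
    """One round of self-training; assumes at least two classes are labelled."""
    cents = centroids(labeled_x, labeled_y)
    new_x, new_y = list(labeled_x), list(labeled_y)
    for x in unlabeled_x:
        ranked = sorted(cents, key=lambda y: abs(x - cents[y]))
        best, runner_up = ranked[0], ranked[1]
        # Keep only confident predictions: the nearest centroid must be
        # closer than the runner-up by at least `margin`.
        if abs(x - cents[runner_up]) - abs(x - cents[best]) >= margin:
            new_x.append(x)
            new_y.append(best)
    return new_x, new_y


labeled_x, labeled_y = [0.0, 10.0], ["a", "b"]
unlabeled_x = [1.0, 9.0, 5.2]
train_x, train_y = pseudo_label(labeled_x, labeled_y, unlabeled_x)
print(train_x, train_y)
```

The ambiguous point at 5.2 is left unlabelled, because adding uncertain pseudo-labels tends to reinforce the classifier's own mistakes.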
==Will be on the exam==
* Challenges
* What methods worked and didn't work.
==References==
<ref name="torralba2008tinyimages">Antonio Torralba, Rob Fergus and William T. Freeman (2008). 80 million tiny images: a large dataset for non-parametric object and scene recognition (PAMI 2008) [https://people.csail.mit.edu/torralba/publications/80millionImages.pdf Link]</ref>
<ref name="standing1973learning">Lionel Standing (1973). Learning 10000 pictures. ''Quarterly Journal of Experimental Psychology'' [https://www.tandfonline.com/doi/abs/10.1080/14640747308400340 Link]</ref>
<ref name="brady2008visual">Timothy F. Brady, Talia Konkle, George A. Alvarez, and Aude Oliva (2008). Visual long-term memory has a massive storage capacity for object details. [http://olivalab.mit.edu/MM/pdfs/BradyKonkleAlvarezOliva2008.pdf Link]</ref>
<ref name="torralba2011unbiased">Antonio Torralba, Alexei A. Efros (2011). Unbiased Look at Dataset Bias (CVPR 2011) [https://people.csail.mit.edu/torralba/publications/datasets_cvpr11.pdf Link]</ref>