Visual Learning and Recognition: Difference between revisions

893 bytes added , 15 October 2020

@@ Line 549: / Line 549: @@
 '''Exam Question'''
+Benefits:
+* Removes the separate SVM classifier so the network can be trained end-to-end.
+* Each region is cropped from a feature map of the image rather than the raw image.
+** One drawback is that the feature map has lower resolution than the image, so small objects may shrink to a single <math>1\times1</math> feature.
+** '''R-CNN is better for smaller objects'''
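The resolution drawback can be illustrated with a little arithmetic. The sketch below is a hypothetical helper, not part of any library. It assumes the backbone downsamples the image by a total stride of 16 (an illustrative value; the actual stride depends on the network) and counts how many feature-map cells an image-space box covers.

```python
import math

def roi_feature_size(box, stride=16):
    """Return (width, height) in feature-map cells covered by an image-space box.

    box is (x1, y1, x2, y2) in pixels. stride is the ASSUMED total
    downsampling factor of the backbone; 16 is only an illustrative value.
    """
    x1, y1, x2, y2 = box
    # Snap the box outward to the feature-map grid.
    fx1, fy1 = math.floor(x1 / stride), math.floor(y1 / stride)
    fx2, fy2 = math.ceil(x2 / stride), math.ceil(y2 / stride)
    # Every box covers at least one cell.
    return max(fx2 - fx1, 1), max(fy2 - fy1, 1)

small = roi_feature_size((32, 48, 46, 62))   # 14x14-pixel object
large = roi_feature_size((0, 0, 320, 160))   # 320x160-pixel object
print(small, large)
```

Under these assumptions the 14-pixel object collapses to a single feature cell, while the large object still spans many cells. Cropping from the raw image, as R-CNN does, avoids this loss for small objects.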
+===Faster R-CNN===
+Improves region proposal generation by replacing selective search with a region proposal network (RPN).
+Region proposals are computed on-the-fly by the network.
+Contains the following components:
+# Feature extractor
+# Region Proposal Network (RPN)
+# RoI Classification & Regression Network
+;How region proposal works
+# Given an image, pass through conv filters to get feature maps.
+# Assign <math>k</math> anchor boxes to each location ("pixel") of the feature map.
+# One layer then outputs a foreground/background classification and another outputs a bounding-box regression for each anchor at each location.
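The anchor step above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name, the stride of 16, and the choice of 3 scales and 3 aspect ratios (giving <math>k=9</math>) are assumptions made for the example. It also assumes two scores (foreground and background) and four regression offsets per anchor; some implementations use a single score instead.

```python
from itertools import product

def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (x1, y1, x2, y2) anchors in image coordinates.

    Each feature-map location gets k = len(scales) * len(ratios) anchors.
    stride, scales and ratios are ASSUMED values for illustration.
    """
    anchors = []
    for row, col in product(range(feat_h), range(feat_w)):
        # Centre of this feature-map cell in image coordinates.
        cx = (col + 0.5) * stride
        cy = (row + 0.5) * stride
        for scale, ratio in product(scales, ratios):
            # ratio = height / width, with the area kept at scale ** 2.
            w = scale / ratio ** 0.5
            h = scale * ratio ** 0.5
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors

anchors = make_anchors(feat_h=2, feat_w=3)
num_scores = 2 * len(anchors)    # foreground/background score per anchor
num_offsets = 4 * len(anchors)   # (dx, dy, dw, dh) per anchor
print(len(anchors), num_scores, num_offsets)
```

For a <math>2\times3</math> feature map this gives <math>2 \cdot 3 \cdot 9 = 54</math> anchors, so the classification layer would produce 108 scores and the regression layer 216 offsets.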
+Generally, two-stage models such as this one are more accurate than single-stage models. Everything is trained end-to-end.
 ==Will be on the exam==