Visual Learning and Recognition: Difference between revisions

1,259 bytes added, 27 October 2020

@@ Line 730: / Line 730: @@
 ;How to evaluate this?
 Object Keypoint Similarity (OKS) is a metric based on Euclidean distance, with a per-keypoint Gaussian error margin that depends on the scale of each object.
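The description above can be sketched in code. This is a minimal sketch, assuming the COCO-style definition OKS = mean over labeled keypoints of exp(-d_i^2 / (2 s^2 k_i^2)), where d_i is the Euclidean distance for keypoint i, s^2 is the object area, and k_i is a per-keypoint constant. The function name <code>oks</code> and its parameter names are illustrative, not taken from any library.

```python
import numpy as np

def oks(pred, gt, visible, area, k):
    """Object Keypoint Similarity between one predicted and one ground-truth pose.

    pred, gt : (N, 2) arrays of (x, y) keypoint coordinates
    visible  : (N,) boolean array, True where the ground-truth keypoint is labeled
    area     : object scale squared (s^2), e.g. the ground-truth object area
    k        : (N,) per-keypoint constants controlling the Gaussian falloff
    """
    pred, gt, k = (np.asarray(a, dtype=float) for a in (pred, gt, k))
    visible = np.asarray(visible, dtype=bool)
    if not visible.any():
        return 0.0                                   # nothing labeled to compare against
    d2 = np.sum((pred - gt) ** 2, axis=1)            # squared Euclidean distances
    sim = np.exp(-d2 / (2.0 * area * k ** 2))        # per-keypoint Gaussian similarity
    return float(sim[visible].mean())                # average over labeled keypoints only


# Hypothetical 3-keypoint pose: two keypoints exact, one off by (1, 1).
gt = np.array([[10.0, 10.0], [20.0, 10.0], [15.0, 30.0]])
pred = gt + np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 1.0]])
score = oks(pred, gt, visible=[True, True, True], area=400.0, k=[0.1, 0.1, 0.1])
```

A perfect prediction scores 1, and the score decays toward 0 as keypoints drift. Larger objects (larger <code>area</code>) tolerate larger pixel errors for the same score.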
+===Approaches for Pose Estimation===
+* Top-down approaches
+** Detect people first, then estimate the pose of each detected person.
+** Pros: Can reuse an existing person detector; the second stage processes a smaller image.
+** Cons: Runtime is proportional to the number of people.
+* Bottom-up approaches
+** Detect limbs first, then group them into people.
+** Pros: No dependency on a person detector. Can be single-stage. Runtime is independent of the number of people.
+** Cons: The optimization for part association is NP-hard.
+===Examples===
+;Pose Machines + CPM + OpenPose (COCO winner)
+* Pose Machines, ECCV 2014
+* Convolutional Pose Machines, CVPR 2016
+* Realtime Multi-Person 2D
+;G-RMI (COCO runner-up)
+===Pose Machines===
+Published at ECCV 2014. A bottom-up approach that addresses the following challenges:
+* Local evidence is weak.
+* Part context is a strong cue.
+* Larger composite parts can be easier to detect.
+===Convolutional Pose Machines===
+===OpenPose===
+For 30 people with 17 keypoints each, there are 30 × 17 = 510 keypoints and therefore 510 × 509 / 2 ≈ 1.3e5 possible pairwise connections.
+===G-RMI (Google)===
+# Compute Heatmaps G
+# Offset Output
+# Fuse heatmaps and offsets via Hough Voting
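The three steps above can be sketched for a single keypoint type. This is a simplified illustration, not the original implementation: it assumes the heatmap scores how likely each pixel is to lie near the keypoint, that the offsets are (dy, dx) vectors pointing from each pixel to the keypoint, and that votes are rounded to the nearest pixel. The name <code>hough_fuse</code> is hypothetical.

```python
import numpy as np

def hough_fuse(heatmap, offsets):
    """Fuse a keypoint heatmap with per-pixel offsets by Hough voting.

    heatmap : (H, W) array, score that each pixel lies near the keypoint
    offsets : (H, W, 2) array, predicted (dy, dx) from each pixel to the keypoint
    Returns an (H, W) accumulator; its argmax is the estimated keypoint position.
    """
    heatmap = np.asarray(heatmap, dtype=float)
    offsets = np.asarray(offsets, dtype=float)
    h, w = heatmap.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Each pixel votes for the location it points to (rounded to the nearest pixel).
    ty = np.rint(ys + offsets[..., 0]).astype(int)
    tx = np.rint(xs + offsets[..., 1]).astype(int)
    inside = (ty >= 0) & (ty < h) & (tx >= 0) & (tx < w)   # drop votes that leave the image
    acc = np.zeros((h, w))
    # np.add.at accumulates correctly when several pixels vote for the same cell.
    np.add.at(acc, (ty[inside], tx[inside]), heatmap[inside])
    return acc


# Synthetic example: a keypoint at row 5, column 7 of a 10 x 12 grid.
H, W, ky, kx = 10, 12, 5, 7
ys, xs = np.mgrid[0:H, 0:W]
heat = ((ys - ky) ** 2 + (xs - kx) ** 2 <= 9).astype(float)  # disk of radius 3
offs = np.stack([ky - ys, kx - xs], axis=-1)                  # exact offsets to the keypoint
acc = hough_fuse(heat, offs)
peak = np.unravel_index(acc.argmax(), acc.shape)
```

Because every pixel inside the disk votes for the same location, the accumulator has a single sharp peak even though the heatmap alone is a flat blob. That is the motivation for combining the two outputs.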
+===Stacked Hourglass Networks===
+COCO Runner-up
+==Videos==
+* Video as space-time volume
+* Object correspondence via tracking
+* Motion for parallax, occlusions
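To make the space-time volume idea concrete, here is a minimal sketch with a synthetic video. The frame size, frame count, and moving block are made-up illustration values.

```python
import numpy as np

# Hypothetical toy video: T grayscale frames of H x W pixels.
T, H, W = 8, 32, 48
frames = [np.zeros((H, W), dtype=np.uint8) for _ in range(T)]
for t, frame in enumerate(frames):
    frame[10:20, 4 * t:4 * t + 6] = 255    # bright block moving right by 4 px per frame

volume = np.stack(frames, axis=0)           # space-time volume, shape (T, H, W)

frame_3 = volume[3]                         # spatial slice: a single frame, shape (H, W)
xt_slice = volume[:, 15, :]                 # x-t slice at row y = 15, shape (T, W)

# In the x-t slice the moving block traces a slanted stripe.
# The stripe's slope (pixels per frame) is the block's horizontal velocity.
left_edges = (xt_slice > 0).argmax(axis=1)  # leftmost bright column in each frame
velocity = np.diff(left_edges)              # per-frame displacement in pixels
```

Slicing along the time axis turns motion into geometry: a static object traces a vertical stripe in the x-t slice, while a moving object traces a slanted one.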
+;Why are videos challenging?
+* Too much data.
 ==Will be on the exam==