Visual Learning and Recognition

;How to evaluate this?
Object Keypoint Similarity (OKS) is a metric based on the Euclidean distance between predicted and ground-truth keypoints, with a per-keypoint Gaussian error margin that depends on the scale of each object.
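A minimal sketch of OKS, assuming the COCO-style form in which each visible keypoint contributes exp(-d^2 / (2 s^2 k^2)) and the result is averaged over visible keypoints. The function name <code>oks</code>, its argument layout, and the constants used in the usage note are illustrative assumptions, not values from the lecture.

```python
import math

def oks(pred, gt, visible, area, k):
    """Object Keypoint Similarity for one object (illustrative sketch).

    pred, gt : lists of (x, y) keypoint coordinates
    visible  : list of bools, True where the ground-truth keypoint is labeled
    area     : squared object scale s^2 (assumed: the object's area in pixels)
    k        : per-keypoint constants controlling the Gaussian falloff
    """
    total, count = 0.0, 0
    for (px, py), (gx, gy), v, ki in zip(pred, gt, visible, k):
        if not v:
            continue  # unlabeled keypoints do not affect the score
        d2 = (px - gx) ** 2 + (py - gy) ** 2  # squared Euclidean distance
        total += math.exp(-d2 / (2.0 * area * ki ** 2))
        count += 1
    return total / count if count else 0.0
```

A perfect prediction scores 1.0. The same pixel error costs less on a larger object, because <code>area</code> appears in the denominator of the exponent.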
===Approaches for Pose Estimation===
* Top-down approaches
** Do person detection then do pose estimation.
** Pros: Already have a person detector, smaller image to process in second stage.
** Cons: Runtime proportional to number of people.
* Bottom-up approaches
** Detect limbs (body parts) first, then group them into people.
** Pros: No dependency on a person detector. Can be single-stage. Runtime is independent of the number of people.
** Cons: Optimizing the part association is NP-hard.
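The top-down pipeline can be sketched as below, assuming the detector and the single-person pose estimator are supplied as callables. <code>detect_people</code> and <code>estimate_pose</code> are hypothetical placeholders, not functions from any particular library. The loop makes the cost structure visible: one pose-estimation call per detected person.

```python
def top_down_pose(image, detect_people, estimate_pose):
    """Top-down pose estimation sketch.

    image         : H x W nested list (rows of pixels)
    detect_people : hypothetical callable, image -> list of (x0, y0, x1, y1) boxes
    estimate_pose : hypothetical callable, crop -> list of (x, y) in crop coordinates
    """
    poses = []
    for x0, y0, x1, y1 in detect_people(image):
        # The second stage only sees the cropped person, a smaller image.
        crop = [row[x0:x1] for row in image[y0:y1]]
        keypoints = estimate_pose(crop)
        # Map crop coordinates back to full-image coordinates.
        poses.append([(x + x0, y + y0) for x, y in keypoints])
    return poses
```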
===Examples===
* Pose Machines + CPM + OpenPose (2016 COCO winner)
** Pose Machines, ECCV 2014
** Convolutional Pose Machines, CVPR 2016
** Realtime Multi-Person 2D
* G-RMI (2016 COCO runner-up)
===Pose Machines===
ECCV 2014. A bottom-up approach that addresses the following challenges:
* Local evidence is weak
* Part context is a strong cue
* Larger composite parts can be easier to detect.
===Convolutional Pose Machines===
===OpenPose===
For 30 people with 17 keypoints each, we have 1.3e5 pairwise connections.
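The figure above can be checked with a short calculation, assuming every detected keypoint may be paired with every other one, so the count is "n choose 2" over all keypoints in the image. The function name is illustrative.

```python
import math

def num_pairwise_connections(num_people, keypoints_per_person):
    """Number of unordered keypoint pairs in a fully connected graph."""
    n = num_people * keypoints_per_person  # total keypoints in the image
    return math.comb(n, 2)                 # n * (n - 1) / 2

# 30 people * 17 keypoints = 510 keypoints, giving 129795 pairs (about 1.3e5).
print(num_pairwise_connections(30, 17))
```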
===G-RMI (Google)===
# Compute Heatmaps G
# Offset Output
# Fuse heatmaps and offsets via Hough Voting 
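The fusion step can be sketched as follows. This is a simplified illustration that assumes each pixel casts a vote, weighted by its heatmap value, at the location its offset points to, with nearest-pixel rounding. The original method may differ, for example in how votes are interpolated and normalized. The name <code>hough_vote</code> and the nested-list data layout are assumptions.

```python
def hough_vote(heatmap, offsets):
    """Fuse one keypoint's heatmap and offset field by Hough voting (sketch).

    heatmap : H x W nested list, probability that a pixel lies near the keypoint
    offsets : H x W nested list of (dy, dx) vectors pointing toward the keypoint
    Returns an H x W nested list of accumulated votes.
    """
    h, w = len(heatmap), len(heatmap[0])
    votes = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = offsets[y][x]
            ty, tx = round(y + dy), round(x + dx)  # pixel this location votes for
            if 0 <= ty < h and 0 <= tx < w:
                votes[ty][tx] += heatmap[y][x]     # vote weighted by heatmap value
    return votes
```

The peak of the returned vote map is the fused keypoint estimate. Votes from many nearby pixels reinforce each other, which tends to give a sharper peak than the coarse heatmap alone.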
===Stacked Hourglass Networks===
2017 COCO Runner-up
==Videos==
* Video as space-time volume
* Object correspondence via tracking
* Motion for parallax, occlusions
;Why are videos challenging?
* Too much data.
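To give a rough sense of scale, the sketch below computes the size of uncompressed video. The resolution, frame rate, and duration are illustrative assumptions, not figures from the lecture.

```python
def raw_video_bytes(width, height, fps, seconds, channels=3, bytes_per_sample=1):
    """Size in bytes of uncompressed video with the given dimensions."""
    return width * height * channels * bytes_per_sample * fps * seconds

# Assumed example: one minute of 1920x1080 RGB video at 30 frames per second.
size = raw_video_bytes(1920, 1080, fps=30, seconds=60)
print(size / 1e9, "GB")  # about 11.2 GB
```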


==Will be on the exam==