Learning Independent Object Motion from Unlabelled Stereoscopic Videos (CVPR 2019)
Authors: Zhe Cao, Abhishek Kar, Christian Haene, Jitendra Malik
Affiliations: UC Berkeley, Fyusion Inc, Google
Website: https://people.eecs.berkeley.edu/~zhecao/sceneflow/
Method
Key Contributions
- Learning scene flow with limited supervision, i.e. from unlabelled stereo videos
- Factoring the scene into independently moving objects (the main idea of the paper)
- Designing a network architecture based on plane sweep volumes (a minimal volume-construction sketch follows this list)
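A minimal numpy sketch of how a plane sweep (cost) volume can be built for rectified stereo: each candidate disparity stands in for one fronto-parallel depth plane, the right features are shifted onto that plane and compared with the left features. The function name, feature shapes, and the absolute-difference matching cost are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def stereo_plane_sweep_volume(feat_l, feat_r, max_disp=64, step=4):
    """Build a plane sweep (cost) volume from rectified stereo feature maps.

    feat_l, feat_r: arrays of shape (C, H, W) from the left/right images.
    Each candidate disparity corresponds to one fronto-parallel depth plane;
    the right features are shifted onto that plane and compared with the left.
    Returns a volume of shape (num_planes, H, W), one cost slice per plane.
    """
    disparities = np.arange(0, max_disp, step)
    volume = np.empty((len(disparities),) + feat_l.shape[1:], dtype=feat_l.dtype)
    for i, d in enumerate(disparities):
        if d == 0:
            shifted = feat_r
        else:
            shifted = np.zeros_like(feat_r)
            shifted[:, :, d:] = feat_r[:, :, :-d]  # align right features with the left view at disparity d
        volume[i] = np.abs(feat_l - shifted).mean(axis=0)  # mean absolute feature difference as matching cost
    return volume
```

In the actual network such a volume would be built from learned CNN features and processed by further layers to regress depth; the sketch only illustrates the volume construction itself.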
Inputs
- Image pairs \(\{(I_1^l, I_1^r),..., (I_n^l, I_n^r)\}\) from unlabelled stereo videos
- Object bounding boxes \(B = \{B^1,..., B^j\}\) on the left image \(I_t^l\), obtained from off-the-shelf object detectors (an illustrative sample container follows this list)
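To make the input specification concrete, here is a small, hypothetical container for one training sample. The class and field names, and the use of two consecutive time steps, are assumptions for illustration; the paper's exact data layout is not described here.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class StereoTrainingSample:
    """One unlabelled training sample (field names are illustrative)."""
    left_t: np.ndarray    # left image I_t^l at time t, shape (H, W, 3)
    right_t: np.ndarray   # right image I_t^r at time t, shape (H, W, 3)
    left_t1: np.ndarray   # left image at time t+1 (needed to observe motion)
    right_t1: np.ndarray  # right image at time t+1
    boxes: np.ndarray     # detector boxes B on I_t^l, shape (j, 4), (x1, y1, x2, y2) in pixels
```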
Goal/Outputs
- Dense depth map \(D\)
- 3D flow fields \(F = \{F^1,..., F^j\}\)
- Instance masks \(M = \{M^1,..., M^j\}\)
- For each region of interest (RoI), predict a per-object flow map using an RCNN
  - Also predict an object mask for each RoI
- Construct the full 3D scene flow map from the per-object flow maps (see the composition sketch after this list)
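A hedged sketch of the composition step: per-object 3D flow predictions are pasted back into the full image using the RoI boxes and soft instance masks. The background_flow argument (flow for pixels outside any object, e.g. induced by camera motion) and the blending rule are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def compose_scene_flow(background_flow, roi_flows, roi_masks, boxes):
    """Paste per-object 3D flow predictions back into a full-image flow map.

    background_flow: (H, W, 3) flow outside any object (assumed given, e.g. from camera motion).
    roi_flows[k]:    (h_k, w_k, 3) flow predicted inside the k-th RoI.
    roi_masks[k]:    (h_k, w_k) soft instance mask for the k-th RoI, values in [0, 1].
    boxes[k]:        integer pixel box (x1, y1, x2, y2) with h_k = y2 - y1 and w_k = x2 - x1.
    """
    scene_flow = background_flow.copy()
    for flow_k, mask_k, (x1, y1, x2, y2) in zip(roi_flows, roi_masks, boxes):
        region = scene_flow[y1:y2, x1:x2]
        m = mask_k[..., None]  # broadcast the mask over the 3 flow channels
        scene_flow[y1:y2, x1:x2] = m * flow_k + (1.0 - m) * region  # blend object flow over the background
    return scene_flow
```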
Self-Supervision and Loss Functions
- View synthesis: synthesize one view from another using the predicted depth and flow, and penalize the photometric difference with the real image
- Geometric consistency: the depth values of the warped image and the reference image should match
- Left-right consistency loss \(L^{lr}\)
- RoI loss \(L^{roi}\), computed per object region
- Full-image loss \(L^{t}\), computed over the whole frame (a simplified sketch of two of these terms follows this list)
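A simplified numpy sketch of two of the self-supervision terms above: a photometric view-synthesis loss and a left-right consistency loss. The L1 forms, the valid_mask handling, and the function names are assumptions; the paper's exact loss terms and weighting may differ.

```python
import numpy as np

def view_synthesis_loss(reference, synthesized, valid_mask):
    """Photometric view-synthesis loss: mean L1 difference between the reference
    image and the image synthesized by warping another view, evaluated only
    where the warp landed inside the image (valid_mask of shape (H, W))."""
    diff = np.abs(reference - synthesized) * valid_mask[..., None]
    return diff.sum() / (valid_mask.sum() * reference.shape[-1] + 1e-8)

def left_right_consistency_loss(disp_left, disp_right_warped):
    """L^lr sketch: the left-view disparity should agree with the right-view
    disparity warped into the left view."""
    return np.mean(np.abs(disp_left - disp_right_warped))
```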