Depth Estimation<br>
Goal: Generate an image of depths from one or two images.
==Background==
===Depth vs. Disparity===
For stereo methods, people usually estimate pixel disparity rather than depth directly.
That is, they determine how far each pixel moves along the epipolar line between the two images.
Usually, this involves first rectifying the image pair (e.g. estimating the epipolar geometry with RANSAC) so that corresponding pixels lie on the same scanline.
Then a cost volume is built over candidate disparities, and an argmin over the cost volume gives the best disparity estimate per pixel.
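As a concrete illustration, here is a minimal sketch of that pipeline for an already-rectified pair, using a simple sum-of-absolute-differences matching cost. The function name, the SAD cost, and the window size are illustrative assumptions, not taken from any particular paper.
<syntaxhighlight lang="python">
import numpy as np
from scipy.ndimage import uniform_filter

def disparity_from_cost_volume(left, right, max_disparity=64, window=5):
    """Naive stereo matching: build a SAD cost volume, then take an argmin.

    left, right: rectified grayscale images as float arrays of shape (H, W),
    so that the match for a left-image pixel lies on the same row of the
    right image, shifted left by its disparity.
    Returns an integer disparity map of shape (H, W).
    """
    H, W = left.shape
    # cost[d, y, x] = how badly pixel (y, x) matches under candidate disparity d
    cost = np.empty((max_disparity, H, W), dtype=np.float32)

    for d in range(max_disparity):
        # Candidate match: right image shifted d pixels along the epipolar (row) direction.
        shifted = np.empty_like(right)
        shifted[:, d:] = right[:, : W - d]
        shifted[:, :d] = right[:, :1]  # replicate the border column for out-of-view pixels
        # Per-pixel absolute difference, aggregated over a small window (SAD).
        cost[d] = uniform_filter(np.abs(left - shifted), size=window)

    # argmin over the disparity axis picks the lowest-cost disparity per pixel.
    return np.argmin(cost, axis=0)
</syntaxhighlight>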
Disparity is related to depth by the following formula:
<math display="block">\text{disparity} = \frac{\text{baseline} \times \text{focal}}{\text{depth}}</math>
* <math display="inline">\text{focal}</math> is the focal length in pixels. Because it is expressed in pixels, it also serves as the correction factor that scales the disparity to the image resolution.
* <math display="inline">\text{baseline}</math> is the distance between the two camera positions. This should be in the same units as your depth.
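Applying the formula is straightforward; below is a minimal sketch that converts a disparity map in pixels into a depth map while guarding against zero disparities. The variable names are illustrative.
<syntaxhighlight lang="python">
import numpy as np

def disparity_to_depth(disparity_px, baseline_m, focal_px):
    """depth = baseline * focal / disparity, applied per pixel.

    disparity_px: disparity map in pixels.
    baseline_m:   camera baseline in metres (depth comes out in the same unit).
    focal_px:     focal length in pixels.
    Pixels with zero or negative (invalid) disparity get depth = inf.
    """
    disparity_px = np.asarray(disparity_px, dtype=np.float32)
    depth = np.full_like(disparity_px, np.inf)
    valid = disparity_px > 0
    depth[valid] = baseline_m * focal_px / disparity_px[valid]
    return depth

# Example: a 0.1 m baseline, a 700 px focal length, and a 35 px disparity
# give a depth of 0.1 * 700 / 35 = 2.0 m.
</syntaxhighlight>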
==Stereo Depth==
Typically, people use a cost volume to estimate depth from a stereo camera setup (see the sketch after the list below).
* [https://openaccess.thecvf.com/content_ECCV_2018/html/Sameh_Khamis_StereoNet_Guided_Hierarchical_ECCV_2018_paper.html StereoNet (ECCV 2018)] ([[StereoNet: Guided Hierarchical Refinement for Real-Time Edge-Aware Depth Prediction |My Summary]]) is a method by Google's Augmented Perception team.
* [http://visual.cs.ucl.ac.uk/pubs/casual3d/ Casual 3D photography (SIGGRAPH ASIA 2017)] includes a method for refining cost volumes and a system for synthesizing views from a few dozen photos.
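As a point of reference for the cost-volume approach, OpenCV ships a semi-global block matcher. Below is a minimal usage sketch; the file names and parameter values are placeholder assumptions, and this is not the method used in either paper above.
<syntaxhighlight lang="python">
import cv2

# Load a rectified grayscale stereo pair (file names are placeholders).
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Semi-global block matching: a cost volume with smoothness-aware aggregation.
matcher = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=64,   # must be a multiple of 16
    blockSize=5,
)

# compute() returns fixed-point disparities scaled by 16, hence the division.
disparity = matcher.compute(left, right).astype("float32") / 16.0
</syntaxhighlight>
Semi-global matching aggregates the cost volume along several scanline directions, which smooths the disparity map without a full 2D optimization.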
==Depth from Motion==
Depth is generated in real time from the motion of the camera: consecutive frames act like a stereo pair whose baseline comes from the camera's movement.
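As a sketch of the underlying geometry, once the relative pose between two frames of the moving camera is known (e.g. from the essential matrix), matched pixels can be triangulated into depths. The helper below is an illustrative assumption, not a particular system; real-time methods add tracking, filtering, and densification on top.
<syntaxhighlight lang="python">
import numpy as np
import cv2

def depths_from_motion(K, R, t, pts1, pts2):
    """Triangulate matched pixels between two frames of a moving camera.

    K:          3x3 camera intrinsics.
    R, t:       rotation (3x3) and translation (3,) from frame 1 to frame 2,
                e.g. recovered from the essential matrix.
    pts1, pts2: 2xN arrays of matched pixel coordinates in each frame.
    Returns an array of N depths in frame 1's camera coordinates.
    """
    # Projection matrices: frame 1 at the origin, frame 2 at (R, t).
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t.reshape(3, 1)])

    # Linear triangulation; returns homogeneous 4xN points.
    points_h = cv2.triangulatePoints(P1, P2,
                                     pts1.astype(np.float64),
                                     pts2.astype(np.float64))
    points = points_h[:3] / points_h[3]

    # Depth of each point is its z coordinate in frame 1's camera frame.
    return points[2]
</syntaxhighlight>
With a single moving camera, the translation (and therefore the recovered depth) is only known up to scale unless something else, such as an IMU or a known object size, fixes the scale.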