360SD-Net: 360° Stereo Depth Estimation with Learnable Cost Volume
Method
The input is a pair of equirectangular images, one captured above the other (a top-bottom stereo pair).
They also feed in the polar angle of each image row:
```python
import numpy as np

# Y angle: per-row polar angle for a 512x1024 equirectangular image
angle_y = np.array([(i - 0.5) / 512 * 180 for i in range(256, -256, -1)])
angle_ys = np.tile(angle_y[:, np.newaxis, np.newaxis], (1, 1024, 1))
equi_info = angle_ys
```
For height = 512, these angles are equivalent to np.linspace(90, -90, height + 1)[:-1] - 0.5 * (180 / height).
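A quick NumPy check (a sketch, assuming height = 512 and width = 1024 as in the snippet above) confirms the two formulations agree:

```python
import numpy as np

height, width = 512, 1024

# Per-row polar angle as in the snippet: ~90° at the top row,
# ~-90° at the bottom, offset by half a row.
angle_y = np.array([(i - 0.5) / height * 180
                    for i in range(height // 2, -height // 2, -1)])

# Equivalent closed form: evenly spaced row edges over [90°, -90°],
# shifted by half a row to land on row centers.
angle_y_alt = np.linspace(90, -90, height + 1)[:-1] - 0.5 * (180 / height)

assert np.allclose(angle_y, angle_y_alt)

# Broadcast to a (height, width, 1) channel for concatenation with the image.
equi_info = np.tile(angle_y[:, np.newaxis, np.newaxis], (1, width, 1))
```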
Architecture
Feature Extraction
Both top and bottom images are passed through the feature-extraction network.
ASPP Module
Atrous Spatial Pyramid Pooling. This idea comes from Chen et al. [1].
The idea here is to perform convolution over multiple scales of the input image or feature tensor.
This is performed using multiple parallel convolutions of the input, each with different dilation sizes.
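A minimal pure-NumPy sketch of this idea (the helper `dilated_conv2d` is hypothetical, written here for illustration; real implementations use framework convolution layers with a dilation parameter). Each branch applies the same 3x3 kernel, but with taps spaced `dilation` pixels apart, so larger dilations cover a wider receptive field at no extra parameter cost:

```python
import numpy as np

def dilated_conv2d(x, kernel, dilation):
    """'Same'-padded 2D correlation of a single-channel image with an
    odd-sized square kernel whose taps are `dilation` pixels apart."""
    k = kernel.shape[0]
    pad = dilation * (k // 2)
    xp = np.pad(x, pad)
    out = np.zeros_like(x, dtype=float)
    for i in range(k):
        for j in range(k):
            di, dj = i * dilation, j * dilation
            out += kernel[i, j] * xp[di:di + x.shape[0], dj:dj + x.shape[1]]
    return out

# ASPP-style use: run the input through parallel branches with different
# dilation rates, then fuse (summed here for brevity; feature maps are
# typically concatenated instead).
x = np.random.rand(32, 32)
kernel = np.ones((3, 3)) / 9.0
branches = [dilated_conv2d(x, kernel, d) for d in (1, 2, 4, 8)]
fused = np.sum(branches, axis=0)
```

Note that every branch produces an output of the input's spatial size, which is what lets the branches be fused elementwise or channel-wise.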
Learnable Cost Volume
Dataset
They construct a dataset using Matterport3D and Stanford 3D datasets. Their constructed dataset is available upon request.
Evaluation
- [1] Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. https://arxiv.org/abs/1412.7062