==Method==
Below is the pipeline for their system. Note that the CNN is trained separately from the rest of the pipeline.
===Off-line Scene Processing===
In this step, they perform the following:
* Structure from Motion<cite><ref name="schonberger2016sfm">Johannes L. Schönberger; Jan-Michael Frahm, Structure-from-Motion Revisited (CVPR 2016) [https://doi.org/10.1109/CVPR.2016.445 DOI:10.1109/CVPR.2016.445]</ref></cite> Registration to calibrate the cameras (i.e. get the extrinsics/pose)
* Multiview Stereo Reconstruction (MVS) to generate per-view depth maps and per-view meshes using two methods:
** COLMAP, which provides fine details but a sparser reconstruction
** Delaunay tetrahedralization (RealityCapture 2016), which provides more completeness and a smoother estimate
* Geometry Refinement
* Mesh Simplification
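The two MVS estimates above complement each other. Below is a minimal numpy sketch of one way the detailed-but-sparse COLMAP depth could be merged with the smoother, more complete Delaunay-based depth; this is an illustrative assumption, and the paper's actual geometry refinement is more involved:

```python
import numpy as np

def fuse_depth(colmap_depth, delaunay_depth, max_rel_diff=0.05):
    """Hypothetical fusion of two per-view depth estimates.

    Prefer the detailed COLMAP depth where it exists (nonzero) and
    roughly agrees with the smooth estimate; fall back to the
    Delaunay-based depth elsewhere. Not the authors' code.
    """
    fused = delaunay_depth.copy()
    valid = colmap_depth > 0                      # COLMAP leaves holes at 0
    agree = np.abs(colmap_depth - delaunay_depth) <= max_rel_diff * delaunay_depth
    mask = valid & agree
    fused[mask] = colmap_depth[mask]
    return fused
```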
===Off-line CNN Training===
The goal of the CNN is to generate a sharp, temporally consistent image by blending multiple estimates.
The CNN is trained with a hold-out strategy.
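Hold-out training means each captured view is withheld in turn: the system renders the held-out pose using only the remaining views, and the held-out photograph serves as ground truth. The pair-generation helper below is a hypothetical illustration of this scheme, not the authors' code:

```python
def holdout_training_pairs(views):
    """Generate (source views, held-out target) training pairs.

    Illustrative sketch: for each captured view, the remaining views
    are the rendering inputs and the withheld view is the supervision
    target for the blending CNN.
    """
    pairs = []
    for i, target in enumerate(views):
        sources = views[:i] + views[i + 1:]  # everything except view i
        pairs.append((sources, target))
    return pairs
```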
===On-line Pipeline===
* Given a novel viewpoint, create a voxel grid where each voxel contains indices of per-view mesh triangles.
* Generate a global mesh render from the novel viewpoint.
* Use InsideOut to create 4 ''mosaics'' (warped input views).
* Input the global mesh render and mosaics into the deep blending CNN.
* Blend the mosaics and the global mesh render.
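The final step above amounts to a per-pixel weighted sum of the input layers (the 4 mosaics plus the global mesh render). In the paper the weights come from the CNN; the sketch below assumes they arrive as logits and normalizes them with a softmax, which is an illustrative choice rather than the paper's exact formulation:

```python
import numpy as np

def blend_layers(layers, weight_logits):
    """Blend K image layers with per-pixel weights.

    layers:        (K, H, W, 3) array of mosaics + global mesh render.
    weight_logits: (K, H, W) per-pixel scores (assumed CNN output).
    Softmax normalization across layers is an assumption here.
    """
    e = np.exp(weight_logits - weight_logits.max(axis=0, keepdims=True))
    w = e / e.sum(axis=0, keepdims=True)             # (K, H, W), sums to 1
    return (w[:, :, :, None] * layers).sum(axis=0)   # (H, W, 3)
```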
==Architecture==
The architecture they use is a U-Net with a fixed set of inputs.
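The U-Net shape flow (encoder, bottleneck, decoder with skip connections) can be sketched as below. The channel counts, the random stand-in for learned convolutions, and the 6-channel input are illustrative assumptions, not the paper's configuration; a real implementation would use a deep learning framework:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv3x3(x, out_ch):
    """Stand-in for a learned 3x3 conv + ReLU: random weights, 'same' padding."""
    h, w, in_ch = x.shape
    k = rng.standard_normal((3, 3, in_ch, out_ch)) * 0.1
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros((h, w, out_ch))
    for i in range(3):
        for j in range(3):
            out += xp[i:i + h, j:j + w] @ k[i, j]
    return np.maximum(out, 0.0)

def down(x):  # 2x2 average pooling
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def up(x):  # nearest-neighbour upsampling
    return x.repeat(2, axis=0).repeat(2, axis=1)

def unet(x, chs=(8, 16, 32)):
    """Tiny U-Net forward pass; channel widths are illustrative."""
    skips = []
    for c in chs[:-1]:                # encoder: conv, save skip, downsample
        x = conv3x3(x, c)
        skips.append(x)
        x = down(x)
    x = conv3x3(x, chs[-1])           # bottleneck
    for c, s in zip(reversed(chs[:-1]), reversed(skips)):
        x = np.concatenate([up(x), s], axis=-1)  # skip connection
        x = conv3x3(x, c)
    return conv3x3(x, 3)              # e.g. RGB or blend weights
```

The fixed set of inputs (global mesh render plus mosaics) would be stacked channel-wise before being fed to the first layer.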
==Evaluation==
==Resources==
==References==