Deep Blending for Free-Viewpoint Image-Based-Rendering


Deep Blending for Free-Viewpoint Image-Based-Rendering (Siggraph Asia 2018)

Method

Below is the pipeline for their system. Note that they train the CNN separately from the rest of their proposed pipeline.

Off-line Scene Processing

In this step, they perform the following:

  • Structure-from-Motion registration[1] to calibrate the cameras (i.e. recover the extrinsics/pose)
  • Multi-view stereo (MVS) reconstruction to generate per-view depth maps and per-view meshes using two methods:
    • COLMAP, which provides fine details but a sparser reconstruction
    • Delaunay tetrahedralization (RealityCapture, 2016), which provides more completeness and a smoother estimate
  • Geometry Refinement
  • Mesh Simplification
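The camera calibration recovered by SfM is what every later stage builds on. As a minimal sketch (with illustrative intrinsics and pose values, not numbers from the paper), projecting a world point into a calibrated view looks like:

```python
import numpy as np

def project(K, R, t, X):
    """Project world point X into pixel coordinates for a camera with
    intrinsics K and pose [R | t] (world-to-camera convention)."""
    x_cam = R @ X + t            # world -> camera coordinates
    x_img = K @ x_cam            # camera -> homogeneous image coordinates
    return x_img[:2] / x_img[2]  # perspective divide

# Illustrative camera: 500 px focal length, principal point at (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0,   0.0,   1.0]])
R = np.eye(3)
t = np.zeros(3)

# A point on the optical axis projects to the principal point.
print(project(K, R, t, np.array([0.0, 0.0, 2.0])))  # -> [320. 240.]
```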

Off-line CNN Training

The goal of the CNN is to generate a sharp, temporally consistent image by blending multiple renderings of the novel view.

The CNN is trained via hold-out: an input photo is withheld and used as ground truth, and the network learns to reconstruct it by blending renders produced from the remaining views.
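A minimal sketch of the hold-out split, assuming a simple leave-one-out scheme over the input photos (the helper name is hypothetical):

```python
def holdout_pairs(views):
    """Yield (held_out, remaining) pairs: the held-out photo serves as
    ground truth that must be reconstructed from the remaining views."""
    for i in range(len(views)):
        yield views[i], views[:i] + views[i + 1:]

views = [f"photo_{i}" for i in range(4)]
pairs = list(holdout_pairs(views))
# Each photo is held out exactly once, e.g. ("photo_0", ["photo_1", "photo_2", "photo_3"]).
```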

On-line Pipeline

  • Given a novel viewpoint, create a voxel grid where each voxel contains indices of per-view mesh triangles.
  • Generate a global mesh render from the novel viewpoint.
  • Use InsideOut-style per-view mesh rendering to create four mosaics of warped input views.
  • Input the global mesh render and mosaics into the deep blending CNN.
  • Blend the mosaics and the global mesh render.
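The final step can be sketched as a per-pixel weighted sum over the render layers (the four mosaics plus the global mesh render). The softmax normalization below is an illustrative assumption to keep weights positive and summing to one, not necessarily the paper's exact formulation:

```python
import numpy as np

def blend(layers, logits):
    """Composite N render layers with per-pixel weights.

    layers: (N, H, W, 3) stack of candidate renders.
    logits: (N, H, W) unnormalized per-pixel weight maps (e.g. CNN output).
    """
    w = np.exp(logits - logits.max(axis=0, keepdims=True))  # stable softmax
    w /= w.sum(axis=0, keepdims=True)
    return (w[..., None] * layers).sum(axis=0)              # (H, W, 3)

rng = np.random.default_rng(0)
layers = rng.random((5, 4, 4, 3))   # 4 mosaics + 1 global render, 4x4 RGB
logits = np.zeros((5, 4, 4))        # equal logits -> uniform average
out = blend(layers, logits)
```

With equal logits the composite reduces to a plain average of the layers; in practice the CNN's predicted weights would favor whichever layer is most reliable at each pixel.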

Architecture

The architecture they use is a U-Net with a fixed set of inputs.
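The encoder/decoder structure of a U-Net can be traced as a simple shape calculation: each encoder level halves the resolution, and each decoder level upsamples back to the matching encoder resolution where a skip connection concatenates features. The depth and input size below are illustrative assumptions, not the paper's exact configuration:

```python
def unet_shapes(h, w, depth=4):
    """Trace feature-map sizes through a U-Net encoder and decoder."""
    enc = [(h, w)]
    for _ in range(depth):
        h, w = h // 2, w // 2     # downsample (e.g. stride-2 conv or pooling)
        enc.append((h, w))
    dec = []
    for skip in reversed(enc[:-1]):
        dec.append(skip)          # upsample to encoder size; skip concatenated here
    return enc, dec

enc, dec = unet_shapes(256, 256, depth=4)
print(enc)  # [(256, 256), (128, 128), (64, 64), (32, 32), (16, 16)]
print(dec)  # [(32, 32), (64, 64), (128, 128), (256, 256)]
```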

Evaluation

Resources

References

  1. Johannes L. Schönberger and Jan-Michael Frahm, Structure-from-Motion Revisited (CVPR 2016). DOI: 10.1109/CVPR.2016.445