NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis (ECCV 2020)
- Authors: Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng
- Affiliations: UC Berkeley, Google Research, UC San Diego
The main idea is to represent a scene as an MLP and render it with volume rendering: the network maps a 3D position and viewing direction to a color and a volume density.
Since volume rendering is differentiable, the MLP can be trained end-to-end and learns the color and density at each position.
Training is supervised on multiple posed images of a static scene (a few dozen to ~100 images in the paper). Poses can be obtained using COLMAP.
Getting this to work relies on two tricks:
- A positional encoding of the input coordinates.
- Hierarchical volume sampling.
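The supervision is purely photometric. A minimal sketch of the loss, assuming a batch of rays has already been rendered with both the coarse and the fine networks (NumPy; the function name is my own, not the authors' code):

```python
import numpy as np

def photometric_loss(pred_coarse, pred_fine, gt_rgb):
    """Squared photometric error over a batch of rays.

    pred_coarse, pred_fine, gt_rgb: (num_rays, 3) rendered / ground-truth colors.
    Both the coarse and the fine renderings are supervised against the same pixels.
    """
    return np.sum((pred_coarse - gt_rgb) ** 2) + np.sum((pred_fine - gt_rgb) ** 2)
```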
Method
Volume Rendering
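The continuous rendering integral is approximated with a quadrature over samples along each ray: densities are turned into alpha values, the accumulated transmittance weights each sample's color, and the weighted sum gives the pixel color. A minimal NumPy sketch of this rule (shapes and names are my own):

```python
import numpy as np

def render_ray(rgb, sigma, t_vals):
    """Numerical volume rendering along a single ray.

    rgb:    (N, 3) colors predicted by the MLP at the N samples.
    sigma:  (N,) volume densities at the samples.
    t_vals: (N,) sample distances along the ray (increasing).
    Returns the composited pixel color and the per-sample weights.
    """
    # Distances between adjacent samples; the last interval is treated as effectively infinite.
    deltas = np.append(np.diff(t_vals), 1e10)
    # alpha_i = 1 - exp(-sigma_i * delta_i): probability that the ray terminates in interval i.
    alpha = 1.0 - np.exp(-sigma * deltas)
    # T_i = prod_{j<i} (1 - alpha_j): transmittance, the probability of reaching sample i at all.
    trans = np.cumprod(np.append(1.0, 1.0 - alpha))[:-1]
    weights = alpha * trans                        # w_i = T_i * alpha_i
    color = np.sum(weights[:, None] * rgb, axis=0)
    return color, weights
```

The same per-sample weights are reused by the hierarchical sampling below.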
Positional Encoding
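Each input coordinate is mapped through sines and cosines at exponentially growing frequencies before being fed to the MLP; the paper uses L = 10 frequency bands for positions and L = 4 for viewing directions. A minimal sketch (the function name is my own):

```python
import numpy as np

def positional_encoding(x, num_freqs):
    """gamma(p) = (sin(2^0 pi p), cos(2^0 pi p), ..., sin(2^(L-1) pi p), cos(2^(L-1) pi p)).

    x: (..., D) coordinates; returns (..., 2 * D * num_freqs) features.
    """
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi   # 2^k * pi for k = 0 .. L-1
    angles = x[..., None] * freqs                   # (..., D, L)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return enc.reshape(*x.shape[:-1], -1)
```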
This is further explored in a followup paper[1].
Hierarchical volume sampling
The idea is to use a coarse network to guide where the fine network samples: regions that contribute more to the coarse rendering get more fine samples along each ray.
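Concretely, the weights from the coarse rendering are normalized into a piecewise-constant PDF along the ray, and the fine sample locations are drawn from it by inverse-transform sampling, so samples concentrate where the coarse pass expects visible content. A simplified sketch (a full implementation would also interpolate within bins; names are my own):

```python
import numpy as np

def sample_fine(coarse_t, coarse_weights, num_fine):
    """Draw fine sample positions along a ray from the coarse rendering weights.

    coarse_t:       (N,) sample positions used by the coarse pass.
    coarse_weights: (N,) rendering weights from the coarse pass.
    """
    pdf = coarse_weights / (np.sum(coarse_weights) + 1e-8)   # normalize into a PDF
    cdf = np.cumsum(pdf)
    u = np.random.rand(num_fine)                             # uniform samples in [0, 1)
    # Inverse-transform sampling: pick the first bin whose CDF reaches u.
    idx = np.searchsorted(cdf, u)
    return coarse_t[np.clip(idx, 0, len(coarse_t) - 1)]
```

Both the coarse and the fine sets of samples are then evaluated by the fine network for the final rendering.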
Architecture
They use a standard fully-connected MLP with ReLU activations, taking the encoded position as input and injecting the encoded viewing direction near the output.
The MLP has 9 layers with 256-dim features and one layer with 128-dim features (~600k parameters).
See their paper and code for more details.
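A rough sketch of the forward pass as the paper describes it (the weight names and shapes here are my own, not the released code): the encoded position passes through 8 fully-connected ReLU layers of width 256, with a skip connection that concatenates the input encoding to the fifth layer's activation; density is predicted from the resulting feature, and the encoded viewing direction is concatenated before a final 128-wide layer that outputs RGB.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def nerf_mlp(gamma_x, gamma_d, params):
    """Forward pass of a NeRF-style MLP for a single sample.

    gamma_x: (60,) encoded position, gamma_d: (24,) encoded viewing direction.
    params:  dict of weight matrices / bias vectors (shapes implied by the layer widths).
    """
    h = gamma_x
    for i in range(8):                                 # 8 fully-connected layers, 256 channels
        h = relu(params[f"W{i}"] @ h + params[f"b{i}"])
        if i == 4:                                     # skip connection: re-inject the input encoding
            h = np.concatenate([h, gamma_x])
    sigma = relu(params["W_sigma"] @ h + params["b_sigma"])   # view-independent volume density
    feat = params["W_feat"] @ h + params["b_feat"]            # 256-dim feature vector
    # Concatenate the encoded viewing direction and predict view-dependent color.
    h = relu(params["W_dir"] @ np.concatenate([feat, gamma_d]) + params["b_dir"])  # 128 channels
    rgb = 1.0 / (1.0 + np.exp(-(params["W_rgb"] @ h + params["b_rgb"])))           # sigmoid -> RGB
    return rgb, sigma
```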
Experiments
They compare with Local Light Field Fusion (LLFF), prior work by some of the same authors, as well as a few other view synthesis methods (Scene Representation Networks and Neural Volumes).
They also publish a dataset available on their Google Drive.
Followup Work
Positional Encoding
Videos
- Deformable Neural Radiance Fields
- Neural Irradiance Fields for Free-Viewpoint Video
- Neural 3D Video Synthesis from Multi-view Video (CVPR 2022)
References