Neural Fields
Revision as of 15:13, 30 March 2023
Neural Fields refers to using neural networks or neural methods to represent scenes or other signals in computer vision and graphics.
Techniques
Forward Maps
Forward maps are the differentiable functions which convert the representation to an observed signal.
Shapes
- Occupancy Grids or Voxel Grids
- Signed Distance Functions
- Primary Ray-based Implicit Function (PRIF)
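As a minimal sketch of two of the shape representations listed above, the following assumes an analytic sphere as the shape. The names `sphere_sdf` and `occupancy` are illustrative and do not come from any library; in a neural field, the analytic function would be replaced by a learned network. A signed distance function is negative inside the surface and positive outside, and an occupancy value can be derived from it by thresholding at zero.

```python
import math

def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from `point` to a sphere: negative inside, positive outside."""
    return math.dist(point, center) - radius

def occupancy(point, sdf=sphere_sdf):
    """Binary occupancy derived from an SDF: 1 inside or on the surface, else 0."""
    return 1 if sdf(point) <= 0.0 else 0

inside = sphere_sdf((0.0, 0.0, 0.5))   # point inside the unit sphere, so negative
outside = sphere_sdf((2.0, 0.0, 0.0))  # point outside the unit sphere, so positive
```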
3D Scenes
- Radiance Fields (NeRF)
- Light Fields
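For radiance fields, the forward map is commonly the volume rendering quadrature, which composites density and color samples along a camera ray into one pixel. The sketch below assumes the samples have already been produced by some field; `render_ray` and its argument names are illustrative only.

```python
import math

def render_ray(densities, colors, deltas):
    """Composite samples along one ray with the volume rendering quadrature.

    densities: non-negative density per sample
    colors:    (r, g, b) tuple per sample
    deltas:    length of the ray segment covered by each sample
    """
    transmittance = 1.0  # fraction of light not yet absorbed before this sample
    pixel = [0.0, 0.0, 0.0]
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        for c in range(3):
            pixel[c] += weight * color[c]
        transmittance *= 1.0 - alpha
    return pixel, transmittance
```

Every operation here is differentiable with respect to the densities and colors, which is what lets the representation be optimized from observed images.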
Identity
- Images
Architectures
Neural Networks
- MLP
- CNN + MLP
- Progressive Architectures
Hybrid Representations
- Voxel Grids
These typically combine an octree or voxel grid with an MLP.
Some of these are essentially feature grids.
- Neural Sparse Voxel Fields
- KiloNeRF
- Point Clouds
- Mesh
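The voxel-grid-plus-MLP pattern can be sketched as a trilinear feature lookup followed by a small decoder. This is a simplified illustration under stated assumptions: the grid is dense (no octree), `trilinear_lookup` and `decode` are hypothetical names, and `decode` is a fixed stand-in for what would be a learned MLP.

```python
import math

def trilinear_lookup(grid, x, y, z):
    """Interpolate a feature vector from a dense grid indexed as grid[i][j][k].

    (x, y, z) are continuous coordinates in voxel units; each grid axis
    must hold at least two entries.
    """
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    # Lower corner of the enclosing cell, clamped so the upper corner stays in range.
    i0 = min(max(int(math.floor(x)), 0), nx - 2)
    j0 = min(max(int(math.floor(y)), 0), ny - 2)
    k0 = min(max(int(math.floor(z)), 0), nz - 2)
    fx, fy, fz = x - i0, y - j0, z - k0
    channels = len(grid[i0][j0][k0])
    out = [0.0] * channels
    for di, wx in ((0, 1.0 - fx), (1, fx)):
        for dj, wy in ((0, 1.0 - fy), (1, fy)):
            for dk, wz in ((0, 1.0 - fz), (1, fz)):
                corner = grid[i0 + di][j0 + dj][k0 + dk]
                w = wx * wy * wz
                for c in range(channels):
                    out[c] += w * corner[c]
    return out

def decode(feature):
    """Stand-in for the small MLP: a fixed linear map to one scalar."""
    return sum(feature)

# 2x2x2 grid with a single feature channel equal to i + j + k at each corner.
grid = [[[[float(i + j + k)] for k in range(2)] for j in range(2)] for i in range(2)]
value = decode(trilinear_lookup(grid, 0.5, 0.5, 0.5))
```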
Feature Grids
- Plenoxels
- Plenoctrees
- Hash (Instant-NGP)
- Vector Quantization
https://nv-tlabs.github.io/vqad/
- Factorized Feature Grids
  - TensoRF
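To illustrate what factorizing a feature grid means, the sketch below stores a 3D grid of scalars as a sum of rank-one terms (a CP-style decomposition), so that each entry is reconstructed from three short vectors per rank. It is a toy illustration of the idea only, not the implementation of any specific method; `cp_lookup` and the example values are made up.

```python
def cp_lookup(vx, vy, vz, i, j, k):
    """Reconstruct grid entry (i, j, k) as a sum over ranks of vx[r][i] * vy[r][j] * vz[r][k]."""
    return sum(vx[r][i] * vy[r][j] * vz[r][k] for r in range(len(vx)))

# Rank-2 factorization of a 3x3x3 grid.
vx = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]
vy = [[1.0, 0.0, 1.0], [2.0, 2.0, 2.0]]
vz = [[1.0, 1.0, 1.0], [0.0, 0.0, 5.0]]

value = cp_lookup(vx, vy, vz, 1, 2, 2)

# Storage comparison: dense grid entries versus factor entries.
dense_params = 3 * 3 * 3
factored_params = sum(len(v) for factors in (vx, vy, vz) for v in factors)
```

The storage saving grows with resolution: a dense grid scales cubically with the side length, while the factors scale linearly for a fixed rank.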
Dynamic Content
Deformation
The idea here is to have an MLP which models the deformation of a canonical frame to the target frame.
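A minimal sketch of this idea follows. It assumes the backward-warp convention, in which a query point in the target frame at time t is mapped back into the canonical frame before the canonical field is evaluated. Both `deformation` and `canonical_field` are analytic stand-ins for what would be learned networks, and all names are hypothetical.

```python
import math

def deformation(point, t):
    """Stand-in for the deformation MLP: map a point observed at time t to the canonical frame.

    Assumption for this example: the scene translates along x by sin(t),
    so the warp subtracts that offset.
    """
    x, y, z = point
    return (x - math.sin(t), y, z)

def canonical_field(point):
    """Stand-in for the canonical-frame field: signed distance to a unit sphere at the origin."""
    return math.dist(point, (0.0, 0.0, 0.0)) - 1.0

def dynamic_field(point, t):
    """Evaluate the time-varying field by warping into the canonical frame first."""
    return canonical_field(deformation(point, t))
```

With this split, only one static field has to be stored, and all motion is carried by the deformation function.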
Latent code
- Neural 3D Video Synthesis (CVPR 2022)
  - Use a latent code for each time step.
- NeRFPlayer
  - Use a time-dependent sliding window along the feature channels in a feature grid.
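The two ideas in this list can be sketched in a few lines. This is a simplified illustration under assumptions, not the scheme used by either paper: `time_conditioned_input` concatenates a per-frame latent code to the query coordinates, and `sliding_window` picks a contiguous block of feature channels whose start position moves linearly with a normalized time t in [0, 1]. Both function names and the linear schedule are hypothetical.

```python
def time_conditioned_input(point, t_index, latent_codes):
    """Per-time-step latent code: append the code for frame `t_index` to the coordinates."""
    return list(point) + list(latent_codes[t_index])

def sliding_window(features, t, width):
    """Select `width` consecutive channels; the window start moves linearly with t in [0, 1]."""
    start = round(t * (len(features) - width))
    return features[start:start + width]

# Two frames, each with its own 2-dimensional latent code.
latent_codes = [[0.1, 0.2], [0.3, 0.4]]
network_input = time_conditioned_input((1.0, 2.0, 3.0), 1, latent_codes)

# Eight feature channels, of which four are visible at any given time.
features = list(range(8))
window_mid = sliding_window(features, 0.5, 4)
```

Because neighbouring time steps share most of their window, the channels they read overlap, whereas independent latent codes share nothing explicitly between frames.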
Time-axis
- Text-To-4D Dynamic Scene Generation (2023)
  - Extends the tri-plane feature grid to a six-plane feature grid ({x, y, z, t} choose 2).
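The count of six comes from choosing 2 of the 4 axes. The short sketch below enumerates the planes and projects a space-time point onto each one; `plane_coordinates` is an illustrative helper, and how the features sampled from each plane are combined is not covered here.

```python
from itertools import combinations

AXES = ("x", "y", "z", "t")
PLANES = list(combinations(AXES, 2))  # every unordered pair of axes

def plane_coordinates(x, y, z, t):
    """Project a space-time point onto each of the six axis-aligned planes."""
    values = {"x": x, "y": y, "z": z, "t": t}
    return {a + b: (values[a], values[b]) for a, b in PLANES}

coords = plane_coordinates(1.0, 2.0, 3.0, 4.0)
```

Three of the planes (xy, xz, yz) are the purely spatial tri-plane, and the other three (xt, yt, zt) each pair one spatial axis with time.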
Segmentation
Separate the static background and objects from the dynamic background and objects.
Generalization
Generalization mainly focuses on learning a prior over the data distribution, similar to what existing image generation networks do.
This enables more advanced vision tasks such as view synthesis from a single image, shape completion, inpainting, object generation, and segmentation.
- CNN
  - pixelNeRF
- Latent Codes
- Hyper Networks
  - Light Field Networks
Applications
3D Generation
- EG3D - Adapting StyleGAN2, NeRF, and a super-resolution network for generating 3D scenes
- Dream Fields - CLIP-guided NeRF generation
- DreamFusion - Adapting text-to-image diffusion models to generate NeRFs