Monocular Neural Image Based Rendering with Continuous View Control: Difference between revisions



The encoder converts images into latent points.   
It consists of 8 convolutional blocks, each of which downsamples the feature map. (Note that the supplementary material says 7, but their code actually uses 8.)  
Each block is: Conv-BatchNorm-LeakyReLU.   
The final output of the convolutional blocks has size \((1, 1, 2^8)\).  
The output of the convolutional blocks is passed through a fully connected layer and reshaped into a \(200 \times 3\) matrix.
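
The shape bookkeeping described above can be checked with a short sketch. It is not the authors' code: the function name `encoder_shapes` and its parameters are hypothetical, the \(256 \times 256\) input resolution is an assumption inferred from 8 halvings ending at \(1 \times 1\), and each block is assumed to halve the spatial size exactly. Channel widths of the intermediate blocks are not stated here, so only the final channel count \(2^8\) is used.

```python
def encoder_shapes(input_size=256, num_blocks=8, out_channels=2 ** 8,
                   num_points=200, point_dim=3):
    """Trace feature-map shapes through the encoder.

    Assumptions (not confirmed by the notes above): a square input of
    `input_size` pixels and an exact halving of the spatial size per block.
    """
    size = input_size
    spatial = []
    for block in range(1, num_blocks + 1):
        if size % 2 != 0:
            raise ValueError(f"block {block}: cannot halve odd size {size}")
        size //= 2  # each Conv-BatchNorm-LeakyReLU block downsamples by 2
        spatial.append((size, size))

    conv_out = (size, size, out_channels)   # output of the last conv block
    fc_in = size * size * out_channels      # flattened input to the FC layer
    fc_out = num_points * point_dim         # FC output before reshaping
    return {
        "spatial": spatial,
        "conv_out": conv_out,
        "fc_in": fc_in,
        "fc_out": fc_out,
        "latent": (num_points, point_dim),  # reshaped latent points
    }


if __name__ == "__main__":
    shapes = encoder_shapes()
    print(shapes["conv_out"], shapes["fc_in"], shapes["fc_out"], shapes["latent"])
```

Under these assumptions the fully connected layer maps 256 features to 600 values, which are then reshaped into the \(200 \times 3\) matrix of latent points.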