The encoder converts images into latent points.
It consists of 8 convolutional blocks, each of which downsamples the feature map. (Note that the supplementary material says 7, but the authors' code actually uses 8.)
Each block is: Conv-BatchNorm-LeakyReLU.
The final output of the convolutional blocks has size \((1, 1, 2^8)\).
The output of the convolutional blocks is passed through a fully connected layer and reshaped into a \(200 \times 3\) matrix.
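The shape bookkeeping described here can be checked with a small sketch. It is not the authors' code. It assumes a \(256 \times 256\) input and that each block halves the height and width; neither is stated in the text. The function names `trace_encoder_shapes` and `reshape_to_points` are hypothetical.

```python
def trace_encoder_shapes(input_size=256, num_blocks=8):
    """Return the spatial side length before and after each block.

    Assumption: every Conv-BatchNorm-LeakyReLU block halves the height
    and width (e.g. a stride-2 convolution). The 256-pixel input size is
    also an assumption, chosen so that 8 halvings end at 1 x 1.
    """
    sizes = [input_size]
    for _ in range(num_blocks):
        sizes.append(sizes[-1] // 2)
    return sizes


def reshape_to_points(flat, num_points=200, dims=3):
    """Reshape a flat fully connected output into num_points rows of dims values."""
    if len(flat) != num_points * dims:
        raise ValueError("expected %d values, got %d" % (num_points * dims, len(flat)))
    return [flat[i * dims:(i + 1) * dims] for i in range(num_points)]


sizes = trace_encoder_shapes()
print(sizes)  # [256, 128, 64, 32, 16, 8, 4, 2, 1]

conv_features = 1 * 1 * 2 ** 8   # flattened size of the final conv output
fc_outputs = 200 * 3             # the fully connected layer must emit 600 values

# Placeholder zeros stand in for the fully connected layer's real output.
points = reshape_to_points([0.0] * fc_outputs)
print(len(points), len(points[0]))  # 200 3
```

Under these assumptions, the fully connected layer maps \(2^8 = 256\) features to 600 values. With 7 blocks instead of 8, the same input would end at \(2 \times 2\) rather than \(1 \times 1\).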