Convolutional neural network: Difference between revisions
Revision as of 15:01, 2 March 2020
Convolutional Neural Network
Convolutional neural networks (CNNs) are primarily used for image tasks such as computer vision or image generation,
though they can be used anywhere your data lies on a rectangular grid with spatial relationships among neighboring entries.
Convolutions
PyTorch Convolution Layers
Here, we will explain 2D convolutions.
Suppose we have the following input image:
and the following 3x3 kernel:
For each possible position of the 3x3 kernel over the input image, we perform an element-wise multiplication (\(\displaystyle \odot\)) and sum over all entries to get a single value.
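The sliding multiply-and-sum described above can be sketched in plain Python. This is a minimal illustration, assuming no padding and a stride of 1; the function name `conv2d` and the 4x4 image and 3x3 kernel are made up for this example and are not the ones from the figures. Like most deep learning libraries, the sketch does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (both lists of rows); no padding, stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            # Element-wise product of the kernel and the patch under it, summed.
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += kernel[i][j] * image[y + i][x + j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x4 image and 3x3 kernel, chosen only for illustration.
image = [
    [1, 2, 3, 0],
    [0, 1, 2, 3],
    [3, 0, 1, 2],
    [2, 3, 0, 1],
]
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]
result = conv2d(image, kernel)
print(result)  # [[-2, -2], [2, -2]]
```

A 4x4 input and a 3x3 kernel leave only 2x2 valid positions, which is why the output is smaller than the input.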
Stride
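The stride is the number of pixels the kernel moves between successive positions. It is typically 1 or 2; a stride of 2 roughly halves the output size along each dimension.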
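To make the effect of the stride concrete, the sketch below lists the offsets along one axis at which the kernel is placed. It assumes no padding; `strided_positions` is a hypothetical helper name, and the input size of 7 is arbitrary.

```python
def strided_positions(input_size, kernel_size, stride):
    """Offsets along one axis where the kernel fits entirely inside the input."""
    return list(range(0, input_size - kernel_size + 1, stride))


for stride in (1, 2):
    positions = strided_positions(7, 3, stride)
    # The number of positions is the output size along this axis:
    # (input_size - kernel_size) // stride + 1
    print(stride, positions, len(positions))
# 1 [0, 1, 2, 3, 4] 5
# 2 [0, 2, 4] 3
```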
Padding
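Without padding, a convolutional layer with a kernel larger than 1x1 yields an output smaller than its input. Padding adds extra pixels around the border of the input to compensate. For example, padding the input by 1 pixel on every side keeps the output of a 3x3 convolution with stride 1 the same size as the original input.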
- Types of padding
  - Zero
  - Mirror
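The two padding types can be sketched as follows. The helper names `pad_zero` and `pad_mirror` are hypothetical, and "mirror" is assumed here to mean reflection about the border without repeating the edge pixel; other definitions repeat the edge pixel.

```python
def pad_zero(image, p):
    """Surround `image` (a list of rows) with `p` rows and columns of zeros."""
    width = len(image[0]) + 2 * p
    top = [[0] * width for _ in range(p)]
    middle = [[0] * p + list(row) + [0] * p for row in image]
    bottom = [[0] * width for _ in range(p)]
    return top + middle + bottom


def pad_mirror(image, p):
    """Reflect `image` about its border; requires p < height and p < width."""
    def reflect(seq):
        left = seq[1:p + 1][::-1]      # the p entries after the first, reversed
        right = seq[-p - 1:-1][::-1]   # the p entries before the last, reversed
        return left + list(seq) + right

    rows = [reflect(list(row)) for row in image]
    # Reflect the rows themselves, copying so no two output rows share a list.
    return [list(row) for row in reflect(rows)]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
zero_padded = pad_zero(image, 1)
mirror_padded = pad_mirror(image, 1)
for row in zero_padded:
    print(row)
for row in mirror_padded:
    print(row)
```

Both results are 5x5, so a 3x3 kernel applied to either one yields a 3x3 output, the same size as the original image.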
Dilation
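The spacing between the pixels that the kernel samples.
In PyTorch's convention, a dilation of 1 is an ordinary convolution, and a dilation of 2 applies a 3x3 kernel over a 5x5 region. This is equivalent to a 5x5 kernel in which every weight with an odd row or column index (\(\displaystyle i \% 2 == 1\), counting from 0) is set to 0.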
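The equivalence with a larger, mostly-zero kernel can be sketched directly. The sketch assumes a square kernel and the PyTorch convention, in which a dilation of 1 leaves the kernel unchanged and a dilation of 2 spreads a 3x3 kernel over a 5x5 region; `dilate_kernel` is a hypothetical helper name.

```python
def dilate_kernel(kernel, dilation):
    """Insert zeros so neighboring weights sit `dilation` pixels apart."""
    k = len(kernel)
    size = dilation * (k - 1) + 1  # side length of the region covered
    expanded = [[0] * size for _ in range(size)]
    for i in range(k):
        for j in range(k):
            expanded[i * dilation][j * dilation] = kernel[i][j]
    return expanded


kernel = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
expanded = dilate_kernel(kernel, 2)
for row in expanded:
    print(row)
# [1, 0, 2, 0, 3]
# [0, 0, 0, 0, 0]
# [4, 0, 5, 0, 6]
# [0, 0, 0, 0, 0]
# [7, 0, 8, 0, 9]
```

Only the weights at even row and column indices are non-zero, so the dilated convolution covers a 5x5 region with the cost of a 3x3 kernel.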
Groups
Other Types of Convolutions
Transpose Convolution
See Medium Post: https://medium.com/apache-mxnet/transposed-convolutions-explained-with-ms-excel-52d13030c7e8
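Instead of the 3x3 kernel taking 9 values as input and returning 1 value (\(\displaystyle \sum_i \sum_j w_{ij} \cdot I_{i+x,j+y}\)), the kernel now takes 1 input value \(\displaystyle I_{x,y}\) and returns 9 values (\(\displaystyle w_{ij} \cdot I_{x,y}\) for each \(\displaystyle i, j\)). Where the 3x3 outputs of neighboring input values overlap, they are summed.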
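A minimal sketch of this scatter-and-sum view, assuming no padding and a single channel. The function name `conv_transpose2d` and the example values are hypothetical and chosen only for illustration.

```python
def conv_transpose2d(image, kernel, stride=1):
    """Each input value adds a scaled copy of `kernel` to the output."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    out_h = (h - 1) * stride + kh
    out_w = (w - 1) * stride + kw
    output = [[0] * out_w for _ in range(out_h)]
    for y in range(h):
        for x in range(w):
            for i in range(kh):
                for j in range(kw):
                    # One input value contributes to kh * kw output values;
                    # overlapping contributions accumulate.
                    output[y * stride + i][x * stride + j] += kernel[i][j] * image[y][x]
    return output


kernel = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
# A single input value returns 9 values: the kernel scaled by that value.
single = conv_transpose2d([[2]], kernel)
print(single)  # [[2, 4, 6], [8, 10, 12], [14, 16, 18]]

# A 2x2 input with an all-ones 3x3 kernel gives a 4x4 output.
ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
upsampled = conv_transpose2d([[1, 2], [3, 4]], ones)
for row in upsampled:
    print(row)
# [1, 3, 3, 2]
# [4, 10, 10, 6]
# [4, 10, 10, 6]
# [3, 7, 7, 4]
```

The output is larger than the input, which is why transpose convolutions are commonly used for upsampling.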
Gated Convolution
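See Gated Convolution (ICCV 2019): http://openaccess.thecvf.com/content_ICCV_2019/html/Yu_Free-Form_Image_Inpainting_With_Gated_Convolution_ICCV_2019_paper.html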
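Given an image \(\displaystyle I\), we have two convolution layers \(\displaystyle k_{feature}\) and \(\displaystyle k_{gate}\), both applied to the same input.
The output is \(\displaystyle O = \phi(k_{feature}(I)) \odot \sigma(k_{gate}(I))\), where \(\displaystyle \phi\) is an activation function such as ReLU, \(\displaystyle \sigma\) is the sigmoid function, and \(\displaystyle \odot\) is element-wise multiplication. The sigmoid maps each gate value into the range (0, 1), so the gate acts as a learned soft mask on the features.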
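The output formula can be sketched for a single-channel image as follows. This is an illustration under stated assumptions rather than the paper's implementation: \(\displaystyle \phi\) is taken to be ReLU, \(\displaystyle \sigma\) the sigmoid, there are no bias terms, and the function names and kernel values are hypothetical.

```python
import math


def conv2d(image, kernel):
    """Plain 2D convolution (no padding, stride 1) on lists of rows."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(kernel[i][j] * image[y + i][x + j]
                for i in range(kh) for j in range(kw))
            for x in range(len(image[0]) - kw + 1)
        ]
        for y in range(len(image) - kh + 1)
    ]


def gated_conv2d(image, k_feature, k_gate):
    """O = relu(k_feature(I)) * sigmoid(k_gate(I)), element-wise."""
    features = conv2d(image, k_feature)
    gates = conv2d(image, k_gate)
    return [
        [max(0.0, f) * (1.0 / (1.0 + math.exp(-g))) for f, g in zip(f_row, g_row)]
        for f_row, g_row in zip(features, gates)
    ]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
k_feature = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]  # passes the center pixel through
k_gate = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]     # gate input 0, so sigmoid gives 0.5
gated = gated_conv2d(image, k_feature, k_gate)
print(gated)  # [[2.5]]
```

With an all-zero gate kernel every gate value is 0.5, so the feature value 5 is halved; in a trained network both kernels are learned, and the gate can suppress or pass features per location.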