
Convolutional neural network: Difference between revisions

From David's Wiki



Revision as of 14:57, 2 March 2020

Convolutional Neural Network
Convolutional neural networks (CNNs) are primarily used for image tasks such as computer vision or image generation, though they can be used anywhere your data lies on a rectangular grid with spatial relationships between neighboring entries.


Convolutions

PyTorch Convolution Layers
Here, we explain 2D convolutions.

Suppose we have the following input image:


and the following 3x3 kernel:

For each possible position of the 3x3 kernel over the input image, we perform an element-wise multiplication (\(\displaystyle \odot\)) and sum over all entries to get a single value.
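The sliding-window computation described above can be sketched in plain Python. This is a minimal illustration, not the PyTorch implementation: the function name `conv2d`, the 4x4 input, and the 3x3 kernel are all hypothetical values chosen for the example, and the sketch assumes a single channel, no padding, and a stride of 1.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (both lists of lists).

    Assumes a single channel, no padding, and a stride of 1.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Element-wise multiply the kernel with the window at (r, c)
            # and sum all entries to get a single output value.
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        out.append(row)
    return out


# Hypothetical 4x4 input and 3x3 kernel (not the ones from the original page).
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[0, 0, 0],
          [0, 1, 0],
          [0, 0, 0]]  # keeps only the center pixel of each window

result = conv2d(image, kernel)
print(result)  # [[6, 7], [10, 11]]
```

A 3x3 kernel fits in 2x2 distinct positions over a 4x4 input, so the output here is 2x2.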

Stride

The number of pixels the kernel moves between successive positions. Typically 1 or 2.
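As a rough sketch of how the stride affects the output, the helpers below compute the kernel's starting positions along one axis and the resulting output length. The names `kernel_positions` and `output_size` are hypothetical, and the sketch assumes no padding and no dilation.

```python
def kernel_positions(input_size, kernel_size, stride=1):
    """Starting indices of the kernel along one axis (no padding)."""
    return list(range(0, input_size - kernel_size + 1, stride))


def output_size(input_size, kernel_size, stride=1):
    """Number of kernel positions along one axis (no padding)."""
    return (input_size - kernel_size) // stride + 1


# A 3-wide kernel over a 7-wide input:
print(kernel_positions(7, 3, stride=1))  # [0, 1, 2, 3, 4]
print(kernel_positions(7, 3, stride=2))  # [0, 2, 4]
print(output_size(7, 3, stride=2))       # 3
```

A stride of 2 roughly halves the output size along each axis, which is why it is often used for downsampling.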

Padding

Without padding, a convolutional layer with a kernel larger than 1x1 yields an output smaller than its input. We can pad the borders of the input to increase its size, so that the output keeps the desired size.

Types of padding
  • Zero
  • Mirror
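The two padding types listed above can be sketched as follows. The function names `pad_1d` and `pad_2d` are hypothetical. "Mirror" is assumed here to mean reflection about the border pixel without repeating it (the behavior of NumPy's `reflect` mode); other libraries may use a variant that repeats the border pixel.

```python
def pad_1d(values, pad, mode="zero"):
    """Pad a list on both sides with `pad` entries."""
    if mode == "zero":
        return [0] * pad + list(values) + [0] * pad
    if mode == "mirror":
        # Assumption: reflect about the edge without repeating the border value.
        if pad >= len(values):
            raise ValueError("mirror padding requires pad < len(values)")
        left = [values[i] for i in range(pad, 0, -1)]
        right = [values[-1 - i] for i in range(1, pad + 1)]
        return left + list(values) + right
    raise ValueError("unknown mode: " + mode)


def pad_2d(image, pad, mode="zero"):
    """Pad every side of a 2D list of lists with `pad` pixels."""
    rows = [pad_1d(row, pad, mode) for row in image]
    if mode == "zero":
        width = len(rows[0])
        top = [[0] * width for _ in range(pad)]
        bottom = [[0] * width for _ in range(pad)]
    else:
        if pad >= len(rows):
            raise ValueError("mirror padding requires pad < number of rows")
        top = [list(rows[i]) for i in range(pad, 0, -1)]
        bottom = [list(rows[-1 - i]) for i in range(1, pad + 1)]
    return top + rows + bottom


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

zero_padded = pad_2d(image, 1, mode="zero")
mirror_padded = pad_2d(image, 1, mode="mirror")
print(zero_padded[1])    # [0, 1, 2, 3, 0]
print(mirror_padded[1])  # [2, 1, 2, 3, 2]
```

With one pixel of padding on each side, the 3x3 input becomes 5x5, so a 3x3 kernel with stride 1 produces a 3x3 output, the same size as the original input.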

Dilation

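The spacing between the pixels that the kernel is applied to.
A spacing of 1 pixel between kernel elements applies a 3x3 kernel over a 5x5 region. In PyTorch this corresponds to dilation=2, since dilation=1 is an ordinary convolution with no gaps. This is equivalent to a 5x5 kernel where every weight with an odd zero-based row or column index (\(\displaystyle i \% 2 == 1\)) is set to 0.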
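The equivalence with a larger, zero-filled kernel can be checked with a short sketch. The helper name `dilate_kernel` is hypothetical. Note that its `dilation` argument is assumed to follow the PyTorch convention, where `dilation=1` leaves the kernel unchanged and `dilation=2` inserts one gap between neighboring kernel elements, giving the 3x3 over 5x5 case.

```python
def dilate_kernel(kernel, dilation):
    """Expand a square kernel into its equivalent zero-filled kernel.

    `dilation` follows the PyTorch convention: 1 means no gaps,
    2 means one skipped pixel between neighboring kernel elements.
    """
    k = len(kernel)
    size = dilation * (k - 1) + 1  # effective kernel size
    out = [[0] * size for _ in range(size)]
    for i in range(k):
        for j in range(k):
            out[i * dilation][j * dilation] = kernel[i][j]
    return out


kernel = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]

for row in dilate_kernel(kernel, 2):
    print(row)
# [1, 0, 2, 0, 3]
# [0, 0, 0, 0, 0]
# [4, 0, 5, 0, 6]
# [0, 0, 0, 0, 0]
# [7, 0, 8, 0, 9]
```

Only the entries whose row and column indices are both even carry weights; the dilated convolution simply skips the zero entries instead of storing them.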

Groups

Other Types of Convolutions

Transpose Convolution

Gated Convolution

See Gated Convolution (ICCV 2019)
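Given an input image \(\displaystyle I\), we have two convolution layers \(\displaystyle k_{feature}\) and \(\displaystyle k_{gate}\).
The output is \(\displaystyle O = \phi(k_{feature}(I)) \odot \sigma(k_{gate}(I))\), where \(\displaystyle \phi\) is an activation function (e.g. ReLU or ELU), \(\displaystyle \sigma\) is the sigmoid function, so each gate value lies in (0, 1), and \(\displaystyle \odot\) is element-wise multiplication.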
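A minimal single-channel sketch of this formula is shown below. It is not the paper's implementation: the names `conv2d` and `gated_conv2d` are hypothetical, the choice of ReLU for \(\displaystyle \phi\) is an assumption, and the kernels are hand-picked for illustration rather than learned.

```python
import math


def conv2d(image, kernel):
    """Plain 2D convolution: single channel, no padding, stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(len(image[0]) - kw + 1)]
            for r in range(len(image) - kh + 1)]


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_conv2d(image, k_feature, k_gate, phi=relu):
    """O = phi(k_feature(I)) * sigmoid(k_gate(I)), element-wise."""
    feature = conv2d(image, k_feature)
    gate = conv2d(image, k_gate)
    return [[phi(f) * sigmoid(g) for f, g in zip(f_row, g_row)]
            for f_row, g_row in zip(feature, gate)]


# Hypothetical input and kernels.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
k_feature = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]  # picks the center pixel (5)
k_gate = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]     # gate pre-activation is 0

output = gated_conv2d(image, k_feature, k_gate)
print(output)  # [[2.5]] because relu(5) * sigmoid(0) = 5 * 0.5
```

Because the gate is computed from the input itself, the layer can suppress or pass each spatial location independently.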