Convolutional neural network

 
[https://pytorch.org/docs/stable/nn.html#convolution-layers Pytorch Convolution Layers]<br>
[https://towardsdatascience.com/types-of-convolutions-in-deep-learning-717013397f4d Types of convolutions animations]<br>
Here, we will explain 2D convolutions, also known as cross-correlation.<br>
Suppose we have the following input image:<br>
<pre>
\end{bmatrix}
</math><br>
Summing up all the elements gives us <math>66</math>, which would go in the first index of the output.
Shifting the kernel over all positions of the image gives us the whole output, another 2D image.
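Below is a minimal NumPy sketch of this sliding-window operation (the function name <code>cross_correlate_2d</code> is just illustrative), using stride 1 and no padding:
<pre>
import numpy as np

def cross_correlate_2d(image, kernel):
    # Output shrinks by (kernel size - 1) in each dimension (no padding, stride 1).
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Elementwise product of the kernel with the current window, then sum.
            output[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return output
</pre>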


The formula for the output resolution of a convolution is:


===Stride===
How far the kernel moves at each step. Typically 1 or 2.
Moving by 2 will yield roughly half the resolution of the input.
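For example, with the PyTorch layers linked above (the channel counts and input size here are arbitrary):
<pre>
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)  # (batch, channels, height, width)

# kernel_size=3 with padding=1 preserves resolution at stride 1;
# stride=2 then halves the height and width.
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, stride=2, padding=1)
print(conv(x).shape)  # torch.Size([1, 8, 16, 16])
</pre>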


===Padding===


;Common Types of padding
* Zero or Constant padding
* Mirror/Reflection padding
* Replication padding
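NumPy's <code>np.pad</code> has modes corresponding to these three types; a quick sketch (the array values are arbitrary):
<pre>
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])

print(np.pad(x, 1, mode='constant', constant_values=0))  # zero/constant padding
print(np.pad(x, 1, mode='reflect'))                      # mirror/reflection padding
print(np.pad(x, 1, mode='edge'))                         # replication padding
</pre>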


With convolution layers in most libraries, you often see these two types of padding, which can be specified on the conv layer directly:
* <code>VALID</code> - Do not do any padding
* <code>SAME</code> - Apply zero padding such that the output will have resolution \(\lceil x/stride \rceil\).
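For instance, with TensorFlow's Keras conv layer (the filter count, kernel size, and input shape here are arbitrary):
<pre>
import tensorflow as tf

x = tf.random.normal([1, 32, 32, 3])  # (batch, height, width, channels)

valid = tf.keras.layers.Conv2D(filters=8, kernel_size=3, strides=2, padding='valid')
same = tf.keras.layers.Conv2D(filters=8, kernel_size=3, strides=2, padding='same')

print(valid(x).shape)  # (1, 15, 15, 8): no padding, floor((32 - 3)/2) + 1 = 15
print(same(x).shape)   # (1, 16, 16, 8): zero padded so the output is ceil(32/2) = 16
</pre>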
==Spherical Images==
There are many ways to adapt convolutional layers to spherical images.
Below are just a few that I've seen.


* [http://papers.nips.cc/paper/6656-learning-spherical-convolution-for-fast-features-from-360-imagery Learning Spherical Convolution for Fast Features from 360 Imagery (NIPS 2017)] proposes using different kernels with different weights and sizes for different altitudes \(\phi\).