A single <math>7\times 7</math> conv layer with C-dim input and C-dim output would need <math>49 \times C^2</math> weights.
Three stacked <math>3\times 3</math> conv layers cover the same <math>7\times 7</math> effective receptive field but only need <math>3 \times 9 \times C^2 = 27 \times C^2</math> weights.
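A quick PyTorch check of these counts, as a minimal sketch (the channel count <math>C = 256</math> is an arbitrary choice for illustration; biases omitted):
<syntaxhighlight lang="python">
import torch.nn as nn

C = 256  # illustrative channel count

# One 7x7 convolution: 49 * C^2 weights.
single = nn.Conv2d(C, C, kernel_size=7, bias=False)

# Three stacked 3x3 convolutions: 27 * C^2 weights,
# with the same 7x7 effective receptive field.
stacked = nn.Sequential(
    nn.Conv2d(C, C, kernel_size=3, bias=False),
    nn.Conv2d(C, C, kernel_size=3, bias=False),
    nn.Conv2d(C, C, kernel_size=3, bias=False),
)

print(sum(p.numel() for p in single.parameters()))   # 3211264 = 49 * 256^2
print(sum(p.numel() for p in stacked.parameters()))  # 1769472 = 27 * 256^2
</syntaxhighlight>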
===Network in network===
Use a small multilayer perceptron as the convolution kernel: each local patch is fed into the perceptron, and its output is used instead of the cross-correlation with a standard fixed kernel.
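A sketch of one such block in PyTorch, under the common interpretation that sliding a shared perceptron over patches is equivalent to a convolution followed by <math>1\times 1</math> convolutions (the layer sizes are illustrative):
<syntaxhighlight lang="python">
import torch.nn as nn

def nin_block(in_ch, out_ch, kernel_size, stride, padding):
    # A standard convolution followed by two 1x1 convolutions,
    # which act as a per-pixel MLP over the channel dimension.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding),
        nn.ReLU(),
        nn.Conv2d(out_ch, out_ch, kernel_size=1),
        nn.ReLU(),
        nn.Conv2d(out_ch, out_ch, kernel_size=1),
        nn.ReLU(),
    )
</syntaxhighlight>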
===GoogLeNet===
Hebbian principle: neurons that fire together are typically wired together.
Implemented using an ''Inception Module''.
The key idea is to use a heterogeneous set of convolutions in parallel.
Naive idea: do a 1x1 convolution, 3x3 convolution, and 5x5 convolution on the same input and then concatenate the outputs together along the channel dimension.
The intuition is that each captures a different receptive field.
In practice, they need to add 1x1 convolutions before the 3x3 and 5x5 convolutions to make it computationally feasible. These are used for dimension reduction by controlling the number of channels, as in the sketch below.
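A minimal sketch of such a module in PyTorch (the branch widths are hypothetical; the real GoogLeNet also adds a pooling branch and picks different widths at each stage):
<syntaxhighlight lang="python">
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    def __init__(self, in_ch, c1, c3_reduce, c3, c5_reduce, c5):
        super().__init__()
        # Parallel branches with different receptive fields.
        self.branch1 = nn.Conv2d(in_ch, c1, kernel_size=1)
        # 1x1 convolutions reduce the channel count before the
        # expensive 3x3 and 5x5 convolutions.
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_ch, c3_reduce, kernel_size=1),
            nn.Conv2d(c3_reduce, c3, kernel_size=3, padding=1),
        )
        self.branch5 = nn.Sequential(
            nn.Conv2d(in_ch, c5_reduce, kernel_size=1),
            nn.Conv2d(c5_reduce, c5, kernel_size=5, padding=2),
        )

    def forward(self, x):
        # Concatenate branch outputs along the channel dimension.
        return torch.cat(
            [self.branch1(x), self.branch3(x), self.branch5(x)], dim=1
        )
</syntaxhighlight>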
Another idea is to add auxiliary classifiers at intermediate layers of the network; during training their losses inject gradient signal into the earlier layers, and at test time they are discarded. A sketch of the combined loss follows.
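One way the training loss might combine the classifiers (a sketch: the 0.3 auxiliary weight follows the GoogLeNet paper, and the logits arguments are assumed to come from the network's main and auxiliary heads):
<syntaxhighlight lang="python">
import torch.nn.functional as F

def googlenet_loss(main_logits, aux_logits_list, target, aux_weight=0.3):
    # Total loss = main loss + discounted auxiliary losses.
    loss = F.cross_entropy(main_logits, target)
    for aux_logits in aux_logits_list:
        loss = loss + aux_weight * F.cross_entropy(aux_logits, target)
    return loss
</syntaxhighlight>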
Inception v2 and v3 refine the architecture.
V2 adds batch normalization to reduce dependence on auxiliary classifiers.
V3 adds factorized convolutions (i.e. nx1 and 1xn convolutions in place of nxn ones).
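For example, factoring a 7x7 convolution into a 7x1 followed by a 1x7 covers the same receptive field with <math>14 \times C^2</math> instead of <math>49 \times C^2</math> weights. A quick PyTorch check (the channel count is arbitrary; biases omitted):
<syntaxhighlight lang="python">
import torch.nn as nn

C = 192  # illustrative channel count
factored = nn.Sequential(
    nn.Conv2d(C, C, kernel_size=(7, 1), padding=(3, 0), bias=False),
    nn.Conv2d(C, C, kernel_size=(1, 7), padding=(0, 3), bias=False),
)
print(sum(p.numel() for p in factored.parameters()))  # 516096 = 14 * 192^2
</syntaxhighlight>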
===ResNet===
The main idea is to introduce skip or shortcut connections.
These existed in the literature before ResNet.
This means a block returns <math>F(x)+x</math> instead of just <math>F(x)</math>.
This allows smoother gradient flow, since the identity path means intermediate layers cannot block the gradient.
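A minimal sketch of a residual block in PyTorch (the real ResNet also inserts batch normalization after each convolution, omitted here):
<syntaxhighlight lang="python">
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        # F(x) + x: the identity shortcut lets the gradient flow
        # past the convolutions unimpeded.
        return self.relu(self.conv2(self.relu(self.conv1(x))) + x)
</syntaxhighlight>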
They also replace 3x3 convolutions on 256 channels with a bottleneck: a 1x1 convolution down to 64 channels, a 3x3 convolution on the 64 channels, then a 1x1 convolution back up to 256 channels.
This reduces the parameter count from approximately 600k to approximately 70k.
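A quick PyTorch check of those counts (biases omitted):
<syntaxhighlight lang="python">
import torch.nn as nn

# Direct 3x3 convolution on 256 channels.
direct = nn.Conv2d(256, 256, kernel_size=3, padding=1, bias=False)

# Bottleneck: 1x1 down to 64, 3x3 at 64, 1x1 back up to 256.
bottleneck = nn.Sequential(
    nn.Conv2d(256, 64, kernel_size=1, bias=False),
    nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False),
    nn.Conv2d(64, 256, kernel_size=1, bias=False),
)

print(sum(p.numel() for p in direct.parameters()))      # 589824 (~600k)
print(sum(p.numel() for p in bottleneck.parameters()))  # 69632  (~70k)
</syntaxhighlight>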
===Accuracy vs efficiency===
First we had AlexNet. Then we had VGG, which had far more parameters and better accuracy.
Then we had GoogLeNet, which is much smaller than both AlexNet and VGG with roughly the same accuracy.
Next, ResNet and the later Inception versions increased the parameter count slightly and attained better performance.
==Will be on the exam==