Batch normalization: Difference between revisions
Line 7:
* A running average of the mean.
* A running average of the standard deviation.
For CNNs, each of these is a vector with one entry per channel.
During training, these two values are computed from the current batch.
Line 25: | Line 27:
out = (t - t_mean.view(1,1,1,C)) / t_stddev.view(1,1,1,C)
</pre>
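The behaviour described above (per-channel statistics computed from the batch during training, averaged statistics used otherwise) can be sketched as follows. This is a minimal illustration and not a framework implementation. It uses NumPy instead of PyTorch so that it runs on its own, and it assumes a channels-last (N, H, W, C) layout to match the <code>view(1,1,1,C)</code> call above. The class name <code>BatchNormSketch</code>, the <code>momentum</code> and <code>eps</code> values, and the exponential form of the running average are assumptions made for illustration. Any learnable scale and shift are omitted.

```python
import numpy as np


class BatchNormSketch:
    """Per-channel normalization for channels-last (N, H, W, C) arrays."""

    def __init__(self, num_channels, momentum=0.1, eps=1e-5):
        self.momentum = momentum  # assumed update rate for the running averages
        self.eps = eps            # assumed small constant to avoid division by zero
        # One running statistic per channel.
        self.running_mean = np.zeros(num_channels)
        self.running_std = np.ones(num_channels)

    def __call__(self, t, training):
        if training:
            # Statistics come from the current batch, reduced over N, H and W.
            mean = t.mean(axis=(0, 1, 2))
            std = t.std(axis=(0, 1, 2))
            # Fold the batch statistics into the running averages.
            m = self.momentum
            self.running_mean = (1 - m) * self.running_mean + m * mean
            self.running_std = (1 - m) * self.running_std + m * std
        else:
            # Outside training, the stored averages are used instead.
            mean, std = self.running_mean, self.running_std
        # Shape (C,) broadcasts against (N, H, W, C), like view(1,1,1,C).
        return (t - mean) / (std + self.eps)


rng = np.random.default_rng(0)
batch = rng.normal(loc=3.0, scale=2.0, size=(8, 4, 4, 3))
bn = BatchNormSketch(num_channels=3)
train_out = bn(batch, training=True)   # normalized with batch statistics
eval_out = bn(batch, training=False)   # normalized with running averages
```

After the training call, each channel of <code>train_out</code> has approximately zero mean and unit standard deviation. Frameworks commonly track a running variance and divide by the square root of the variance plus epsilon. The sketch tracks the standard deviation directly only to mirror the description on this page.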
==Resources==
* [[Wikipedia: Batch normalization]]
* [https://towardsdatascience.com/batch-normalization-in-neural-networks-1ac91516821c https://towardsdatascience.com/batch-normalization-in-neural-networks-1ac91516821c]