Batch normalization: Difference between revisions

* A running average of the mean.
* A running average of the standard deviation.
For CNNs, each of these is a vector with one entry per channel.


During training, the mean and standard deviation are computed from the current batch.
out = (t - t_mean.view(1,1,1,C)) / t_stddev.view(1,1,1,C)
</pre>
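The per-channel statistics and the running averages can be sketched in plain Python, without PyTorch, so that each step is explicit. This is a minimal sketch and not a reference implementation. It assumes a channels-last batch stored as nested lists shaped [N][H][W][C], uses the population standard deviation, and adds a small <code>eps</code> to avoid division by zero. The names <code>channel_stats</code>, <code>normalize</code>, <code>update_running</code>, <code>eps</code> and <code>momentum</code> are hypothetical and chosen only for illustration.

```python
import math


def channel_stats(batch):
    """Return per-channel (means, stds) for a batch shaped [N][H][W][C]."""
    num_channels = len(batch[0][0][0])
    # Collect every value of each channel across N, H and W.
    per_channel = [[] for _ in range(num_channels)]
    for sample in batch:
        for row in sample:
            for pixel in row:
                for c, value in enumerate(pixel):
                    per_channel[c].append(value)
    means = [sum(v) / len(v) for v in per_channel]
    # Population standard deviation (divide by the count, not count - 1).
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in v) / len(v))
        for v, m in zip(per_channel, means)
    ]
    return means, stds


def normalize(batch, means, stds, eps=1e-5):
    """Subtract the channel mean and divide by the channel std dev."""
    # eps is an assumed safeguard against division by zero.
    return [
        [
            [
                [(x - m) / (s + eps) for x, m, s in zip(pixel, means, stds)]
                for pixel in row
            ]
            for row in sample
        ]
        for sample in batch
    ]


def update_running(running, current, momentum=0.1):
    """Blend the current batch statistic into the stored running average."""
    # momentum is an assumed hyperparameter for this sketch.
    return [(1 - momentum) * r + momentum * c for r, c in zip(running, current)]


# Tiny batch: N=2, H=1, W=2, C=2.
batch = [
    [[[1.0, 10.0], [3.0, 30.0]]],
    [[[5.0, 50.0], [7.0, 70.0]]],
]
means, stds = channel_stats(batch)  # means == [4.0, 40.0]
out = normalize(batch, means, stds)

# Training step: fold the batch statistics into the stored averages.
running_mean = update_running([0.0, 0.0], means)
running_std = update_running([1.0, 1.0], stds)
```

At evaluation time the same <code>normalize</code> call would be given <code>running_mean</code> and <code>running_std</code> instead of the statistics of the current batch.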


==Resources==
* [[Wikipedia: Batch normalization]]
* [https://towardsdatascience.com/batch-normalization-in-neural-networks-1ac91516821c Batch normalization in neural networks (Towards Data Science)]