Neural Network Compression: Difference between revisions

Revision as of 20:45, 2 February 2021

A brief survey of neural network compression techniques.

The idea here is to measure how sensitive the network's output is to each neuron,
i.e., how much the output (or the loss) changes if that neuron is removed.

Mozer and Smolensky (1988)^[1] attach a gate to each neuron. The sensitivity can then be estimated from the derivative of the error with respect to the gate.
Karnin (1990)^[2] estimates the sensitivity by monitoring the changes in each weight during training.
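
The gate idea can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny one-hidden-layer network, the squared-error loss, the toy weights, and all function names (`forward`, `gate_sensitivity`, `removal_effect`) are assumptions made for illustration. Each hidden unit's output is multiplied by a gate fixed at 1, the sensitivity is estimated as the negative derivative of the loss with respect to that gate, and the estimate is compared with the exact loss change obtained by setting the gate to 0.

```python
import math

def forward(x, W, v, gates):
    """One-hidden-layer net: y = sum_j v[j] * gates[j] * tanh(W[j] . x)."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]
    y = sum(vj * gj * hj for vj, gj, hj in zip(v, gates, h))
    return h, y

def loss(y, target):
    # Squared error, chosen here only for simplicity (an assumption).
    return 0.5 * (y - target) ** 2

def gate_sensitivity(x, target, W, v):
    """First-order estimate: -dL/dgate_j evaluated at gate_j = 1."""
    h, y = forward(x, W, v, [1.0] * len(v))
    # Chain rule: dL/dgate_j = (y - target) * v[j] * h[j]
    return [-(y - target) * vj * hj for vj, hj in zip(v, h)]

def removal_effect(x, target, W, v):
    """Exact loss change when unit j is removed (gate_j set to 0)."""
    full = [1.0] * len(v)
    _, y_full = forward(x, W, v, full)
    effects = []
    for j in range(len(v)):
        gates = full.copy()
        gates[j] = 0.0
        _, y_cut = forward(x, W, v, gates)
        effects.append(loss(y_cut, target) - loss(y_full, target))
    return effects

# Hypothetical toy values: unit 1 has near-zero weights and should matter least.
x = [0.5, -1.0]
W = [[1.0, 0.2], [0.01, 0.02], [-0.7, 0.3]]
v = [0.8, 0.05, -0.6]
target = 0.4

estimated = gate_sensitivity(x, target, W, v)
exact = removal_effect(x, target, W, v)
least_relevant = min(range(len(v)), key=lambda j: abs(estimated[j]))

for j, (est, ex) in enumerate(zip(estimated, exact)):
    print(f"unit {j}: estimated {est:+.6f}, exact {ex:+.6f}")
print("candidate for pruning:", least_relevant)
```

For this squared-error loss the estimate and the exact change differ only by a second-order term, so the derivative is a cheap proxy that avoids re-evaluating the network once per removed neuron.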

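[1] Mozer, M. C., & Smolensky, P. (1988). Skeletonization: A technique for trimming the fat from a network via relevance assessment. (NeurIPS 1988). PDF: https://proceedings.neurips.cc/paper/1988/file/07e1cd7dca89a1678042477183b7ac3f-Paper.pdf
[2] Karnin, E. D. (1990). A simple procedure for pruning back-propagation trained neural networks. (IEEE TNNLS 1990). IEEE Xplore: https://ieeexplore.ieee.org/document/80236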

@@ Line 7: / Line 7: @@
 * Mozer and Smolensky (1988)<ref name="mozer1988skeletonization"></ref> use a gate for each neuron. Then the sensitivity can be estimated with the derivative w.r.t. the gate.
-* Karnin estimates the sensitivity by monitoring the change in weight during training.
+* Karnin <ref name="karnin1990simple"></ref> estimates the sensitivity by monitoring the change in weight during training.
 ==Factorization==
@@ Line 20: / Line 20: @@
 {{reflist|refs=
 <ref name="mozer1988skeletonization">Mozer, M. C., & Smolensky, P. (1988). Skeletonization: A technique for trimming the fat from a network via relevance assessment. (NeurIPS 1988). [https://proceedings.neurips.cc/paper/1988/file/07e1cd7dca89a1678042477183b7ac3f-Paper.pdf PDF]</ref>
+<ref name="karnin1990simple">Karnin, E. D. (1990). A simple procedure for pruning back-propagation trained neural networks. (IEEE TNNLS 1990). [https://ieeexplore.ieee.org/document/80236 IEEE Xplore]</ref>
 }}