Neural Network Compression

From David's Wiki
* [https://axon.cs.byu.edu/~martinez/classes/678/Papers/Reed_PruningSurvey.pdf Pruning Algorithms: A Survey] (1993) by Russell Reed
* [https://arxiv.org/pdf/1710.09282.pdf A Survey of Model Compression and Acceleration for Deep Neural Networks] (2017) by Cheng et al.

Revision as of 20:40, 2 February 2021

A brief survey of neural network compression techniques.

Pruning

Sensitivity Methods

The idea is to measure how sensitive the network's output (or error) is to each neuron.
That is, if the neuron is removed, how much does the output change? Neurons with low sensitivity are candidates for pruning.

  • Mozer and Smolensky (1988)[1] attach a multiplicative gate to each neuron. The sensitivity can then be estimated from the derivative of the error with respect to the gate.
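
The gate idea can be sketched in a few lines of NumPy. This is a minimal illustration rather than the paper's exact procedure: it assumes a single tanh hidden layer, a sum-of-squares error, and gates fixed at 1, and the names `gate_sensitivity`, `W1`, `W2`, and `alpha` are hypothetical. Ranking units by the magnitude of the estimate is likewise a simplification chosen here.

```python
import numpy as np

def gate_sensitivity(W1, W2, X, Y):
    """Estimate per-hidden-unit relevance as -dE/d(alpha) at alpha = 1."""
    H = np.tanh(X @ W1)            # hidden activations, shape (n, hidden)
    alpha = np.ones(W1.shape[1])   # one gate per hidden unit, all fully open
    out = (H * alpha) @ W2         # gated forward pass, shape (n, out)
    err = out - Y                  # dE/d(out) for E = 0.5 * sum of squared errors
    # Chain rule: dE/d(alpha_j) = sum_n H[n, j] * (err[n] . W2[j, :])
    dE_dalpha = np.sum(H * (err @ W2.T), axis=0)
    return -dE_dalpha

# Toy data and weights (random, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 5))
W1 = rng.normal(size=(5, 8))
W2 = rng.normal(size=(8, 3))
W2[2, :] = 0.0  # hidden unit 2 has no outgoing weights, so it cannot affect the output
Y = rng.normal(size=(64, 3))

relevance = gate_sensitivity(W1, W2, X, Y)
prune_order = np.argsort(np.abs(relevance))  # least sensitive units first
```

Because hidden unit 2 is disconnected from the output in this toy setup, its estimated sensitivity is zero and it appears first in `prune_order`. Note that the derivative is only a first-order estimate of the effect of closing a gate completely.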

Factorization

Resources

Surveys

References

  1. Mozer, M. C., & Smolensky, P. (1988). Skeletonization: A technique for trimming the fat from a network via relevance assessment. (NeurIPS 1988). [https://proceedings.neurips.cc/paper/1988/file/07e1cd7dca89a1678042477183b7ac3f-Paper.pdf PDF]