Neural Network Compression
Revision as of 21:15, 2 February 2021
Brief survey on neural network compression techniques.
Pruning
Sensitivity Methods
The idea here is to measure how sensitive the network's output is to each neuron.
I.e., if you remove the neuron, how much will the output (or the error) change? Neurons with low sensitivity are candidates for pruning.
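- Mozer and Smolensky (1988)[1] use a gate for each neuron. The sensitivity can then be estimated with the derivative of the error w.r.t. the gate.
- Karnin (1990)[2] estimates the sensitivity by monitoring the change in each weight during training.
- LeCun et al. (1989)[3] present Optimal Brain Damage, which uses second-derivative information (a diagonal approximation of the Hessian) to estimate the saliency of each weight.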
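The gate idea can be illustrated with a minimal sketch. This is not the authors' implementation: the one-hidden-layer network, the squared-error loss, the toy weights, and the function names (forward, loss, gate_gradients) are all assumptions made for illustration. Each hidden unit's output is multiplied by a gate fixed at 1, and the magnitude of the error derivative w.r.t. that gate is used as the sensitivity estimate.

```python
import math

def forward(x, W1, w2, gates):
    """One hidden tanh layer; each hidden unit's output is scaled by its gate."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    y = sum(g * v * h for g, v, h in zip(gates, w2, hidden))
    return y, hidden

def loss(x, t, W1, w2, gates):
    """Squared error for a single example (an assumed choice of loss)."""
    y, _ = forward(x, W1, w2, gates)
    return 0.5 * (y - t) ** 2

def gate_gradients(x, t, W1, w2):
    """dE/d(gate_j), evaluated with all gates open (gate_j = 1)."""
    gates = [1.0] * len(w2)
    y, hidden = forward(x, W1, w2, gates)
    # y is linear in each gate, so dE/dgate_j = (y - t) * w2_j * h_j
    return [(y - t) * v * h for v, h in zip(w2, hidden)]

# Toy values, chosen only for illustration.
W1 = [[0.5, -0.3], [0.8, 0.1], [-0.2, 0.4]]  # 3 hidden units, 2 inputs
w2 = [1.0, 0.0, -0.7]                        # unit 1 does not reach the output
x, t = [1.0, 2.0], 0.5

grads = gate_gradients(x, t, W1, w2)
sensitivity = [abs(g) for g in grads]
# The unit with the smallest sensitivity is the first pruning candidate.
prune_candidate = min(range(len(sensitivity)), key=sensitivity.__getitem__)
```

In practice the derivative would be accumulated over many training examples rather than a single one, and the gates are never actually trained; they only exist so that a derivative can be taken.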
Quantization
Many works replace the standard 32-bit floating-point representation with 8-bit or 16-bit representations, trading a small loss in precision for lower memory use and faster arithmetic.
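As a rough illustration of 8-bit quantization, the sketch below maps a list of float weights to integers in [0, 255] using a single scale and zero point, then maps them back. The affine per-tensor scheme and the helper names (quantize_uint8, dequantize) are assumptions chosen for simplicity; published methods differ in how they pick ranges, handle activations, and whether they retrain.

```python
def quantize_uint8(values):
    """Map floats to integers in [0, 255] with one scale and zero point."""
    lo, hi = min(values), max(values)
    lo, hi = min(lo, 0.0), max(hi, 0.0)   # keep 0.0 exactly representable
    scale = (hi - lo) / 255.0 or 1.0      # fall back to 1.0 for all-zero input
    zero_point = round(-lo / scale)       # integer code that represents 0.0
    q = [min(255, max(0, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the integer codes."""
    return [scale * (qi - zero_point) for qi in q]

# Toy weights, chosen only for illustration.
weights = [-1.2, -0.4, 0.0, 0.3, 0.95, 2.1]
q, scale, zero_point = quantize_uint8(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight now needs one byte instead of four, plus the shared scale and zero point. The reconstruction error is on the order of one quantization step (scale).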
Factorization
Libraries
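Both TensorFlow and PyTorch have built-in libraries for pruning:
- TensorFlow: the TensorFlow Model Optimization Toolkit (tensorflow_model_optimization)
- PyTorch: torch.nn.utils.prune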
Resources
Surveys
- Pruning Algorithms: A Survey (1993) by Russell Reed
- A Survey of Model Compression and Acceleration for Deep Neural Networks (2017) by Cheng et al.
References
- ↑ Mozer, M. C., & Smolensky, P. (1988). Skeletonization: A technique for trimming the fat from a network via relevance assessment. (NeurIPS 1988).
- ↑ Karnin, E. D. (1990). A simple procedure for pruning back-propagation trained neural networks. (IEEE Transactions on Neural Networks, 1990).
- ↑ LeCun, Y., Denker, J. S., & Solla, S. A. (1989). Optimal brain damage. (NeurIPS 1989).